Binomial Logistic Regression using SPSS Statistics

Introduction

A binomial logistic regression (often referred to simply as logistic regression) predicts the probability that an observation falls into one of two categories of a dichotomous dependent variable based on one or more independent variables that can be either continuous or categorical. If, on the other hand, your dependent variable is a count, see our Poisson regression guide. Alternatively, if you have more than two categories of the dependent variable, see our multinomial logistic regression guide.
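
To make "predicts the probability" concrete, the following minimal Python sketch shows how a logistic regression model turns values of the independent variables into a probability. The function name, the intercept and the coefficients are all hypothetical, chosen purely for illustration; they are not estimated from any data used in this guide.

```python
import math

def predicted_probability(intercept, coefficients, values):
    """Return P(outcome = 1) under a logistic regression model.

    The log-odds (logit) is a linear combination of the independent
    variables; the logistic function maps it to a probability in (0, 1).
    """
    logit = intercept + sum(b * x for b, x in zip(coefficients, values))
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical values for illustration only (not estimated from data):
# an intercept, then one coefficient each for revision time (hours)
# and test anxiety (score).
intercept = -4.0
coefficients = [0.25, -0.03]

# One hypothetical student: 20 hours of revision, anxiety score of 30.
p = predicted_probability(intercept, coefficients, [20, 30])
print(f"Predicted probability of passing: {p:.3f}")
```

Whatever values the independent variables take, the result always lies between 0 and 1, which is why this model suits a dichotomous dependent variable.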

For example, you could use binomial logistic regression to understand whether exam performance can be predicted based on revision time, test anxiety and lecture attendance (i.e., where the dependent variable is "exam performance", measured on a dichotomous scale – "passed" or "failed" – and you have three independent variables: "revision time", "test anxiety" and "lecture attendance"). Alternatively, you could use binomial logistic regression to understand whether drug use can be predicted based on prior criminal convictions, drug use amongst friends, income, age and gender (i.e., where the dependent variable is "drug use", measured on a dichotomous scale – "yes" or "no" – and you have five independent variables: "prior criminal convictions", "drug use amongst friends", "income", "age" and "gender").

This "quick start" guide shows you how to carry out binomial logistic regression using SPSS Statistics, as well as interpret and report the results from this test. However, before we introduce you to this procedure, you need to understand the different assumptions that your data must meet in order for binomial logistic regression to give you a valid result. We discuss these assumptions next.

Assumptions

When you choose to analyse your data using binomial logistic regression, part of the process involves checking to make sure that the data you want to analyse can actually be analysed using a binomial logistic regression. You need to do this because it is only appropriate to use a binomial logistic regression if your data "passes" seven assumptions that are required for binomial logistic regression to give you a valid result. In practice, checking for these seven assumptions just adds a little bit more time to your analysis, requiring you to click a few more buttons in SPSS Statistics when performing your analysis, as well as think a little bit more about your data, but it is not a difficult task.

Before we introduce you to some of these assumptions, do not be surprised if, when analysing your own data using SPSS Statistics, one or more of these assumptions is violated (i.e., not met). This is not uncommon when working with real-world data rather than textbook examples, which often only show you how to carry out binomial logistic regression when everything goes well! However, don’t worry. Even when your data fails certain assumptions, there is often a solution to overcome this. First, let's take a look at some of these assumptions:

Assumption #1: Your dependent variable should be measured on a dichotomous scale. Examples of dichotomous variables include gender (two groups: "males" and "females"), presence of heart disease (two groups: "yes" and "no"), personality type (two groups: "introversion" or "extroversion"), body composition (two groups: "obese" or "not obese"), and so forth. However, if your dependent variable was not measured on a dichotomous scale, but a continuous scale instead, you will need to carry out multiple regression, whereas if your dependent variable was measured on an ordinal scale, ordinal regression would be a more appropriate starting point.
Assumption #2: You have one or more independent variables, which can be either continuous (i.e., an interval or ratio variable) or categorical (i.e., an ordinal or nominal variable). Examples of continuous variables include revision time (measured in hours), intelligence (measured using IQ score), exam performance (measured from 0 to 100), weight (measured in kg), and so forth. Examples of ordinal variables include Likert items (e.g., a 7-point scale from "strongly agree" through to "strongly disagree"), amongst other ways of ranking categories (e.g., a 3-point scale explaining how much a customer liked a product, ranging from "Not very much" to "Yes, a lot"). Examples of nominal variables include gender (e.g., 2 groups: male and female), ethnicity (e.g., 3 groups: Caucasian, African American and Hispanic), profession (e.g., 5 groups: surgeon, doctor, nurse, dentist, therapist), and so forth. You can learn more about variables in our article: Types of Variable.
Assumption #3: You should have independence of observations and the dependent variable should have mutually exclusive and exhaustive categories.
Assumption #4: There needs to be a linear relationship between any continuous independent variables and the logit transformation of the dependent variable. In our enhanced binomial logistic regression guide, we show you how to: (a) use the Box-Tidwell (1962) procedure to test for linearity; and (b) interpret the SPSS Statistics output from this test and report the results.
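
As a rough illustration of the idea behind the Box-Tidwell procedure mentioned in assumption #4, the minimal Python sketch below builds the extra term that the procedure relies on: each continuous independent variable multiplied by its own natural log. The function name and the revision-time values are hypothetical, and the sketch only constructs the term; fitting the model would still be done in SPSS Statistics or another package.

```python
import math

def box_tidwell_term(values):
    """Return x * ln(x) for each value of a continuous independent variable.

    The natural log is only defined for positive numbers, so the function
    raises an error rather than silently producing invalid terms.
    """
    if any(x <= 0 for x in values):
        raise ValueError("Box-Tidwell terms require strictly positive values")
    return [x * math.log(x) for x in values]

# Hypothetical revision times in hours (illustrative data only).
revision_time = [1.0, 2.5, 4.0, 10.0]
revision_time_ln_term = box_tidwell_term(revision_time)

for x, term in zip(revision_time, revision_time_ln_term):
    print(f"x = {x:5.1f}   x*ln(x) = {term:7.3f}")
```

The new term is entered into the logistic regression alongside the original variable. As the procedure is commonly described, a statistically significant coefficient for the x*ln(x) term suggests that the relationship with the logit is not linear.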

You can check assumption #4 using SPSS Statistics. Assumptions #1, #2 and #3 should be checked first, before moving on to assumption #4. We suggest testing these assumptions in this order because it represents an order where, if a violation of the assumption is not correctable, you will no longer be able to use a binomial logistic regression (although you may be able to run another statistical test on your data instead). Just remember that if you do not run the statistical tests on these assumptions correctly, the results you get when running binomial logistic regression might not be valid. This is why we dedicate a number of sections of our enhanced binomial logistic regression guide to help you get this right. You can find out about our enhanced content as a whole on our Features: Overview page, or more specifically, learn how we help with testing assumptions on our Features: Assumptions page.
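
Assumption #1 is largely a matter of study design, but a quick sanity check on the recorded data can still catch coding errors before any analysis. The minimal Python sketch below tests whether a variable contains exactly two distinct categories. The function name and the example values are hypothetical, and treating None as a missing value is an assumption of this sketch.

```python
def is_dichotomous(values):
    """Return True if the non-missing values fall into exactly two categories."""
    # None is treated as a missing value (an assumption of this sketch).
    categories = {v for v in values if v is not None}
    return len(categories) == 2

# Hypothetical exam results, including one missing value.
exam_performance = ["passed", "failed", "passed", None, "failed"]
print(is_dichotomous(exam_performance))

# A variable with three categories would fail the check.
grade_band = ["low", "medium", "high", "low"]
print(is_dichotomous(grade_band))
```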

In the section, Test Procedure in SPSS Statistics, we illustrate the SPSS Statistics procedure to perform a binomial logistic regression assuming that no assumptions have been violated. First, we introduce the example that is used in this guide.