A binomial logistic regression is used to predict a dichotomous dependent variable based on one or more continuous or nominal independent variables. It is the most common type of logistic regression and is often simply referred to as logistic regression. However, in Minitab they refer to it as binary logistic regression. In many ways a binomial logistic regression can be considered as a multiple linear regression, but for a dichotomous rather than a continuous dependent variable.
For example, you could use a binomial logistic regression to understand whether the presence of heart disease can be predicted from physical activity level, cholesterol concentration, glucose concentration and body composition. Heart disease is the dichotomous dependent variable (i.e., presence of heart disease is either "Yes" or "No"). Physical activity level (in minutes per week), cholesterol concentration (mmol/L) and glucose concentration (mmol/L) are continuous independent variables and body composition is a nominal independent variable (i.e., with three groups: "Normal", "Overweight" and "Obese"). Another example where you could use a binomial logistic regression is to understand whether the premature failure of a new type of light bulb (i.e., before its one year warranty) can be predicted from the total duration the light is on for, the number of times the light is switched on and off, and the temperature of the ambient air. In this case, premature failure is the dichotomous dependent variable (i.e., the light bulb fails within its one year warranty: "Yes" or "No"). The other three variables used to predict the light bulb failure are all continuous independent variables: the total duration the light is on for (in minutes), the number of times the light is switched on and off and the ambient air temperature (in °C).
In this guide, we show you how to carry out a binomial logistic regression using Minitab, as well as interpret and report the results from this test. However, before we introduce you to this procedure, you need to understand the different assumptions that your data must meet in order for a binomial logistic regression to give you a valid result. We discuss these assumptions next.
Note: We do not currently have a premium version of this guide in the subscription part of our website. If you would like us to add a premium version of this guide, please contact us.
Binomial logistic regression has seven assumptions. You cannot test the first two of these assumptions with Minitab because they relate to your study design and choice of variables. However, you should check whether your study meets these assumptions before moving on. If these assumptions are not met, there is likely to be a different statistical test that you can use instead. Assumptions #1 and #2 are explained below:
Note: Ordinal independent variables can be used, but they must be treated as either continuous or nominal variables. However, you can treat some ordinal variables as continuous and some as nominal; they do not all have to be treated the same. Examples of ordinal variables include Likert items (e.g., a 7-point scale from "strongly agree" through to "strongly disagree"), amongst other ways of ranking categories (e.g., a 3-point scale explaining how much a customer liked a product, ranging from "Not very much", to "It is OK", to "Yes, a lot").
Assumptions #3, #4, #5 and #6 relate to the nature of your data and can be checked using Minitab. You have to check that your data meets these assumptions because if it does not, the results you get when running a binomial logistic regression might not be valid. In fact, do not be surprised if your data violates one or more of these assumptions. This is not uncommon. However, there are possible solutions to correct such violations (e.g., transforming your data) such that you can still use binomial logistic regression. Assumptions #3, #4, #5 and #6 are explained below:
In practice, checking for assumptions #3, #4, #5 and #6 will probably take up most of your time when carrying out a binomial logistic regression. However, it is not a difficult task, and Minitab provides all the tools you need to do this.
In the section, Test Procedure in Minitab, we illustrate the Minitab procedure required to perform binomial logistic regression assuming that no assumptions have been violated. First, we set out the example we use to explain the binomial logistic regression procedure in Minitab.
A marathon is a very hard race and many who have never ran a marathon before do not finish. A sport scientist is interested in reducing this dropout rate by discovering what might predict whether a first-time marathon runner quits the race. In order to do this a researcher randomly interviewed many finishers and non-finishers, who were also first-time marathon runners, at a number of marathon races across the world. They asked how long they had been training for the marathon, whether they were running for a charity, their age and whether the marathon was considered a 'prestigious' marathon (e.g., the London Marathon, which draws huge crowds).
Therefore, in this example, the dichotomous dependent variable is finished_race, which has two categories: "Yes" and "No". The length of training prior to the marathon was a continuous independent variable, training_duration (in months), and participants' age was also a continuous independent variable, age (in years). Whether a participant was running for a charity was a dichotomous independent variable, charity, with two categories: "Yes" and "No". In total, 203 first-time runners were recruited.
Note: The example and data used for this guide are fictitious. We have just created them for the purposes of this guide.
In Minitab, we entered our four variables into the first fours columns (, , and ). Under column we entered the name of the dichotomous dependent variable, finished_race, as follows: . Then, under column we entered the name of the continuous independent variable, training_duration, as follows: . Next, under column we entered the name of the dichotomous independent variable, charity, as follows: . In the final column, , we entered the name of the continuous independent variable, age, as follows: . The data setup is shown below:
Published with written permission from Minitab Inc.
Note: It does not matter which order you enter the variables into Minitab.
In this section, we show you how to analyse your data using a binomial logistic regression in Minitab when the six assumptions set out in the Assumptions section have not been violated. Therefore, the six steps required to run a binomial logistic regression in Minitab are shown below:
Click Stat > Regression > Binary Logistic Regression > Fit Binary Logistic Model... on the top menu, as shown below:
Published with written permission from Minitab Inc.
You will be presented with the following Binary Logistic Regression dialogue box:
Published with written permission from Minitab Inc.
Transfer the dichotomous dependent variable, C1 finished_race, into the Response: box. Then, transfer the continuous independent variables – C2 training_duration and C4 age – into the Continuous predictors: box. Finally, transfer the categorical independent variable, C3 charity, into the Categorical predictors: box. You will end up with the following dialogue box:
Published with written permission from Minitab Inc.
Note: To transfer the various variables, you first need to click inside the various boxes (e.g., the Response: box) and all eligible variables that can be transferred will appear in the main left-hand box (e.g., C1 finished_race). This will activate the button (it is usually faded: ). Since the Response: box is where you put your dependent variable, you need to select the appropriate variable in the main left-hand box and either press the button or simply double-click on the variable (i.e., C1 finished_race in our example). You now need to follow the same procedure, but for the independent variables.
Click the button. You will be presented with the Binary Logistic Regression: Results dialogue box, as shown below:
Published with written permission from Minitab Inc.
Change the Display of results: option to and the Coefficients: option to . You will be presented with the following:
Published with written permission from Minitab Inc.
Click the button. You will be returned to the Binary Logistic Regression dialogue box.
Click the button. This will generate the results.
You will notice that there is a lot of output produced by Minitab after you have run the binary logistic regression procedure. We summarize some of the most important parts of the output, as shown below:
This output provides three important pieces of information:
In this example, the Hosmer-Lemeshow test is not statistically significant (p = .721), which indicates that the model fits the data well. The p-values for the training_duration, charity and age coefficients indicate that only training duration (p < .0005) and age (p = .022) are statistically significant predictors of dropout in a marathon race amongst first-time runners.
Note: A Classification Table is very useful to produce, but is not produced automatically by Minitab. Nonetheless, it can be produced in Minitab by selecting the correct options in the binary logistic regression procedure and following these up with further tests. Producing this table will allow you to calculate percentage accuracy in classification (PAC), Sensitivity, Specificity, positive predictive value and negative predictive value, all potentially useful measures in evaluating your data.
When you report the output of your binomial logistic regression, it is good practice to include:
Based on the Minitab output above, you could report the results as follows:
A binomial logistic regression was run to understand the effects of training duration, running for charity and age on dropout in a marathon race for first-time runners. The Hosmer-Lemeshow test showed that the model fitted the data well, p = 721. Both time spent training for the marathon (p < .0005) and a runner's age (p = .0022) statistically significantly predicted dropout. However, running for a charity did not statistically significantly predict dropout, p = .373.
In addition to reporting the results as above, a diagram can be used to visually present your results. This can make it easier for others to understand your results. Furthermore, you can use Minitab to make predictions about dropout (the dependent variable) based on values you define for your independent variables. This is a separate procedure available in Minitab that you can use once you have run the binary logistic regression procedure.