
Two-way ANOVA in Stata

Introduction

The two-way ANOVA compares the mean differences between groups that have been split on two independent variables (called factors). The primary purpose of a two-way ANOVA is to understand if there is an interaction between the two independent variables on the dependent variable.

For example, you could use a two-way ANOVA to understand whether there is an interaction between educational level and degree type on salary. Here, your dependent variable would be "salary", measured on a continuous scale in US dollars, and your independent variables would be "educational level", which has three groups ("undergraduate", "master's" and "PhD"), and "degree type", which has five groups ("business studies", "psychology", "biological sciences", "engineering" and "law"). Alternatively, you could use a two-way ANOVA to understand whether there is an interaction between physical activity level and gender on blood cholesterol concentration in children. Here, your dependent variable would be "blood cholesterol concentration", measured on a continuous scale in mmol/L, and your independent variables would be "physical activity level", which has three groups ("low", "moderate" and "high"), and "gender", which has two groups ("males" and "females").

Note: If you have three independent variables rather than two, you need a three-way ANOVA.

If you have a statistically significant interaction between your two independent variables on the dependent variable, you can follow up this result by determining whether there are any "simple main effects", and if there are, what these effects are (e.g., perhaps females with a university education had a greater interest in politics than males with a university education). We come back to "simple main effects" later.

In this "quick start" guide, we show you how to carry out a two-way ANOVA using Stata, as well as interpret and report the results from this test. However, before we introduce you to this procedure, you need to understand the different assumptions that your data must meet in order for a two-way ANOVA to give you a valid result. We discuss these assumptions next.

Stata

Assumptions

There are six "assumptions" that underpin the two-way ANOVA. If any of these six assumptions are not met, you cannot analyze your data using a two-way ANOVA because you will not get a valid result. Since assumptions #1, #2 and #3 relate to your study design and choice of variables, they cannot be tested for using Stata. However, you should decide whether your study meets these assumptions before moving on.

Fortunately, you can check assumptions #4, #5 and #6 using Stata. We suggest testing them in this order because, if a violation of an assumption cannot be corrected, you will no longer be able to use a two-way ANOVA. Do not be surprised if your data fails one or more of these assumptions; this is fairly typical when working with real-world data rather than textbook examples, which often only show you how to carry out a two-way ANOVA when everything goes well. Even when your data fails certain assumptions, there is often a solution (e.g., transforming your data or using another statistical test instead). Just remember that if you do not check that your data meets these assumptions, or you test for them incorrectly, the results you get when running a two-way ANOVA might not be valid.

In practice, checking for assumptions #4, #5 and #6 will probably take up most of your time when carrying out a two-way ANOVA. However, it is not a difficult task, and Stata provides all the tools you need to do this.
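As a rough sketch of what such checks might look like, the commands below use the variable names from the example later in this guide (Int_Politics, Gender and Edu_Level) and the assumptions as summarised in the Output section (no significant outliers, approximate normality and homogeneity of variances). The helper variable cell is hypothetical, and this is only one possible set of checks, not necessarily the procedure this guide has in mind:

```stata
* Hypothetical helper: one identifier per Gender x Edu_Level cell
egen cell = group(Gender Edu_Level), label

* Assumption #4 (outliers): box plots of the outcome within each cell
graph box Int_Politics, over(Edu_Level) over(Gender)

* Assumption #5 (normality): Shapiro-Wilk test run separately in each cell
bysort cell: swilk Int_Politics

* Assumption #6 (homogeneity of variances): Levene's test (reported as W0)
robvar Int_Politics, by(cell)
```

With small cells (10 participants each in the example), formal tests have limited power, so it may be worth reading the plots alongside the test results.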

In the section, Test Procedure in Stata, we illustrate the Stata procedure required to perform a two-way ANOVA assuming that no assumptions have been violated. First, we set out the example we use to explain the two-way ANOVA procedure in Stata.


Example

A researcher was interested in whether an individual's interest in politics was influenced by their level of education and gender. Therefore, the dependent variable was "interest in politics", and the two independent variables were "gender" and "level of education".

In particular, the researcher wanted to know whether there was an interaction between education level and gender. Put another way, was the effect of level of education on interest in politics different for males and females?

To answer this question, a random sample of 60 participants was recruited to take part in the study: 30 males and 30 females, equally split by level of education (school, college and university), giving 10 participants in each group. Each participant in the study completed a questionnaire that scored their interest in politics on a scale of 0 to 100, with higher scores indicating a greater interest in politics.

Participants' interest in politics was recorded in the variable, Int_Politics, their gender in the variable, Gender, and their level of education in the variable, Edu_Level. In variable terms, the researcher wanted to know if there was an interaction between Gender and Edu_Level on Int_Politics.


Setup in Stata

In Stata, we separated the individuals into their appropriate groups by using two columns representing the two independent variables, and labelled them Gender and Edu_Level. For Gender, we coded "Male" as 1 and "Female" as 2, and for Edu_Level, we coded "School" as 1, "College" as 2 and "University" as 3. The participants' interest in politics – the dependent variable – was entered under the variable name, Int_Politics. The setup for this example can be seen below:

Managing value labels within the data editor for the two-way ANOVA in Stata

Published with written permission from StataCorp LP.

The scores for the independent variables, Edu_Level and Gender, as well as the scores for the dependent variable, Int_Politics, were then entered into the Data Editor (Edit) spreadsheet, as shown below:

Data editor for the two-way ANOVA in Stata

Published with written permission from StataCorp LP.
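The coding described above can also be set up from the Command box rather than the Data Editor. The following is a minimal sketch, assuming the three variables are already in memory as numeric variables; the label names gender_lbl and edu_lbl are hypothetical and can be anything you like:

```stata
* Attach value labels to the numeric codes used in this example
label define gender_lbl 1 "Male" 2 "Female"
label values Gender gender_lbl

label define edu_lbl 1 "School" 2 "College" 3 "University"
label values Edu_Level edu_lbl

* Cross-tabulate the two factors to confirm 10 participants per group
tabulate Gender Edu_Level
```

The cross-tabulation is a quick way to confirm that the data were entered with the intended design before running the analysis.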


Test Procedure in Stata

In this section, we show you how to analyze your data using a two-way ANOVA in Stata when the six assumptions in the previous section, Assumptions, have not been violated. You can carry out a two-way ANOVA using either code or Stata's graphical user interface (GUI); both routes are set out below. After you have carried out your analysis, we show you how to interpret your results.


Code

In the first section below, we set out the code to carry out a two-way ANOVA. All code is entered into Stata's Command box, as illustrated below:

Command box in Stata


The code to run a two-way ANOVA on your data takes the form:

anova DependentVariable FirstIndependentVariable##SecondIndependentVariable

Using our example where the dependent variable is Int_Politics and the two independent variables are Gender and Edu_Level, the required code would be:

anova Int_Politics Gender##Edu_Level

Therefore, enter the code and press the "Return/Enter" key on your keyboard.

Command box for the two-way ANOVA in Stata

The Stata output that this command produces is explained in the section, Output of the two-way ANOVA in Stata. If there is a statistically significant interaction, you can carry out simple main effects. We discuss this later.
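For readers unfamiliar with Stata's factor-variable notation, the ## operator is shorthand for both main effects plus their interaction. As an illustration, the following command writes the example model out term by term and should fit the same model as the ## version:

```stata
* Same two-way model with main effects and interaction written out
anova Int_Politics Gender Edu_Level Gender#Edu_Level
```

A single # requests only the interaction term, which is why the two main effects have to be listed separately in this form.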


Graphical user interface (GUI)

  1. Click Statistics > Linear models and related > ANOVA/MANOVA > Analysis of variance and covariance on the top menu as shown below:
    Main menu for the two-way ANOVA in Stata



    You will be presented with the following anova - Analysis of variance and covariance dialogue box:
    Two-way ANOVA main options box


  2. Select the dependent variable, Int_Politics, from within the Dependent variable: drop-down box, and click on the three-dot button to the far right of the Model: drop-down box.
    Two-way ANOVA main options box



    You will be presented with the following Create varlist with factor variables dialogue box:
    Two-way ANOVA main options box


  3. Keep the Factor variable option selected in the –Type of variable– area. In the –Add factor variable– area, select the 2-way full factorial option from within the Specification: drop-down box. You will be presented with a second Variables drop-down box, as shown below:
    Two-way ANOVA main options box


  4. For Variable 1:, select Gender under the Variables drop-down box and default under the Base drop-down box. For Variable 2:, select Edu_Level under the Variables drop-down box and default under the Base drop-down box. Next, click on the Add to varlist button, which will add the Model term, Gender##Edu_Level, to the Varlist: box.
    Two-way ANOVA main options box


    Note: We have not ticked the check box under c. for either of our two independent variables, Gender or Edu_Level. Ticking it would apply Stata's c. prefix, which treats the variable as continuous. We leave it unticked because Assumption #2 of a two-way ANOVA is that both independent variables are categorical (i.e., "factor variables" in Stata); that is, Gender has two categories (i.e., Male and Female), whilst Edu_Level has three categories (i.e., School, College and University).

  5. Click on the OK button. You will be presented with the anova - Analysis of variance and covariance dialogue box, but now with the Model term, Gender##Edu_Level, having been added in the Model: box, as shown below:
    Two-way ANOVA main options box


  6. Click on the OK button. This will generate the Stata output for the two-way ANOVA, shown in the next section.


Output of the two-way ANOVA in Stata

If your data passed assumption #4 (i.e., there were no significant outliers), assumption #5 (i.e., your dependent variable was approximately normally distributed for each combination of the groups of the two independent variables) and assumption #6 (i.e., there was homogeneity of variances), which we explained earlier in the Assumptions section, you will only need to interpret the following Stata output for the two-way ANOVA:

Output for the two-way ANOVA


The Gender, Edu_Level and Gender#Edu_Level rows in the output above explain whether we have statistically significant effects for our two independent variables, Gender and Edu_Level, and for their interaction, Gender#Edu_Level.

We first look at the Gender#Edu_Level interaction because this is the most important result we are after. We can see from the Prob > F column that the interaction is statistically significant (p = .0016). You may wish to report the results of Gender and Edu_Level as well. We can see from the output above that there was no statistically significant difference in interest in politics between genders (p = .4987), but there were statistically significant differences between educational levels (p < .0005).

Finally, if you have a statistically significant interaction, you will also need to report simple main effects; that is, the effect of one of the independent variables at a particular level of the other independent variable. In our example, this would involve determining the mean difference in interest in politics between genders at each educational level, as well as between educational levels for each gender (e.g., perhaps females with a university education had a greater interest in politics than males with a university education). Alternatively, if you do not have a statistically significant interaction, you might report the main effects instead. Both the simple main effects and main effects can be calculated using Stata.
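One possible way to obtain simple main effects is with Stata's contrast and margins postestimation commands. The sketch below assumes Stata 12 or later and the example variables from this guide; it is an illustration rather than the exact procedure used to produce the results reported later:

```stata
* Fit the two-way model first; the commands below use its stored results
anova Int_Politics Gender##Edu_Level

* Simple main effects of Gender at each level of Edu_Level
contrast Gender@Edu_Level

* Simple main effects of Edu_Level within each Gender
contrast Edu_Level@Gender

* Estimated cell means, with an optional plot of those means
margins Gender#Edu_Level
marginsplot
```

By default, contrast does not adjust for multiple comparisons; an option such as mcompare(bonferroni) can be added if an adjustment is wanted.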


Reporting the results of a two-way ANOVA

When you report the output of your two-way ANOVA, it is good practice to include:

Based on the Stata output above, we could report the results of this study as follows (N.B., we have also included an example of simple main effects):


A two-way ANOVA was run on a sample of 60 participants to examine the effect of gender and education level on interest in politics. There was a significant interaction between the effects of gender and education level on interest in politics, F(2, 52) = 7.33, p = .0016. Simple main effects analysis showed that males were significantly more interested in politics than females when educated to university level (p = .002), but there were no differences between genders when educated to school (p = .465) or college level (p = .793).
