The chi-square goodness-of-fit test is a single-sample nonparametric test, also referred to as the one-sample goodness-of-fit test or Pearson's chi-square goodness-of-fit test. It is used to determine whether the distribution of cases (e.g., participants) in a single categorical variable (e.g., "gender", consisting of two groups: "males" and "females") follows a known or hypothesised distribution (e.g., a distribution that is "known", such as the proportion of males and females in a country; or a distribution that is "hypothesised", such as the proportion of males versus females that we anticipate voting for a particular political party in the next elections). The proportion of cases expected in each group of the categorical variable can be equal or unequal (e.g., we may anticipate an "equal" proportion of males and females voting for the Republican Party, or an "unequal" proportion, with 70% of those voting for the Republican Party being male and only 30% female).
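As a minimal sketch of what "equal" and "unequal" expected proportions mean in practice, the Python snippet below converts hypothesised proportions into expected counts. The function name expected_counts and the sample size of 200 voters are illustrative assumptions; they are not part of this guide's example or of SPSS.

```python
def expected_counts(n_cases, proportions):
    """Return the expected number of cases per group for hypothesised proportions."""
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    return [n_cases * p for p in proportions]


# Hypothetical sample of 200 voters (an illustrative figure only).
n_voters = 200

# "Equal" expected proportions: 50% male, 50% female.
equal = expected_counts(n_voters, [0.5, 0.5])

# "Unequal" expected proportions: 70% male, 30% female.
unequal = expected_counts(n_voters, [0.7, 0.3])

print("equal:", equal)
print("unequal:", unequal)
```

The test then compares the counts you actually observe against whichever set of expected counts you hypothesised.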
When you carry out a chi-square goodness-of-fit test, "hypothesising" whether you expect the proportion of cases in each group of your categorical variable to be "equal" or "unequal" is critical. Not only is it an important aspect of your research design, but from a practical perspective, it will determine how you carry out the chi-square goodness-of-fit test in SPSS, as well as how you interpret and write up your results.
In this "quick start" guide, we show you how to carry out a chi-square goodness-of-fit test using SPSS when you have "equal" expected proportions (e.g., you anticipated an "equal" proportion of males and females voting for the Republican Party). In addition, we explain how to interpret the results from this test. However, if you have "unequal" expected proportions (e.g., you anticipated 70% of those voting for the Republican Party being male and only 30% female), we show you how to do this in our enhanced chi-square goodness-of-fit guide, which you can learn about here. Therefore, assuming that you would like to know the SPSS procedure and interpretation of the chi-square goodness-of-fit test when you have equal expected proportions, you first need to understand the different assumptions that your data must meet in order for a chi-square goodness-of-fit test to give you a valid result. We discuss these assumptions next.
When you choose to analyse your data using the chi-square goodness-of-fit test, part of the process involves checking to make sure that the data you want to analyse can actually be analysed using a chi-square goodness-of-fit test. You need to do this because it is only appropriate to use a chi-square goodness-of-fit test if your data meets four assumptions that are required for a chi-square goodness-of-fit test to give you a valid result. In practice, checking for these assumptions is a relatively simple process, only requiring you to use SPSS. Let's take a look at these four assumptions:
Therefore, before proceeding, check that your study design meets assumptions #1, #2 and #3. Assuming it does, you will now need to check that your data meets assumption #4, which you can do using SPSS. We explain how to test for assumption #4 and how to interpret the SPSS output in our enhanced chi-square goodness-of-fit guide to help you get this right. You can find out about our enhanced content as a whole here, or more specifically, learn how we help with testing assumptions here.
In the section, Procedure, we illustrate the SPSS procedure required to perform a chi-square goodness-of-fit test assuming that no assumptions have been violated and when you have equal expected proportions. First, we set out the example we use to explain the chi-square goodness-of-fit procedure in SPSS.
A website owner, Christopher, wants to offer a free gift to people that purchase a subscription to his website. New subscribers can choose one of three gifts of equal value: a gift voucher, a cuddly toy or free cinema tickets. After 1000 people have signed up, Christopher wants to review the figures to see if the three gifts offered were equally popular.
In this case, the three gifts – a gift voucher, a cuddly toy or free cinema tickets – reflect the three groups of the categorical variable, gift_type. The 1000 people that have signed up reflect the "cases" (i.e., cases can be anything from "people", to "animals", "objects", "organisations", and so forth).
There are two methods of entering data into SPSS in order to run a chi-square goodness-of-fit test. Common to both methods is a column in the SPSS data file for the categorical variable, which, in this example, we shall name gift_type. We have assigned codes of "1" for the gift certificate, which we labelled "Gift Certificate", "2" for the cuddly toy, which we labelled "Cuddly Toy", and "3" for the free cinema tickets, which we labelled "Cinema Tickets". If the frequency data has already been summated for the various categories, we need to create a second column that contains the respective frequency counts; we have called this variable frequency. This type of data entry is shown below:
Published with written permission from SPSS, IBM Corporation.
Note: If you have entered your data in this way, you cannot run the chi-square goodness-of-fit test without first "weighting" your cases. This is a procedure that tells SPSS that you have summated your categories. It is required because it changes the way that SPSS deals with your data in order to run the chi-square goodness-of-fit test. If you are unsure how to weight your cases, we show you how to do this in our enhanced chi-square goodness-of-fit guide.
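Conceptually, weighting tells the software to treat each summated row as if it were repeated as many times as its frequency count. The Python sketch below illustrates that idea only; it is not how SPSS implements weighting internally. It uses the observed counts reported in the output later in this guide, and the names summated and expand_weighted are hypothetical.

```python
from collections import Counter

# Summated data: one row per category with its frequency count
# (counts taken from the observed frequencies reported in this guide).
summated = [
    ("Gift Certificate", 370),
    ("Cuddly Toy", 230),
    ("Cinema Tickets", 400),
]


def expand_weighted(rows):
    """Expand (category, frequency) rows into one entry per case."""
    return [category for category, frequency in rows for _ in range(frequency)]


# Equivalent raw-form data: one entry per person who signed up.
raw_cases = expand_weighted(summated)

# Tallying the raw-form data recovers the summated counts.
tallied = Counter(raw_cases)

print(len(raw_cases))
print(dict(tallied))
```

Without weighting, a summated file would be read as three cases rather than 1000, which is why the test cannot be run on it directly.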
Alternatively, you may have the data in raw form (i.e., you have not summated the frequencies). In this case, you do not need a second column as SPSS can calculate the frequencies of occurrence of each category for you. This would mean that, in this example, there are 1000 rows of data, the first few of which are shown below:
If you are still unsure how to enter your data accurately into the Data View and Variable View of SPSS, we show you how to do this in our enhanced chi-square goodness-of-fit test guide. You can learn about our enhanced data setup content in general here.
The four steps below show you how to analyse your data using a chi-square goodness-of-fit test in SPSS when you have hypothesised equal expected proportions (N.B., if you are unclear about the differences between equal and unequal expected proportions, see the Introduction). It is important to note that this procedure will only give you the correct results if you have set up your data correctly in SPSS. If you have entered the summated frequencies for each group of your categorical variable, this procedure will only work if you have already "weighted" your cases, as we explained in the Data Setup section earlier. If you have entered all of your data into SPSS in raw form, no weighting is needed, since SPSS calculates the frequencies for you. In our enhanced chi-square goodness-of-fit test guide, we show all the SPSS procedures for when you have equal and unequal expected proportions, as well as when you have to weight your cases or have not summated your data. If you only need to follow this "quick start" guide for equal expected proportions (without the weighting of cases), the four steps you need are shown below. At the end of these four steps, we show you how to interpret the results from this test.
Click Analyze > Nonparametric Tests > Legacy Dialogs > Chi-square... on the top menu as shown below:
Note: If you are on older versions of SPSS, you will not have to go through the Legacy Dialogs menu.
You will be presented with the Chi-square Test dialogue box, as shown below:
Transfer the gift_type variable into the Test Variable List: box by using the arrow button, as shown below:
Keep the All categories equal option selected in the –Expected Values– area as we are assuming equal proportions for each category.
Click the OK button to generate the output.
The SPSS output that is generated for the chi-square goodness-of-fit test will depend on whether you have hypothesised that the proportion of cases expected in each group of the categorical variable is equal or unequal. For a complete explanation of the output you have to interpret for the chi-square goodness-of-fit test for both scenarios (as well as the testing of Assumption #4), you can access our enhanced chi-square goodness-of-fit guide, as well as all of our other SPSS guides here. Below we show you the SPSS output when you have hypothesised that the proportion of cases expected in each group of your categorical variable is equal.
The table below, gift_type, provides the observed frequencies (Observed N) for each gift, as well as the expected frequencies (Expected N), which are the frequencies expected if the null hypothesis is true. The difference between the observed and expected frequencies is provided in the Residual column.
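To make the three columns concrete, the following Python sketch recomputes the expected frequencies and residuals from the observed counts reported in this guide (370, 230 and 400). It is an illustration outside SPSS, and the variable names are hypothetical.

```python
# Observed frequencies (Observed N) reported in this guide.
observed = {"Gift Certificate": 370, "Cuddly Toy": 230, "Cinema Tickets": 400}

n_cases = sum(observed.values())

# Equal expected proportions: every gift has the same expected frequency.
expected_n = n_cases / len(observed)

# Residual = observed frequency minus expected frequency.
residuals = {gift: count - expected_n for gift, count in observed.items()}

for gift, count in observed.items():
    print(f"{gift}: observed={count}, expected={expected_n:.1f}, "
          f"residual={residuals[gift]:.1f}")
```

With 1000 cases and three gifts, each expected frequency is about 333.3, so a negative residual (as for the cuddly toy) means a gift was chosen less often than the null hypothesis predicts.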
The table below, Test Statistics, provides the actual result of the chi-square goodness-of-fit test. We can see from this table that our test statistic is statistically significant: χ²(2) = 49.4, p < .0005. Therefore, we can reject the null hypothesis and conclude that there are statistically significant differences in the preference of the type of sign-up gift, with fewer people preferring the "Cuddly Toy" (N = 230) compared to either the "Gift Certificate" (N = 370) or the "Cinema Tickets" (N = 400).
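The reported statistic can be checked by hand. The sketch below, written in Python rather than SPSS, sums (observed − expected)² / expected over the three gifts and derives a p-value. It relies on a shortcut that holds only when there are 2 degrees of freedom, where the upper-tail probability of the chi-square distribution is exp(−x/2); for other degrees of freedom a statistics library would be needed. Variable names are hypothetical.

```python
import math

# Observed frequencies: Gift Certificate, Cuddly Toy, Cinema Tickets.
observed = [370, 230, 400]

n_cases = sum(observed)
n_groups = len(observed)

# Equal expected proportions: the same expected frequency in every group.
expected = [n_cases / n_groups] * n_groups

# Pearson's chi-square statistic.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom = number of groups minus one.
df = n_groups - 1

# Shortcut valid for df == 2 only: P(X > x) = exp(-x / 2).
if df != 2:
    raise ValueError("the closed-form p-value used here assumes df == 2")
p_value = math.exp(-chi_square / 2)

print(f"chi-square({df}) = {chi_square:.1f}, p = {p_value:.2e}")
```

The result agrees with the Test Statistics table: the statistic is 49.4 on 2 degrees of freedom, and the p-value is far below .0005.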
In our enhanced chi-square goodness-of-fit test guide, we show you how to write up the results if you need to report this in a dissertation/thesis, assignment or research report. We do this using the Harvard and APA styles. You can learn more about our enhanced content here.