Principal Components Analysis (PCA) using SPSS Statistics

Introduction

Principal components analysis (PCA, for short) is a variable-reduction technique that shares many similarities with exploratory factor analysis. Its aim is to reduce a larger set of variables into a smaller set of 'artificial' variables, called 'principal components', which account for most of the variance in the original variables.
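
To make "account for most of the variance" concrete, here is a minimal sketch in Python (it is not SPSS Statistics output). It assumes just two made-up items, item_a and item_b. For two standardised variables with correlation r, the eigenvalues of their correlation matrix are 1 + |r| and 1 - |r|, so the first principal component accounts for (1 + |r|) / 2 of the total variance. The function and variable names are our own and purely illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical scores for two questionnaire items (made up for illustration).
item_a = [1, 2, 3, 4, 5, 6, 7]
item_b = [2, 1, 4, 3, 6, 5, 7]

r = pearson_r(item_a, item_b)

# Eigenvalues of the 2 x 2 correlation matrix [[1, r], [r, 1]].
eigenvalues = [1 + abs(r), 1 - abs(r)]

# Share of the total variance carried by the first principal component.
share_pc1 = eigenvalues[0] / sum(eigenvalues)

print(f"r = {r:.3f}")                          # r = 0.893
print(f"PC1 share of variance = {share_pc1:.1%}")  # 94.6%
```

The more strongly the two items correlate, the closer this share gets to 100%, which is why highly correlated variables are good candidates for reduction.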

There are a number of common uses for PCA:

(a) You have measured many variables (e.g., 7-8 variables, represented as 7-8 questions/statements in a questionnaire) and you believe that some of the variables are measuring the same underlying construct (e.g., depression). If these variables are highly correlated, you might want to include only those variables in your measurement scale (e.g., your questionnaire) that you feel most closely represent the construct, removing the others.

(b) You want to create a new measurement scale (e.g., a questionnaire), but are unsure whether all the variables you have included measure the construct you are interested in (e.g., depression). Therefore, you test whether the construct you are measuring 'loads' onto all (or just some) of your variables. This helps you understand whether some of the variables you have chosen are not sufficiently representative of the construct you are interested in, and should be removed from your new measurement scale.

(c) You want to test whether an existing measurement scale (e.g., a questionnaire) can be shortened to include fewer items (e.g., questions/statements), perhaps because such items may be superfluous (i.e., more than one item may be measuring the same construct) and/or there may be the desire to create a measurement scale that is more likely to be completed (i.e., response rates tend to be higher in shorter questionnaires).

These are just some of the common uses of PCA. It is also worth noting that whilst PCA is conceptually different to factor analysis, in practice it is often used interchangeably with factor analysis, and is included within the 'Factor procedure' in SPSS Statistics.

In this "quick start" guide, we show you how to carry out PCA using SPSS Statistics, as well as the steps you'll need to go through to interpret the results from this test. However, before we introduce you to this procedure, you need to understand the different assumptions that your data must meet in order for PCA to give you a valid result. We discuss these assumptions next.

SPSS Statistics

Assumptions of a principal components analysis (PCA)

When you choose to analyse your data using PCA, part of the process involves checking to make sure that the data you want to analyse can actually be analysed using PCA. You need to do this because it is only appropriate to use PCA if your data "passes" five assumptions that are required for PCA to give you a valid result. In practice, checking for these assumptions requires you to use SPSS Statistics to carry out a few more tests, as well as think a little bit more about your data, but it is not a difficult task.

Before we introduce you to these five assumptions, do not be surprised if, when analysing your own data using SPSS Statistics, one or more of these assumptions is violated (i.e., not met). This is not uncommon when working with real-world data rather than textbook examples. However, even when your data fails certain assumptions, there is often a solution to try and overcome this. First, let’s take a look at these five assumptions:

You can check assumptions #2, #3, #4 and #5 using SPSS Statistics. Just remember that if you do not run the statistical tests on these assumptions correctly, the results you get when running PCA might not be valid. This is why we dedicate a number of articles in our enhanced guides to help you get this right. You can find out about our enhanced content as a whole on our Features: Overview page, or more specifically, learn how we help with testing assumptions on our Features: Assumptions page.

In the section, Procedure, we illustrate the SPSS Statistics procedure that you can use to carry out PCA on your data. First, we introduce the example that is used in this guide.

Example used in this guide

A company director wanted to hire another employee for his company and was looking for someone who would display high levels of motivation, dependability, enthusiasm and commitment (i.e., these are the four constructs we are interested in). In order to select candidates for interview, he prepared a questionnaire consisting of 25 questions that he believed would help identify the right candidates. He administered this questionnaire to 315 potential candidates. The questions were phrased so that each of these qualities was represented. Questions Qu3, Qu4, Qu5, Qu6, Qu7, Qu8, Qu12 and Qu13 were associated with motivation; Qu2, Qu14, Qu15, Qu16, Qu17, Qu18 and Qu19 with dependability; Qu20, Qu21, Qu22, Qu23, Qu24 and Qu25 with enthusiasm; and Qu1, Qu9, Qu10 and Qu11 with commitment. The director wanted to determine a score for each candidate so that these scores could be used to grade the potential recruits.
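
If you want to keep track of which question belongs to which construct outside SPSS Statistics, the allocation described above can be written down as a simple lookup. The sketch below is in Python. The names constructs, all_items and item_to_construct are our own assumptions; the question numbers are taken from the example.

```python
# Question numbers per construct, as described in the example.
constructs = {
    "motivation": [3, 4, 5, 6, 7, 8, 12, 13],
    "dependability": [2, 14, 15, 16, 17, 18, 19],
    "enthusiasm": [20, 21, 22, 23, 24, 25],
    "commitment": [1, 9, 10, 11],
}

# Every question from Qu1 to Qu25 should appear exactly once.
all_items = sorted(q for items in constructs.values() for q in items)

# Reverse lookup: variable name (e.g. "Qu9") -> construct.
item_to_construct = {
    f"Qu{q}": name for name, items in constructs.items() for q in items
}

print(len(all_items), "questions in total")
print("Qu9 belongs to:", item_to_construct["Qu9"])
```

A lookup like this is handy later on, when you compare the components that PCA extracts against the constructs the questions were written for.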

Data setup in SPSS Statistics

In our enhanced PCA guide, we show you how to correctly enter data in SPSS Statistics to run a PCA. You can learn about our enhanced data setup content on our Features: Data Setup page. Alternatively, see our generic, "quick start" guide: Entering Data in SPSS Statistics.

SPSS Statistics procedure to carry out a principal components analysis (PCA)

The 18 steps below show you how to analyse your data using PCA in SPSS Statistics when none of the five assumptions in the previous section, Assumptions, have been violated. At the end of these 18 steps, we show you how to interpret the results from your PCA. If you are looking for help to make sure your data meets assumptions #2, #3, #4 and #5, which are required when using PCA, and can be tested using SPSS Statistics, we help you do this in our enhanced content (see our Features: Overview page to learn more).

The SPSS Statistics procedure for PCA is not linear (i.e., only if you are lucky will you be able to run through the following 18 steps and accept the output as your final results). You will often have to re-run these 18 steps based on (a) the results from your assumptions tests that are run during this procedure and (b) the values of the initial components that are extracted when you carry out these 18 steps. In re-running your analysis, you may have to select different options from the SPSS Statistics procedure below, or follow additional SPSS Statistics procedures to arrive at the best possible solution. We explain more about these next steps in the Output section later. First, follow the 18 steps below to attain your initial SPSS Statistics output:

Note: The procedure that follows is identical for SPSS Statistics versions 18 to 29, as well as the subscription version of SPSS Statistics, with version 29 and the subscription version being the latest versions of SPSS Statistics. However, in version 27 and the subscription version, SPSS Statistics introduced a new look to their interface called "SPSS Light", replacing the previous look for versions 26 and earlier versions, which was called "SPSS Standard". Therefore, if you have SPSS Statistics versions 27 to 29 (or the subscription version of SPSS Statistics), the images that follow will be light grey rather than blue. However, the procedure is identical.

  1. Click Analyze > Dimension Reduction > Factor... on the main menu, as shown below:
    Menu for a principal components analysis (PCA) in SPSS Statistics

    Published with written permission from SPSS Statistics, IBM Corporation.


    You will be presented with the Factor Analysis dialogue box below:
    'Factor Analysis' dialogue box for a principal components analysis (PCA) in SPSS. Variables on the left

    Published with written permission from SPSS Statistics, IBM Corporation.

  2. Transfer all the variables you want included in the analysis (Qu1 through Qu25, in this example), into the Variables: box by using the Right arrow button, as shown below:
    'Factor Analysis' dialogue box for a principal components analysis (PCA) in SPSS Statistics. Variables transferred on the right

    Published with written permission from SPSS Statistics, IBM Corporation.

  3. Click on the Descriptives button. You will be presented with the Factor Analysis: Descriptives dialogue box, as shown below:
    'Factor Analysis: Descriptives' dialogue box for a principal components analysis (PCA) in SPSS

    Published with written permission from SPSS Statistics, IBM Corporation.

  4. In addition to the option that is already selected by default (i.e., Initial solution in the –Statistics– area), also check Coefficients, KMO and Bartlett's test of sphericity, Reproduced and Anti-image from the –Correlation Matrix– area. You will end up with the following screen:
    'Factor Analysis: Descriptives' dialogue box for a principal components analysis (PCA) in SPSS. Options selected

    Published with written permission from SPSS Statistics, IBM Corporation.

  5. Click on the Continue button. You will be returned to the Factor Analysis dialogue box.
  6. Click on the Extraction button and you will be presented with the Factor Analysis: Extraction dialogue box, as shown below:
    'Factor Analysis: Extraction' dialogue box for a principal components analysis (PCA) in SPSS

    Published with written permission from SPSS Statistics, IBM Corporation.

  7. Keep all the defaults, but also select Scree plot in the –Display– area, as shown below:
    'Factor Analysis: Extraction' dialogue box for a principal components analysis (PCA) in SPSS. 'Scree plot' selected

    Published with written permission from SPSS Statistics, IBM Corporation.

  8. Click on the Continue button. You will be returned to the Factor Analysis dialogue box.
  9. Click on the Rotation button and you will be presented with the Factor Analysis: Rotation dialogue box, as shown below:
    'Factor Analysis: Rotation' dialogue box for a principal components analysis (PCA) in SPSS

    Published with written permission from SPSS Statistics, IBM Corporation.

  10. Select the Varimax option in the –Method– area. This will activate the Rotated solution option in the –Display– area, which should be checked by default (if not, make sure it is selected). Also select Loading plot(s) in the –Display– area. You will end up with a screen similar to the one below:
    'Factor Analysis: Rotation' dialogue box for a principal components analysis (PCA) in SPSS. Options selected

    Published with written permission from SPSS Statistics, IBM Corporation.


    Although not necessary in this guide, you are free to choose other rotation options to best achieve 'simple structure' (discussed later). The most common alternative is Direct Oblimin, which is an oblique transformation.
  11. Click on the Continue button. You will be returned to the Factor Analysis dialogue box.
  12. Click on the Scores button. You will be presented with the Factor Analysis: Factor Scores dialogue box, as shown below:
    'Factor Analysis: Factor Scores' dialogue box for a principal components analysis (PCA) in SPSS

    Published with written permission from SPSS Statistics, IBM Corporation.

  13. Check the Save as variables option and then keep the Regression option selected. You will end up with a screen similar to below:
    'Factor Analysis: Factor Scores' dialogue box for a principal components analysis (PCA) in SPSS. Options selected

    Published with written permission from SPSS Statistics, IBM Corporation.

  14. Click on the Continue button. You will be returned to the Factor Analysis dialogue box.
  15. Click on the Options button. You will be presented with the Factor Analysis: Options dialogue box, as shown below:
    'Factor Analysis: Options' dialogue box for a principal components analysis (PCA) in SPSS

    Published with written permission from SPSS Statistics, IBM Corporation.

  16. Check the Sorted by size and Suppress small coefficients options. Change the value in Absolute value below: from ".10" to ".3". You will end up with a screen similar to the one below:
    'Factor Analysis: Options' dialogue box for a principal components analysis (PCA) in SPSS. Options selected

    Published with written permission from SPSS Statistics, IBM Corporation.

  17. Click on the Continue button. You will be returned to the Factor Analysis dialogue box.
  18. Click on the OK button to generate the output.
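
The two options selected in step 16 only change how the table of loadings is displayed; they do not change the analysis itself. The following Python sketch imitates both options on a small table of hypothetical loadings. The numbers and the helper names dominant, sort_by_size and suppress_small are made up for illustration, and the sorting rule is our approximation of Sorted by size, not the exact algorithm used by SPSS Statistics.

```python
# Hypothetical rotated loadings: item -> [component 1, component 2].
# These values are invented; they are not output from the guide's data.
loadings = {
    "Qu3": [0.72, 0.12],
    "Qu14": [0.08, 0.81],
    "Qu4": [0.65, -0.25],
    "Qu15": [0.31, 0.58],
    "Qu5": [-0.44, 0.05],
}
CUTOFF = 0.3  # the "Absolute value below" setting from step 16


def dominant(row):
    """Return (component index, absolute loading) of the largest |loading|."""
    index = max(range(len(row)), key=lambda j: abs(row[j]))
    return index, abs(row[index])


def sort_by_size(table):
    """Group items by their dominant component, largest |loading| first."""
    def sort_key(entry):
        component, size = dominant(entry[1])
        return (component, -size)
    return sorted(table.items(), key=sort_key)


def suppress_small(row, cutoff=CUTOFF):
    """Blank out loadings whose absolute value is below the cutoff."""
    return ["" if abs(value) < cutoff else f"{value:.2f}" for value in row]


display_rows = [(item, suppress_small(row)) for item, row in sort_by_size(loadings)]

for item, cells in display_rows:
    print(f"{item:<6}" + "".join(f"{cell:>8}" for cell in cells))
```

With small loadings blanked out and items grouped by component, it is much easier to see at a glance which items belong together.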

Analysing the results of a principal components analysis (PCA)

The output generated by SPSS Statistics is quite extensive and can provide a lot of information about your analysis. However, you will often find that the analysis is not yet complete and you will have to re-run the SPSS Statistics analysis above (possibly more than once) before you get to your final solution. Below we briefly explain the seven steps that you will need to follow to interpret your PCA results, and where required, perform additional analysis in SPSS Statistics. We take you through all these sections step-by-step with SPSS Statistics output in our enhanced PCA guide. You can learn more about our enhanced content on our Features: Overview page. First, take a look through these seven steps:

If you are unsure how to interpret your PCA results, or how to check for linearity, carry out transformations using SPSS Statistics, or conduct additional PCA procedures in SPSS Statistics such as Forced Factor Extraction (see Step #4), we show you how to do this in our enhanced PCA guide. We also show you how to write up the results from your assumptions tests and PCA output if you need to report this in a dissertation/thesis, assignment or research report. We do this using the Harvard and APA styles. You can learn more about our enhanced content on our Features: Overview page.
