The Pearson product-moment correlation, often shortened to Pearson correlation or Pearson's correlation, is used to assess the strength and direction of association between two continuous variables that are linearly related. Its coefficient, r, indicates the strength and direction of this relationship and can range from -1 for a perfect negative linear relationship to +1 for a perfect positive linear relationship. A value of 0 (zero) indicates that there is no linear relationship between the two variables (a non-linear relationship may still exist).
For example, you could use a Pearson's correlation to understand whether there is an association between test performance and revision time (i.e., your two variables would be "test performance", measured as the exam mark achieved, and "revision time", measured in hours per week). If there was a strong, positive association, we could say that more time spent revising was associated with higher test performance. Alternatively, you could use a Pearson's correlation to understand whether there is an association between blood pressure and time spent exercising (i.e., your two variables would be "blood pressure", measured in mmHg, and "time spent exercising", measured in hours per week). If there was a moderate, negative association, we could say that exercising more per week is associated with lower blood pressure.
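To make the coefficient r concrete, the sketch below computes it from first principles using the standard definition (the sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations). The function name `pearson_r` and the five data points are hypothetical and chosen only for illustration; they are not the data analysed later in this guide.

```python
import math


def pearson_r(x, y):
    """Return Pearson's r for two equal-length sequences of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each variable
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for illustration only (not the guide's data set)
revision_hours = [2, 4, 6, 8, 10]
exam_marks = [50, 55, 65, 70, 80]

r = pearson_r(revision_hours, exam_marks)
print(f"r = {r:.3f}")
```

A value close to +1, as here, reflects marks that rise almost in a straight line with revision time; reversing the order of one list would give the same magnitude with a negative sign.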
In this guide, we show you how to carry out a Pearson's correlation using Minitab, as well as interpret and report the results from this test. However, before we introduce you to this procedure, you need to understand the different assumptions that your data must meet in order for a Pearson's correlation to give you a valid result. We discuss these assumptions next.
A Pearson's correlation has four assumptions. You cannot test the first of these assumptions with Minitab because it relates to your study design and choice of variables. However, you should check whether your study meets this assumption before moving on. If this assumption is not met, there is likely to be a different statistical test that you can use instead. Assumption #1 is explained below:
Assumption #1: Your two variables should be continuous (e.g., an exam mark measured from 0 to 100, or revision time measured in hours).
Note: If either of your two variables were measured on an ordinal scale, you need to use Spearman's correlation instead of Pearson's correlation. Examples of ordinal variables include Likert scales (e.g., a 7-point scale from "strongly agree" through to "strongly disagree"), amongst other ways of ranking categories (e.g., a 5-point scale for measuring job satisfaction, ranging from "most satisfied" to "least satisfied"; or a 3-point scale explaining how much a customer liked a product, ranging from "Not very much" to "Yes, a lot").
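For readers curious how the Spearman alternative relates to Pearson's correlation, the sketch below shows the usual construction: convert each variable to ranks (tied values share the average of their ranks) and then compute Pearson's r on those ranks. The function names and the Likert-style responses are hypothetical, and this is an illustration of the idea rather than the procedure Minitab runs internally.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of tied values starting at position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's correlation: Pearson's r computed on the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sxx = sum((a - mean_x) ** 2 for a in rx)
    syy = sum((b - mean_y) ** 2 for b in ry)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical 7-point Likert responses from six respondents
agreement = [1, 2, 2, 4, 5, 7]
satisfaction = [2, 1, 3, 4, 6, 7]
print(f"rho = {spearman_rho(agreement, satisfaction):.3f}")
```

Because only the ordering of the values matters, any strictly increasing relationship (linear or not) yields a Spearman coefficient of 1.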
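Assumptions #2, #3 and #4 relate to the nature of your data and can be checked using Minitab. You have to check that your data meets these assumptions because if it does not, the results you get when running a Pearson's correlation might not be valid. In fact, do not be surprised if your data violates one or more of these assumptions. This is not uncommon. However, there are possible solutions to correct such violations (e.g., transforming your data) such that you can still use a Pearson's correlation. Assumptions #2, #3 and #4 are explained below:
Assumption #2: There needs to be a linear relationship between your two variables, which you can check with a scatterplot.
Assumption #3: There should be no significant outliers, which you can check with the same scatterplot.
Assumption #4: Your two variables should be approximately normally distributed, which you can check with the Shapiro-Wilk test for normality.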
In practice, checking for assumptions #2, #3 and #4 will probably take up most of your time when carrying out a Pearson's correlation. However, it is not a difficult task and Minitab provides all the tools you need to do this.
In the section, Test Procedure in Minitab, we illustrate the Minitab procedure required to perform a Pearson's correlation assuming that no assumptions have been violated. First, we set out the example we use to explain the Pearson's correlation procedure in Minitab.
An educator wants to determine whether students' exam scores were related to revision time. For example, as students spent more time revising, did their exam score also increase (i.e., a positive relationship) or did the opposite happen (i.e., a negative relationship)? As such, scores in a maths exam were measured on a scale from 0 to 100 and the amount of time spent revising was measured in hours.
To carry out the analysis, the researcher recruited 40 students. The amount of time revising (i.e., the variable, Revision time) and the exam scores (i.e., the variable, Exam score) were recorded for all 40 participants. Expressed in variable terms, the researcher wanted to correlate Exam score and Revision time. A Pearson's correlation was used to determine whether there was a statistically significant relationship between exam score and revision time.
Note: The example and data used for this guide are fictitious. We have just created them for the purposes of this guide.
In Minitab, we entered our two variables into the first two columns (C1 and C2). Under column C1 we entered the name of one of the two variables, Exam score. Then, under column C2 we entered the name of the second of our two variables, Revision time. Finally, we entered the scores for Exam score into the C1 column and the scores for Revision time into the C2 column. This is illustrated below:
Published with written permission from Minitab Inc.
Note: It does not matter which of your two variables you enter under C1 or C2. We have just entered the data into Minitab this way in our example.
In this section, we show you how to analyse your data using a Pearson's correlation in Minitab when the four assumptions in the previous section, Assumptions, have not been violated. The three steps required to run a Pearson's correlation in Minitab are shown below:
Click Stat > Basic Statistics > Correlation... on the top menu, as shown below:
You will be presented with the following Correlation dialogue box:
Note: In Minitab 17 you will be presented with a slightly different dialogue box, as shown below:
If you are using Minitab 17, you need to make sure that "Pearson correlation" is selected as the Method: option.
Transfer your two variables, Exam score and Revision time from the main left-hand box (i.e., C1 Exam score and C2 Revision time) into the Variables: box. You will end up with the dialogue box shown below:
Note: To transfer the two variables, you first need to click inside the main left-hand box and select both variables (e.g., C1 Exam score and C2 Revision time). This will activate the button used to transfer them (it is usually faded). Then, simply press that button. Alternatively, you can double-click on each of the two variables (i.e., C1 Exam score and C2 Revision time), which will automatically transfer them into the Variables: box.
Click the OK button. The output that Minitab produces is shown below.
The Minitab output for a Pearson's correlation is shown below:
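The output contains two important pieces of information: the Pearson correlation coefficient (the "Pearson correlation of Exam score and Revision Time" line) and its p-value (the "P-Value" line).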
In this example, Pearson's correlation coefficient is 0.853 (i.e., the "Pearson correlation of Exam score and Revision Time = 0.853" line), which indicates a strong positive correlation between the two variables, Exam score and Revision time, according to Cohen (1988):
Coefficient Value | Strength of Association
.1 < |r| < .3 | small correlation
.3 < |r| < .5 | medium/moderate correlation
|r| > .5 | large/strong correlation
where |r| means the absolute value of r (e.g., |r| > .5 means r > .5 or r < -.5). Therefore, the Pearson correlation coefficient in this example (r = .853) suggests a strong correlation. If, instead, r = -.853, you would also have had a strong correlation, albeit a negative one. This result shows that more time spent revising is associated with better (i.e., higher) exam scores. You can also see that Pearson's correlation coefficient is statistically significant as p < .0005 (i.e., the "P-Value = 0.000" line, which actually means p < .0005), which satisfies the common cut-off for statistical significance of p < .05.
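The strength bands above can be expressed as a small lookup, sketched below. Two choices here are assumptions rather than part of Cohen's guidelines as quoted in this guide: values falling exactly on a boundary (.3 or .5) are placed in the higher band, and values below .1 are labelled as falling below the smallest band. The function names are hypothetical.

```python
def strength_of_association(r):
    """Classify |r| using the bands quoted in this guide.

    Assumption: boundary values (.3, .5) go to the higher band, and
    anything under .1 is reported as below the smallest band.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)
    if size >= 0.5:
        return "large/strong"
    if size >= 0.3:
        return "medium/moderate"
    if size >= 0.1:
        return "small"
    return "below the smallest band"


def direction(r):
    """Describe the sign of the coefficient."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# The coefficient from the Minitab output discussed in this guide
print(strength_of_association(0.853), direction(0.853))
# The same magnitude with the opposite sign is equally strong
print(strength_of_association(-0.853), direction(-0.853))
```

Keeping strength and direction as separate functions mirrors the way the coefficient is interpreted: the absolute value determines the band, while the sign alone determines the direction.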
Note: We present the output from the Pearson's correlation above. However, since you should have tested your data for the assumptions we explained earlier in the Assumptions section, you will also need to interpret the Minitab output that was produced when you tested for these assumptions. This includes: (a) the scatterplots you used to check if there was a linear relationship between your two variables (i.e., Assumption #2); (b) the same scatterplots that you will have used to check there were no significant outliers (i.e., Assumption #3); and (c) the Shapiro-Wilk test for normality to check whether your two variables were approximately normally distributed (i.e., Assumption #4). Also, remember that if your data failed any of these assumptions, the output that you get from the Pearson's correlation procedure (i.e., the output we discuss above) might no longer be valid, and you may have to carry out a different statistical test to analyse your data.
When you report the output of your Pearson's correlation, it is good practice to include:
Based on the results above, we could report the results of this study as follows:
A Pearson's product-moment correlation was run to assess the relationship between exam score and time spent revising in 40 students. There was a strong positive correlation between exam score and time spent revising, r(38) = .853, p < .0005.
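The number in parentheses in r(38) is the degrees of freedom, which for a Pearson's correlation is the sample size minus two (40 - 2 = 38). The sketch below assembles the reported statistic from r and n, and also computes the t statistic that underlies the significance test, t = r * sqrt(n - 2) / sqrt(1 - r^2). The function names are hypothetical, and the p-value text is passed in as read from the Minitab output rather than computed, since Python's standard library has no t-distribution.

```python
import math


def apa_number(value, places=3):
    """Format a statistic that cannot exceed 1 without a leading zero."""
    text = f"{value:.{places}f}"
    # Drop the leading zero only for values strictly between -1 and 1
    return text.replace("0.", ".", 1) if abs(value) < 1 else text


def t_statistic(r, n):
    """t statistic for testing whether the population correlation is zero."""
    if n < 3 or abs(r) >= 1:
        raise ValueError("requires n >= 3 and -1 < r < 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


def report_pearson(r, n, p_text):
    """Build a string such as 'r(38) = .853, p < .0005'."""
    df = n - 2  # degrees of freedom for Pearson's correlation
    return f"r({df}) = {apa_number(r)}, {p_text}"


# Values taken from the example in this guide
print(report_pearson(0.853, 40, "p < .0005"))
print(f"t = {t_statistic(0.853, 40):.2f}")
```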
In addition to reporting the results as above, a diagram (e.g., a scatterplot) can be used to visually present your results. This can make it easier for others to understand your results and is easily produced in Minitab.