
Independent-samples t-test using R, Excel and RStudio

Introduction

The independent-samples t-test, also known as the independent t-test, independent-measures t-test, between-subjects t-test or unpaired t-test, is used to determine whether there is a difference between two independent, unrelated groups (e.g., employed versus unemployed people, males versus females, low versus high anxiety students, etc.) in terms of the mean of a continuous dependent variable (e.g., salary, running speed, exam score, etc.). More specifically, the independent-samples t-test is used to determine whether the mean difference between these two groups is statistically significant.
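To make "statistically significant mean difference" concrete, the following is a minimal sketch in base R that computes the pooled-variance t statistic by hand and compares it with R's built-in `t.test()` function. The data are simulated purely for illustration, and the names `group_a` and `group_b` are hypothetical; the sketch also assumes equal variances in the two groups.

```r
# Simulated (hypothetical) scores for two independent groups
set.seed(1)
group_a <- rnorm(20, mean = 65, sd = 10)
group_b <- rnorm(20, mean = 70, sd = 10)

n_a <- length(group_a)
n_b <- length(group_b)

# Pooled variance: weighted average of the two group variances
pooled_var <- ((n_a - 1) * var(group_a) + (n_b - 1) * var(group_b)) / (n_a + n_b - 2)

# t statistic: mean difference divided by its standard error
t_manual  <- (mean(group_a) - mean(group_b)) / sqrt(pooled_var * (1 / n_a + 1 / n_b))
df_manual <- n_a + n_b - 2

# Two-sided p-value from the t distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# The same test with t.test(); var.equal = TRUE requests the pooled-variance version
result <- t.test(group_a, group_b, var.equal = TRUE)

c(t_manual = t_manual, t_builtin = unname(result$statistic))
c(p_manual = p_manual, p_builtin = result$p.value)
```

The two approaches should agree, which shows that `t.test()` with `var.equal = TRUE` is carrying out the calculation described above.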

For example, you could use an independent-samples t-test to understand whether the amount of time teenagers spend watching television each week differs based on gender (i.e., the dependent variable is "weekly tv time", measured in minutes, and the independent variable is "gender", which has two groups: "males" and "females"). Alternatively, you could use an independent-samples t-test to understand whether there is a difference in 10 km running performance between athletes consuming a carbohydrate drink and athletes consuming water (i.e., the dependent variable is "10 km running performance", measured in minutes and seconds, and the independent variable is "type of drink", which has two groups: "carbohydrate drink" and "water").

In this introductory guide to the independent-samples t-test, we first set out a couple of study designs where the independent-samples t-test is most often used. Next, we set out the assumptions of the independent-samples t-test. Making sure that your study design, variables and data pass these assumptions is critical because if they do not, the independent-samples t-test is likely to be the incorrect statistical test to use. On page 2 of this introductory guide we set out the example we use to illustrate how to carry out an independent-samples t-test using R, before showing how to set up your data using Microsoft Excel, R and RStudio. On page 3 we demonstrate the R code that can be used in RStudio to carry out an independent-samples t-test, including useful descriptive statistics. Finally, on page 4 of this introductory guide we explain how to interpret the main results of the independent-samples t-test where you will determine whether there is a statistically significant difference between your two independent, unrelated groups in terms of the mean of your dependent variable. To continue with this introductory guide, go to the next section.


Study Designs

An independent-samples t-test is most often used to analyse the results of three different types of study design: (a) determining if there is a mean difference between two independent groups; (b) determining if there is a mean difference between two interventions; and (c) determining if there is a mean difference between two change scores (also known as gain scores). To learn more about the first two of these three types of study design where the independent-samples t-test can be used, see the examples below:

Note: Whilst an independent-samples t-test can be used to determine if there is a mean difference between two change scores, a one-way ANCOVA is more commonly recommended.
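As a rough illustration of the two approaches mentioned in this note, the sketch below runs an independent-samples t-test on change scores and, separately, fits an ANCOVA-style model with the pre-test score as a covariate. All data are simulated, and the variable names (`pre`, `post`, `change`, `group`) are hypothetical rather than taken from this guide.

```r
set.seed(42)
n <- 20  # hypothetical number of participants per group

study <- data.frame(
  group = factor(rep(c("control", "intervention"), each = n)),
  pre   = rnorm(2 * n, mean = 50, sd = 8)
)

# Simulated post-test score: depends on the pre-test score plus a group effect
study$post <- 10 + 0.8 * study$pre +
  ifelse(study$group == "intervention", 5, 0) +
  rnorm(2 * n, sd = 5)

# Change (gain) score for each participant
study$change <- study$post - study$pre

# Option 1: independent-samples t-test on the change scores
change_test <- t.test(change ~ group, data = study, var.equal = TRUE)

# Option 2: ANCOVA, modelled here as post-test score adjusted for pre-test score
ancova_fit <- lm(post ~ pre + group, data = study)

change_test$p.value
summary(ancova_fit)$coefficients["groupintervention", ]
```

The `groupintervention` row of the ANCOVA output gives the estimated group difference in post-test scores after adjusting for pre-test scores.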

Difference between two INDEPENDENT GROUPS

Some degree courses include mandatory 1-year internships (also known as placements), which are considered to help students’ job prospects after graduating. Therefore, imagine that a researcher wanted to determine whether students who enrolled in a 3-year degree course that included a mandatory 1-year internship got better graduate salaries than students who did not undertake an internship. The researcher was specifically interested in students who undertook a Finance degree.

A total of 60 first-year graduates who had undertaken a Finance degree were recruited to the study. Of these 60 graduates, 30 had undertaken a 3-year Finance degree that included a mandatory 1-year internship. This group of 30 graduates represented the "internship group". The other 30 had undertaken a 3-year Finance degree that did not include an internship. This group of 30 graduates represented the "no internship group". The first-year graduate salaries of all 60 graduates were recorded in US dollars.

Therefore, in this study the dependent variable was "salary", measured in US dollars, and the independent variable was "course type", which had two independent groups: "internship group" and "no internship group". The two groups were independent because no graduate could be in more than one group and the students in the two groups could not influence each other’s salaries.

The researcher analysed the data collected to determine whether salaries were greater (or smaller) in the internship group compared to the no internship group. An independent-samples t-test was used to determine whether there was a statistically significant difference in the salaries between the internship group and the no internship group.
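A study like this one could be laid out and analysed in R roughly as follows. This is only a sketch: the salaries are simulated (they are not the study's data), the column names `course_type` and `salary` are hypothetical, and equal variances are assumed.

```r
set.seed(123)

# Simulated (hypothetical) salaries in US dollars: 30 graduates per group
salaries <- data.frame(
  course_type = factor(rep(c("internship", "no internship"), each = 30)),
  salary = c(rnorm(30, mean = 42000, sd = 5000),
             rnorm(30, mean = 39000, sd = 5000))
)

# Group sizes
table(salaries$course_type)

# Mean and standard deviation of salary in each group
aggregate(salary ~ course_type, data = salaries,
          FUN = function(x) c(mean = mean(x), sd = sd(x)))

# Two-sided independent-samples t-test (equal variances assumed)
salary_test <- t.test(salary ~ course_type, data = salaries, var.equal = TRUE)
salary_test
```

Each row of the data frame is one graduate, with one column for the independent variable and one for the dependent variable; the formula `salary ~ course_type` relies on this "long" layout.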

Difference between two TREATMENT/EXPERIMENTAL GROUPS

Some parents use financial rewards (i.e., money) as an incentive to encourage their children to get top marks in their exams (e.g., an "A" grade or what might be a score of 80 or more out of 100). Therefore, imagine that an educational psychologist wanted to determine whether financial rewards increased academic performance amongst school children.

A total of 26 students were randomly assigned to one of two groups. In one group, the school children were offered $500 if they got an "A" grade in their maths exam. This is called the "experimental group". In the other group, the school children were not offered anything, irrespective of how well they performed in the same maths exam. This is called the "control group". All 26 students undertook the same maths exam. After the students had taken the maths exam, their scores (between 0 and 100 marks) were recorded.

Therefore, in this study the dependent variable was "exam result", measured from 0 to 100 marks, and the independent variable was "financial reward", which had two independent groups: "experimental group" and "control group". The two groups were independent because no student could be in more than one group and the students in the two groups were unable to influence each other’s exam results.

The researcher analysed the data collected to determine whether the exam results were better (or worse) amongst students in the experimental group compared to the control group. An independent-samples t-test was used to determine whether there was a statistically significant difference in the exam results between the experimental group and control group.
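The random assignment step in a design like this can be sketched in base R as shown below. The example assumes an equal split of 13 students per group, which the description above does not state, and the exam marks are simulated for illustration only. The names `student_id`, `group` and `marks` are hypothetical.

```r
set.seed(2024)

student_id <- 1:26

# Randomly assign 13 students to each group by shuffling the group labels
group <- factor(sample(rep(c("control", "experimental"), each = 13)))

# Simulated (hypothetical) exam marks, kept within the 0 to 100 range
marks <- pmin(pmax(round(rnorm(26, mean = 65, sd = 12)), 0), 100)

exam <- data.frame(student_id, group, marks)

# Confirm the group sizes after assignment
table(exam$group)

# Independent-samples t-test on exam marks (equal variances assumed)
exam_test <- t.test(marks ~ group, data = exam, var.equal = TRUE)
exam_test$estimate
exam_test$p.value
```

Shuffling a fixed set of labels with `sample()` guarantees the intended group sizes, whereas assigning each student independently at random would generally produce unequal groups.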

In this "quick start" guide we show you how to carry out an independent-samples t-test using R, with the help of Microsoft Excel (Excel) and RStudio. We also show you how to interpret and report the results from this test. However, before we show you how to carry out an independent-samples t-test using R, you need to understand the different assumptions that your data must meet for an independent-samples t-test to give you a valid result. We discuss these assumptions in the next section.


Assumptions:
Can I use the independent-samples t-test?

The first and most important step in an independent-samples t-test analysis is to check whether it is appropriate to use this statistical test. After all, the independent-samples t-test will only give you valid/accurate results if your study design and data "pass" six assumptions that underpin the independent-samples t-test.

In many cases, the independent-samples t-test will be the incorrect statistical test to use because your data "violates" (i.e., does not meet) one or more of these assumptions. This is not uncommon when working with real-world data, which is often "messy", as opposed to textbook examples. However, there is often a solution, whether this involves using a different statistical test, or making adjustments to your data so that you can continue to use an independent-samples t-test.

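Before discussing these options further, we briefly set out the six assumptions of the independent-samples t-test, three of which relate to your study design and how you measured your variables (i.e., Assumptions #1, #2 and #3 below), and three of which relate to the characteristics of your data (i.e., Assumptions #4, #5 and #6 below):

Assumption #1: You have one dependent variable that is measured at the continuous level (i.e., the interval or ratio level).
Assumption #2: You have one independent variable that consists of two categorical, independent groups.
Assumption #3: You have independence of observations, which means that there is no relationship between the observations in each group or between the groups themselves.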

Since assumptions #1, #2 and #3 relate to your study design and how you measured your variables, if any of these three assumptions are not met (i.e., if any of these assumptions do not fit with your research), the independent-samples t-test is the incorrect statistical test to analyse your data. It is likely that there will be other statistical tests you can use instead, but the independent-samples t-test is not the correct test.
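Some of these design requirements can be partly sanity-checked once the data are in R. The sketch below assumes that assumptions #1, #2 and #3 refer to a continuous dependent variable, an independent variable with exactly two groups, and independence of observations. The data frame and its column names (`id`, `score`, `group`) are hypothetical.

```r
# Hypothetical data frame: one row per participant
dat <- data.frame(
  id    = 1:6,
  score = c(12.1, 15.3, 14.8, 10.2, 13.5, 16.0),
  group = factor(c("a", "a", "a", "b", "b", "b"))
)

checks <- c(
  dv_is_numeric   = is.numeric(dat$score),      # dependent variable is stored as a number
  iv_has_2_groups = nlevels(dat$group) == 2,    # independent variable has exactly two groups
  one_row_per_id  = anyDuplicated(dat$id) == 0  # no participant appears more than once
)
checks
```

These checks are necessary rather than sufficient. In particular, independence of observations is a property of how the study was designed and run, so a participant appearing only once in the data does not by itself establish it.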

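After checking if your study design and variables meet assumptions #1, #2 and #3, you should now check if your data also meets assumptions #4, #5 and #6 below:

Assumption #4: There are no significant outliers in either group of your independent variable in terms of the dependent variable.
Assumption #5: Your dependent variable is approximately normally distributed in each group of the independent variable.
Assumption #6: You have homogeneity of variances (i.e., the variance of the dependent variable is equal in each group of your independent variable).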

Therefore, before running an independent-samples t-test it is critical that you first check whether your data meets assumptions #4, #5 and #6. In some cases, failure to meet one or more of these assumptions will make the independent-samples t-test the incorrect statistical test to use. In other cases, you may simply have to make some adjustments to your data before continuing to analyse it using an independent-samples t-test.
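The data-related checks could be sketched in base R along the following lines, assuming that assumptions #4, #5 and #6 refer to no significant outliers, approximate normality within each group, and homogeneity of variances. The specific tests used here (the boxplot rule, the Shapiro-Wilk test and the F test of two variances) are common base R choices rather than necessarily the ones used later in this guide, and the data are simulated.

```r
set.seed(7)

# Simulated (hypothetical) data: 25 observations per group
dat <- data.frame(
  group = factor(rep(c("a", "b"), each = 25)),
  score = c(rnorm(25, mean = 50, sd = 10), rnorm(25, mean = 55, sd = 10))
)

# Outliers: values flagged by the boxplot rule (beyond 1.5 x IQR from the box) in each group
outliers <- tapply(dat$score, dat$group, function(x) boxplot.stats(x)$out)

# Normality: Shapiro-Wilk test p-value within each group
normality_p <- tapply(dat$score, dat$group, function(x) shapiro.test(x)$p.value)

# Homogeneity of variances: F test comparing the two group variances
variance_p <- var.test(score ~ group, data = dat)$p.value

outliers
normality_p
variance_p

# If variances look unequal, Welch's t-test (the t.test() default) is one option
welch <- t.test(score ~ group, data = dat)

# If normality looks doubtful, the Mann-Whitney U test is a common alternative
mann_whitney <- wilcox.test(score ~ group, data = dat)
```

Small p-values from the Shapiro-Wilk or F test suggest that the corresponding assumption may not hold, but these tests are sensitive to sample size, so it is usually worth looking at boxplots or Q-Q plots as well.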

When you are confident that your data has met all six assumptions described above, you can carry out an independent-samples t-test to determine whether there is a difference between the two groups of your independent variable in terms of the mean of your dependent variable. In the sections that follow we show you how to do this using R (with Excel and RStudio), based on the example we set out on the next page.
