
Independent-samples t-test using R, Excel and RStudio (page 2)

On the previous page you learnt about the type of research where an independent-samples t-test can be used and the critical assumptions of the independent-samples t-test that your study design, variables and data must meet in order for the independent-samples t-test to be the correct statistical test for your analysis. On this page, we set out the example we use to illustrate how to carry out an independent-samples t-test using R, before showing how to set up your data using Microsoft Excel, R and RStudio. Therefore, start by reading the example we use throughout this introductory guide in the next section.

R and RStudio

Example

A researcher wanted to know whether exercise could improve a person’s cardiovascular health. One measure of cardiovascular health is the concentration of cholesterol in the blood, measured in mmol/L, where lower cholesterol concentrations are associated with improved cardiovascular health. For example, a cholesterol concentration of 3.57 mmol/L would be associated with better cardiovascular health compared to a cholesterol concentration of 6.04 mmol/L.

In this fictitious study, the researcher recruited 21 participants who were classified as being "sedentary" (i.e., they engaged in only low daily activity and did not exercise). These 21 participants were randomly assigned to one of two groups. One group underwent an exercise intervention where participants took part in a 6-month exercise programme consisting of four 1-hour exercise sessions per week. This experimental group was called the "exercise" group. The other group continued with their typical daily activities (i.e., they remained "sedentary"). This group was called the "control" group. After 6 months, the cholesterol concentration of participants (in mmol/L) was measured in the exercise group and the control group.

Note: To ensure that the assumption of independence of observations was met, as discussed earlier, participants could only be in one of these two groups and the two groups did not have any contact with each other.

To determine whether cardiovascular health had improved as a result of the exercise intervention, the researcher ran an independent-samples t-test to determine whether there was a statistically significant difference in mean cholesterol concentration between the exercise group and the control group.

Therefore, in this study the continuous dependent variable is cholesterol concentration and the categorical (dichotomous) independent variable is exercise trial, which has two groups: "exercise" and "control".
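As a rough sketch of how these two variables might look once they are in R, the snippet below builds a small data frame with a numeric dependent variable and a two-group independent variable, and passes both to R's built-in t.test() function. The cholesterol values are made up purely for illustration and are not the data from this study; the name example_data is also hypothetical.

```r
# Made-up values for illustration only; NOT the data from the study.
example_data <- data.frame(
  cholesterol = c(5.8, 6.1, 5.5, 6.3, 4.9, 5.2, 4.6, 5.0),
  group = factor(rep(c("control", "exercise"), each = 4))
)

# Formula interface: dependent variable ~ independent variable.
# var.equal = TRUE requests the pooled-variance (Student's) t-test;
# the default, var.equal = FALSE, runs Welch's t-test instead.
result <- t.test(cholesterol ~ group, data = example_data, var.equal = TRUE)
print(result)
```

Note that t.test() does not assume equal variances unless you set var.equal = TRUE, so which version you run is a choice you need to make explicitly.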

Setup to run the independent-samples t-test using R

R is a very powerful statistical programming language, but it does not come with a spreadsheet-style interface like Microsoft Excel (called "Excel" from this point forward), IBM SPSS Statistics, Stata, Minitab, and other statistical software. This makes data entry a little more challenging, but there are ways to use Excel and another software package called RStudio to make the process easier. Therefore, the three steps below set out how to set up your data to run an independent-samples t-test using R, with the help of Excel and RStudio. We also briefly explain the alternatives if you do not want to use Excel and RStudio.

In the three sections that follow, we first show you how to create your data set in Excel, then explain how to install the tidyverse R package into R using RStudio, and finally show you how to import your data set from Excel into R using RStudio. If you find any of the following instructions unclear or if there are other guides you would like to see added to Laerd Statistics, please contact us.

STEP ONE
Create your data set in Microsoft Excel

The following five steps will show you how to enter your data in Excel.

Saving your Microsoft Excel file

Setting up your dependent variable

Setting up your independent variable

Entering your data

When entering your data into Excel, each row should only include the data for one case, where a case in our example is a participant. However, in your study a case could be an object, animal, cell, or something else, depending on what you are measuring in your research.

Therefore, in Row 2 of Excel we entered the data for one of our participants, as shown below:

Entering data for your dependent and independent variable in Excel when running an independent-samples t-test

In the example above, the participant in Row 2 of Excel had a cholesterol concentration of 4.56 mmol/L and was in the control group. Therefore, we entered "4.56" into the cell under the dependent variable, "cholesterol", and "control" into the cell under the independent variable, "group".
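To make the "one row per case" layout concrete, here is a minimal sketch of the same structure written as an R data frame. Only the first row (4.56, "control") comes from the example above; the other rows are made-up placeholders, and the name sheet_layout is hypothetical.

```r
# One row per participant, one column per variable.
# Only the first row (4.56, "control") is taken from the guide;
# the remaining rows are made-up placeholders.
sheet_layout <- data.frame(
  cholesterol = c(4.56, 5.10, 4.20),
  group = c("control", "control", "exercise"),
  stringsAsFactors = FALSE
)

print(sheet_layout)

# Count how many cases fall into each group.
table(sheet_layout$group)
```

Each participant occupies exactly one row, so the number of rows equals the number of cases in your study.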

Note: Please enter the name of your two groups using text (e.g., "control" or "exercise") and not numerical coding (e.g., "1" to represent "control" and "2" to represent "exercise"). Whilst it is possible to use numerical coding rather than text, the instructions to set up your data to run an independent-samples t-test using R in RStudio in this guide are based on text and not numerical coding.
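If your data already uses numerical coding, one possible way to convert the codes to text labels in R is with the base factor() function. This is only a sketch: the codes vector below is made up, and it assumes that "1" stands for "control" and "2" for "exercise", as in the note above.

```r
# Made-up numerical codes: 1 = "control", 2 = "exercise".
codes <- c(1, 2, 1, 2, 2)

# Map each code to its text label.
group <- factor(codes, levels = c(1, 2), labels = c("control", "exercise"))

# Show the result as plain text labels.
as.character(group)
```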

STEP TWO
Install the tidyverse package into R using RStudio

The tidyverse R package consists of a number of useful R packages, including readxl, which allows you to import files from Microsoft Excel into R using RStudio. Therefore, in the five steps that follow we show you how to install the tidyverse R package:

Open RStudio

Find the tidyverse R package using RStudio

Install the tidyverse R package into R using RStudio
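If you prefer typing commands to using the RStudio menus, the installation can also be done from the R console. The sketch below assumes you have an internet connection and a working CRAN mirror. Note that readxl is installed together with tidyverse, but it is not attached by library(tidyverse), so it has to be loaded separately.

```r
# Install tidyverse only if it is not already installed
# (requires an internet connection).
if (!requireNamespace("tidyverse", quietly = TRUE)) {
  install.packages("tidyverse")
}

# readxl is installed as part of tidyverse but must be loaded on its own.
library(readxl)
```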

Now that you have successfully installed the tidyverse R package into R you can go to the next section where we show you how to import your data from Excel into R using RStudio.

STEP THREE
Import your data from Microsoft Excel into R using RStudio

Assuming you have set up your data using the format in Step 1 and installed the tidyverse R package in Step 2 in the previous section, you can finally import your data set from Excel into R using RStudio. We show you how to do this in the four steps that follow:

Your data is now set up correctly in RStudio. On the next page we show you how to run an independent-samples t-test using R in RStudio.
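As an alternative to the RStudio import dialog, the import can also be scripted with the read_excel() function from the readxl package. The sketch below is only an illustration: the file name cholesterol.xlsx and the helper check_layout() are hypothetical, and the code assumes the column names "cholesterol" and "group" from Step 1.

```r
# Hypothetical file name and location; point this at your own workbook.
excel_path <- "cholesterol.xlsx"

# Hypothetical helper: checks that a data frame has the layout from Step 1,
# i.e. a numeric "cholesterol" column and a "group" column with two groups.
check_layout <- function(dat) {
  stopifnot(
    all(c("cholesterol", "group") %in% names(dat)),
    is.numeric(dat$cholesterol),
    length(unique(na.omit(dat$group))) == 2
  )
  invisible(TRUE)
}

# Import only if readxl is installed and the file exists, so that the
# sketch runs without error on any machine.
if (requireNamespace("readxl", quietly = TRUE) && file.exists(excel_path)) {
  dat <- readxl::read_excel(excel_path)  # reads the first sheet by default
  check_layout(dat)
  dat$group <- factor(dat$group)         # treat the two groups as categories
  str(dat)
}
```

Running a quick check like this straight after importing helps catch common problems, such as a cholesterol value typed as text or a misspelt group name that creates a third group.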
