Selecting Statistical Tests - PREDICTION

Are you trying to predict a score or a group? You have four options to choose from:

(1) You are predicting a score from a single independent variable. The score (dependent variable) that you are trying to predict has to be a continuous variable (either interval or ratio in nature). Simply put, this test allows you to predict a number; as long as this number represents a continuous measurement, such as height, weight, length or temperature, you are OK. For example, you might be looking to predict blood sugar levels or height jumped. Importantly, you are trying to predict this score based on only one continuous independent variable.

The diagram below presents a general schematic to help you visualize this type of statistical test:

Predicting a score from a single independent variable

http://statistics.laerd.com © Lund Research Ltd 2010

Let us consider an example using the diagram above. We want to know if the time to complete a 10 km run is related to a person's aerobic capacity. In this case, Variable 'B' is the time to run 10 km and Variable 'A' is a person's aerobic capacity. By running this type of statistical test we can calculate the missing values (question marks). This way, we can generate a formula to estimate the time it takes to run 10 km when all we know is a person's aerobic capacity.
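As a rough illustration of how such a formula might be generated, here is a minimal Python sketch. It assumes the test in question is simple linear regression fitted by ordinary least squares (this page does not name the test), and the aerobic capacity values and 10 km times are invented purely for illustration. The names `fit_line` and `predict_time` are hypothetical helpers, not part of any package.

```python
# Simple linear regression by ordinary least squares.
# Assumption: this is the kind of test being described; all data are made up.

def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sum of squares around the means
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope

# Variable 'A': aerobic capacity (hypothetical values, ml/kg/min)
aerobic_capacity = [35, 40, 45, 50, 55, 60]
# Variable 'B': time to run 10 km (hypothetical values, minutes)
run_time = [62, 58, 53, 49, 46, 41]

# The intercept and slope play the role of the "question marks"
intercept, slope = fit_line(aerobic_capacity, run_time)

def predict_time(capacity):
    """Estimate the 10 km time from aerobic capacity using the fitted line."""
    return intercept + slope * capacity

print(f"time = {intercept:.2f} + ({slope:.2f} x aerobic capacity)")
print(f"Estimated 10 km time at a capacity of 48: {predict_time(48):.1f} minutes")
```

With real data you would normally use a statistics package, which also reports how well the line fits; the sketch only shows where the two missing values come from.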

Is this your situation? If not, keep reading...

(2) You are predicting a score from multiple independent variables. The score (dependent variable) that you are trying to predict has to be a continuous variable (either interval or ratio in nature). Simply put, you are looking to predict a number; as long as this number represents a continuous measurement, such as height, weight, length or temperature, you are OK. For example, you might be looking to predict blood sugar levels or height jumped. Importantly, you are trying to predict this score based on more than one independent variable. These can be a combination of various variable types but must include at least one continuous variable.

The diagram below presents a general schematic to help you visualize this type of statistical test:

Predicting a score from multiple independent variables

Let us consider an example using the diagram above. We want to know if the time to complete a 10 km run is related to a person's aerobic capacity, whether they are a runner or not, and their height. In this case, Variable 'D' is the time to run 10 km, Variable 'A' is a person's aerobic capacity, Variable 'B' is whether they are a runner or not, and Variable 'C' is their height. By running this type of statistical test we can calculate the missing values (question marks). This way, we can generate a formula to estimate the time it takes to run 10 km when we know a person's aerobic capacity, running status and height.
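To make the idea of a formula with several inputs more concrete, here is a minimal Python sketch. It assumes the test is multiple linear regression solved through the normal equations (this page does not name the test). The data are synthetic: the 10 km times are generated from a made-up formula so that you can see the fit recover known coefficients. The names `solve`, `fit` and `predict_time` are hypothetical helpers.

```python
# Multiple linear regression via the normal equations (X'X) b = X'y.
# Assumption: this is the kind of test being described; all data are synthetic.

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        # Swap in the row with the largest entry in this column
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back-substitution
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit(rows, y):
    """Return [intercept, b1, b2, ...] that minimise the squared prediction error."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve(xtx, xty)

# Columns: A = aerobic capacity, B = runner (1) or not (0), C = height in cm
people = [(35, 0, 170), (40, 0, 182), (45, 1, 165),
          (50, 1, 178), (55, 0, 174), (60, 1, 169)]

# Variable 'D': 10 km time in minutes, generated from a made-up formula
true_coefs = [90.0, -0.8, -4.0, 0.05]
times = [true_coefs[0] + sum(c * v for c, v in zip(true_coefs[1:], p)) for p in people]

# The fitted coefficients play the role of the "question marks"
coefs = fit(people, times)

def predict_time(capacity, runner, height):
    """Estimate the 10 km time from the three independent variables."""
    return coefs[0] + coefs[1] * capacity + coefs[2] * runner + coefs[3] * height

print([round(c, 3) for c in coefs])
print(round(predict_time(48, 1, 175), 2))
```

Note that the dichotomous variable (runner or not) is simply coded as 1 or 0, which is one common way of mixing variable types in the same formula. Real data will not sit exactly on a formula like this, so a statistics package that also reports the quality of the fit is the better tool in practice.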

Is this your situation? If not, keep reading...

(3) You are predicting a dichotomous group. Here you are trying to determine whether someone or something can be classified into one of two groups. That is, you are predicting a categorical variable with only two possible categories, sometimes presented as an "either/or" choice. For example, you might be trying to predict, based on the information you have, an individual's gender (male/female) or, in another study, phone ownership (yes/no).

The diagram below presents a general schematic to help you visualize this type of statistical test:

Predicting a dichotomous group
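For a feel of what predicting a yes/no outcome can involve, here is a minimal Python sketch. It assumes the test is binomial logistic regression with a single predictor, fitted by plain gradient descent (this page does not name the test or a fitting method). The predictor, an income band from 1 to 8, and the phone ownership values are invented for illustration, and `fit_logistic` and `probability_of_phone` are hypothetical names.

```python
# Binomial logistic regression with one predictor, fitted by gradient descent.
# Assumption: this is the kind of test being described; all data are made up.
import math

def sigmoid(z):
    """Map any number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(x, y, learning_rate=0.1, steps=5000):
    """Return (intercept, slope) for P(y = 1) = sigmoid(intercept + slope * x)."""
    n = len(x)
    mean_x = sum(x) / n
    centred = [xi - mean_x for xi in x]  # centring helps gradient descent converge
    b0, b1 = 0.0, 0.0
    for _ in range(steps):
        errors = [sigmoid(b0 + b1 * xi) - yi for xi, yi in zip(centred, y)]
        b0 -= learning_rate * sum(errors) / n
        b1 -= learning_rate * sum(e * xi for e, xi in zip(errors, centred)) / n
    # Convert the intercept back to the original (uncentred) scale
    return b0 - b1 * mean_x, b1

# Hypothetical predictor: income band (1 = lowest, 8 = highest)
income_band = [1, 2, 3, 4, 5, 6, 7, 8]
# Dichotomous outcome: owns a phone (1 = yes, 0 = no)
owns_phone = [0, 0, 0, 1, 0, 1, 1, 1]

intercept, slope = fit_logistic(income_band, owns_phone)

def probability_of_phone(band):
    """Estimated probability of phone ownership for a given income band."""
    return sigmoid(intercept + slope * band)

for band in (1, 4, 8):
    p = probability_of_phone(band)
    print(band, round(p, 2), "yes" if p >= 0.5 else "no")
```

The result is a probability rather than a score, and a cut-off (0.5 in this sketch) turns it into one of the two groups.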

Is this your situation? If not, keep reading...

(4) You are predicting a discrete group. Here you are trying to determine whether someone or something can be classified into one of several possible groups (more than two). That is, you are predicting a categorical variable that has more than two possible categories. For example, you might be trying to predict, based on the information you have, an individual's diabetes risk category (low/medium/high risk).

The diagram below presents a general schematic to help you visualize this type of statistical test:

Predicting a discrete group.
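As a sketch of how a prediction among more than two groups might work, the Python below assumes multinomial (softmax) logistic regression with a single predictor, fitted by gradient descent. This page does not name the test, and other techniques exist for this situation. The risk-factor score from 1 to 9 and the category labels are invented for illustration, and `fit_softmax` and `predict_category` are hypothetical names.

```python
# Multinomial (softmax) logistic regression with one predictor.
# Assumption: this is one possible test for the situation; all data are made up.
import math

CATEGORIES = ["low", "medium", "high"]

def softmax(scores):
    """Turn one score per category into probabilities that sum to 1."""
    top = max(scores)  # subtracting the maximum avoids overflow
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fit_softmax(x, labels, learning_rate=0.1, steps=5000):
    """Return (intercepts, slopes), one pair per category."""
    n, k = len(x), len(CATEGORIES)
    mean_x = sum(x) / n
    centred = [xi - mean_x for xi in x]
    targets = [[1.0 if lab == c else 0.0 for c in CATEGORIES] for lab in labels]
    b = [0.0] * k
    w = [0.0] * k
    for _ in range(steps):
        grad_b = [0.0] * k
        grad_w = [0.0] * k
        for xi, t in zip(centred, targets):
            probs = softmax([b[j] + w[j] * xi for j in range(k)])
            for j in range(k):
                grad_b[j] += (probs[j] - t[j]) / n
                grad_w[j] += (probs[j] - t[j]) * xi / n
        for j in range(k):
            b[j] -= learning_rate * grad_b[j]
            w[j] -= learning_rate * grad_w[j]
    # Convert the intercepts back to the original (uncentred) scale
    return [bj - wj * mean_x for bj, wj in zip(b, w)], w

# Hypothetical predictor: a risk-factor score from 1 to 9
risk_score = [1, 2, 3, 4, 5, 6, 7, 8, 9]
# Outcome with more than two categories: diabetes risk category
risk_category = ["low", "low", "medium", "low", "medium",
                 "high", "medium", "high", "high"]

intercepts, slopes = fit_softmax(risk_score, risk_category)

def predict_category(score):
    """Return the category with the highest estimated probability."""
    probs = softmax([bj + wj * score for bj, wj in zip(intercepts, slopes)])
    return CATEGORIES[probs.index(max(probs))]

for score in (1, 5, 9):
    print(score, predict_category(score))
```

Each category gets its own probability, and the individual is assigned to whichever category is most likely.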

Is this your situation?