Scatterplots/Bivariate Data

Author: Al Greene

- Demonstrate how to create a scatterplot between two quantitative variables measured on the same case
- Give a few examples where the learner has to determine which variable is the response and which is explanatory
- Present how to interpret a scatterplot (direction, form, strength, and outliers)
- Emphasize that association does not necessarily mean causation
- Introduce the correlation coefficient (r) and how it represents the strength and direction of a linear association
- Discuss how lurking variables can make it appear that there is a relationship between two variables
- Demonstrate how to convert a scatterplot into a histogram, so you can describe the shape, center, and spread of the distribution of the response variable

This packet covers the basics of scatterplots: how to make one, what the relationship between the two variables can mean, how to measure the strength of that relationship with the correlation coefficient, and how a lurking or confounding variable can make it appear that two variables are related.

What's in this packet

This packet includes a video on how to make a scatterplot. We also show you how to interpret the scatterplot, decide which variable is the response and which is explanatory, and explain the relationship between the two variables. Some terms that may be new to you are:


  • Scatterplot
  • Response Variable
  • Explanatory Variable
  • Association
  • Correlation
  • Correlation Coefficient
  • Lurking Variables

Source: Greene


Since the video covers most of the information you need, I will just list a few definitions here for your reference.

Bivariate data - data that has two variables per observation, usually an x and y variable.

Scatterplot - graph displaying bivariate quantitative data, with one variable on the x-axis and the other on the y-axis. Each observation is plotted as a single point.

Response Variable - the variable that is explained by the other.

Explanatory Variable - the variable that explains the other.

Association - The relationship between two variables. It can be positive or negative, strong or weak, and linear or curvilinear.

Correlation Coefficient (r) - The quantitative measure of linear association. This is a number between -1 and 1. A value close to -1 indicates a strong negative linear association, a value close to 1 indicates a strong positive linear association, and a value close to 0 indicates a weak linear association or no linear association.
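To make the definition of r concrete, here is a minimal sketch of how it could be computed by hand in Python. The function name `pearson_r` and the study-hours data are made up for illustration and do not come from the video. The calculation is the usual one: the sum of the products of the deviations from each mean, divided by the square root of the product of the two sums of squared deviations.

```python
import math

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations, and sums of squared deviations
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical bivariate data: one (x, y) pair per student
hours_studied = [1, 2, 3, 4, 5, 6]        # explanatory variable (x)
exam_score = [52, 55, 61, 70, 72, 80]     # response variable (y)

r = pearson_r(hours_studied, exam_score)
print(f"r = {r:.3f}")  # close to 1: strong positive linear association
```

Reversing the order of one list flips the sign of r, which matches the idea that r captures direction as well as strength.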

Lurking variables - variables that are not considered in the bivariate data, yet affect the relationship between the two variables. Also called confounding variables.
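A small simulation can show how a lurking variable produces an association between two variables that do not affect each other. This is only a sketch with simulated data: the variable names (temperature, ice cream sales, sunburns) and all of the numbers are assumptions chosen for illustration, not results from the video.

```python
import math
import random

def pearson_r(xs, ys):
    """Return the correlation coefficient r for paired lists xs and ys."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

random.seed(0)  # fixed seed so the simulated data is repeatable
n = 2000

# Lurking variable: daily temperature (standardized, simulated)
temperature = [random.gauss(0, 1) for _ in range(n)]

# Both observed variables depend on temperature, not on each other
ice_cream_sales = [t + random.gauss(0, 0.5) for t in temperature]
sunburns = [t + random.gauss(0, 0.5) for t in temperature]

# The two observed variables look strongly associated
r_observed = pearson_r(ice_cream_sales, sunburns)

# Subtract the temperature effect from each variable; what is left is
# independent noise, so the association largely disappears
sales_leftover = [s - t for s, t in zip(ice_cream_sales, temperature)]
burns_leftover = [b - t for b, t in zip(sunburns, temperature)]
r_leftover = pearson_r(sales_leftover, burns_leftover)

print(f"r between sales and sunburns:        {r_observed:.2f}")
print(f"r after removing temperature effect: {r_leftover:.2f}")
```

The first value should be large and positive, while the second should be near 0. That is the sense in which association does not necessarily mean causation: the apparent relationship comes from the variable that was left out of the scatterplot.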



This video shows you two variables, x and y, and how to make a scatterplot from them. It also talks about correlation, relationships, and lurking variables.
