Common Core: S.ID.9

Correlation and Causation

Author: Katherine Williams

Describe the relationship between correlation and causation.


Video Transcription

This tutorial covers correlation and causation. It is really important to avoid one big mistake: saying that just because two things are correlated, one is causing the other to happen. Correlation does not imply causation.

So suppose you've picked two variables, you fit a line of best fit for them, and you find out that r is 1. That is a perfect linear correlation between them. They are strongly associated. Even so, you cannot say that one variable causes the other to happen without doing some other tests and making some other assertions.
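As a rough illustration of what "r is 1" means, here is a minimal sketch that computes the correlation coefficient by hand. The data and the helper name `pearson_r` are made up for this example; the points are chosen to lie exactly on a straight line.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, and sums of squared deviations.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up data lying exactly on the line y = 2x + 1.
x = [1, 2, 3, 4, 5]
y = [2 * v + 1 for v in x]

r = pearson_r(x, y)
print(round(r, 6))  # 1.0
```

The calculation only looks at how the numbers move together. Nothing in it knows, or can know, whether x is producing y.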

Correlation is just saying that the two variables or events have a linear association, that they go together: as one increases, the other increases too, or as one increases, the other decreases in a predictable way. Causation is saying that one event or variable causes the other one to happen. And we can only claim causation when we've demonstrated it. We demonstrate it by doing a randomized controlled experiment with a large sample. Absent that randomized controlled experiment with a large sample, you cannot say that there is causation.
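One way to see why random assignment with a large sample matters is a small simulation. This is only a sketch under made-up assumptions: a hypothetical population of 10,000 people, half of whom have some background trait the experimenter may not know about, split into groups by a coin flip.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: each entry records whether a person has some
# background trait (True/False) that could influence the outcome.
population = [random.random() < 0.5 for _ in range(10_000)]

treatment, control = [], []
for has_trait in population:
    # A coin flip decides the group, independent of the trait.
    if random.random() < 0.5:
        treatment.append(has_trait)
    else:
        control.append(has_trait)

share_treatment = sum(treatment) / len(treatment)
share_control = sum(control) / len(control)
imbalance = abs(share_treatment - share_control)

print(f"trait share: treatment={share_treatment:.3f}, control={share_control:.3f}")
print(f"imbalance: {imbalance:.3f}")
```

Because the coin flip ignores the trait, both groups end up with roughly the same share of it, so a difference in outcomes between the groups is hard to blame on that trait.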

Now if you find a strong correlation, one reason we can't say there's causation is that there could be a confounding variable. There could be something lurking off to the side and confusing the relationship between the explanatory variable and the response variable.

So here, for example, say you're looking at the relationship between uniforms and test scores, and you find a strong relationship: students who wear uniforms, or who wear uniforms more often, have higher test scores. Now there could be a lurking variable. There could be a confounding variable, like parental income. Perhaps parents with more money are more likely to send their children to schools that require uniforms, and the extra money is also affecting the students' test scores. The parents are able to provide tutors and things like that. That confounding variable puts a question mark on the relationship between the explanatory variable and the response variable.
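The uniform example can be sketched as a simulation. Every number here is invented for illustration: income is treated as simply high or low, wealthier families are assumed to choose uniform schools more often, and the uniform itself is given no effect on scores at all.

```python
import random

random.seed(1)  # fixed seed so the sketch is repeatable

def mean(values):
    return sum(values) / len(values)

students = []
for _ in range(5000):
    high_income = random.random() < 0.5
    # Assumption: wealthier families more often choose uniform schools.
    uniform = random.random() < (0.8 if high_income else 0.2)
    # Assumption: income raises scores; the uniform itself has NO effect.
    score = 70 + (10 if high_income else 0) + random.gauss(0, 5)
    students.append((high_income, uniform, score))

def uniform_gap(rows):
    """Mean score of uniform wearers minus mean score of non-wearers."""
    with_u = [s for _, u, s in rows if u]
    without_u = [s for _, u, s in rows if not u]
    return mean(with_u) - mean(without_u)

overall_gap = uniform_gap(students)
gap_high = uniform_gap([row for row in students if row[0]])
gap_low = uniform_gap([row for row in students if not row[0]])

print(f"overall gap: {overall_gap:.2f}")
print(f"gap within high income: {gap_high:.2f}")
print(f"gap within low income: {gap_low:.2f}")
```

Across all students, uniform wearers score noticeably higher. Once students are compared only with others at the same income level, the gap nearly vanishes, which is what a confounding variable looks like in data.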

So we know we still have correlation. They are associated with each other. But we don't know that we have causation. It could also be that the causation is reversed. Perhaps you studied whether or not people own minivans and how many children they have, and you find that if you own a minivan, you're more likely to have children.

So is it the minivan that is causing the babies to be born? Or is it the other way around: once you have a lot of babies, you're more likely to buy a minivan in order to drive them around? Because we don't know the direction of the cause and effect, we can't say for sure that it's a causal relationship. We can only say that the two variables are correlated.
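A quick check shows why the correlation alone cannot settle the direction. In this sketch, the household data and the helper name `pearson_r` are hypothetical; the point is that r comes out the same whichever variable is listed first.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient r for two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical households: 1 = owns a minivan, 0 = does not.
minivan = [0, 0, 0, 1, 0, 1, 1, 1]
children = [0, 1, 1, 2, 2, 3, 3, 4]

r_forward = pearson_r(minivan, children)  # minivan as explanatory variable
r_reverse = pearson_r(children, minivan)  # children as explanatory variable

print(math.isclose(r_forward, r_reverse))  # True
```

Since swapping the roles of the two variables leaves r unchanged, the statistic carries no information about which one is the cause.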

This has been your tutorial on correlation and causation.

Terms to Know

Causation

A phenomenon whereby an increase in one variable directly leads to an increase or decrease in another variable.

Correlation

A statistic which measures the strength and direction of the linear association between two quantitative variables.