Hi. This tutorial covers correlation and causation. So let's start with a pretty interesting data set. What we have here are a bunch of different countries, with the life expectancy in each country listed here and the number of people per TV listed here. So we have a bivariate data set.
Notice that the table's just continued here, so these aren't separate data sets. It's just laid out that way. So if we look at a couple of rows: in Angola, the life expectancy is 44 years, and there are 200 people for every TV. So that just means that there are not a lot of TVs in the country.
If we look at Australia, the life expectancy is 76.5, and there are two people per TV. So for every one television, there are two people. So in Australia, there are a lot of televisions in the country. In Cambodia, they live to be about 49 and 1/2 years old, with 177 people per TV. So if you scan the data set, you can see that, generally, countries with a higher life expectancy have a small number of people per TV.
Look at Japan: 79 for life expectancy, 1.8 people per TV. United States: 75.5, 1.3 people per TV. Go to Uganda, though: 51 is the life expectancy, 191 people per TV. So let's take a look at this graph. I also have a best fit line drawn in here. So we can see that there is a negative association.
That is, as the number of people per TV increases, the life expectancy decreases. The r value is negative 0.8, so that's a strong negative association between people per TV and life expectancy. So that's what I have written here: strong negative correlation between people per TV and life expectancy. So now, the important question that we're getting at in this tutorial is this: does that mean that increasing the people per TV ratio, which would mean decreasing the number of TVs in a country, will lower life expectancy?
So if the United States were to get rid of TVs, and so increase its people per TV ratio, would that make the life expectancy go down? If we remove TVs from a country, does that mean that people will not live as long? And of course, the answer to that is no. A strong correlation does not imply causation.
So just because two variables are correlated, that does not mean that one causes a change in the other. Just to make sure we have a good working definition of correlation and causation: correlation is the strength and direction of a linear association between two variables. In this example, we did have a strong negative correlation.
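As a rough illustration of how an r value is computed, here is a minimal Python sketch using only the six countries read out above. The function name `pearson_r` and the variable names are assumptions made for this example. Because it uses just six rows rather than the full table, its result will not match the negative 0.8 quoted for the whole data set.

```python
from math import sqrt

# (country, life expectancy in years, people per TV), as read out in the tutorial
rows = [
    ("Angola", 44.0, 200.0),
    ("Australia", 76.5, 2.0),
    ("Cambodia", 49.5, 177.0),
    ("Japan", 79.0, 1.8),
    ("United States", 75.5, 1.3),
    ("Uganda", 51.0, 191.0),
]


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sums of squared deviations from the means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


people_per_tv = [row[2] for row in rows]
life_expectancy = [row[1] for row in rows]
r = pearson_r(people_per_tv, life_expectancy)
print(f"r = {r:.2f}")  # negative: more people per TV goes with lower life expectancy
```

The sign of r gives the direction of the association and its size gives the strength. Nothing in the calculation says anything about what causes what.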
And then causation, sometimes known as cause and effect, is when variation in an explanatory variable causes variation in the response variable. So in this case, no, a change in people per TV does not cause a change in life expectancy. What was happening here is that we had a lurking variable. A lurking variable will sometimes be responsible for variation in both the explanatory and the response variable.
So in the example, the lurking variable is the wealth of a country. So the wealth of the country affects both life expectancy and people per TV. So if a country is very wealthy, generally, their population will live longer because they'll have better access to health care, better nutrition, better access to clean water, and so forth.
And because they're wealthier, they're also probably going to have more TVs, because TVs are kind of a luxury item that people in wealthier countries will be able to afford. So the wealth of the country, which is the lurking variable, will affect both life expectancy and people per TV. Because you have this lurking variable, it's important that you don't say that there's a cause and effect here.
The best evidence for causation comes from a randomized controlled experiment, rarely from an observational study. So the data that was collected here came from an observational study, because you're just observing life expectancy and then the people per TV ratio. You're not imposing any sort of treatment to look for a change. It was just an observational study.
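To make the lurking-variable point concrete, here is a small simulation sketch in Python. It is not real data. It assumes a made-up model in which a country's wealth drives both people per TV and life expectancy, while TVs have no effect on life expectancy at all. Every name (`simulate`, `pearson_r`) and every coefficient in it is hypothetical. The observational run lets wealth determine the TV level. The randomized run assigns people per TV at random, which is the kind of imposed treatment an experiment would use.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def simulate(n, randomize_tvs, rng):
    """Return (people_per_tv, life_expectancy) lists for n made-up countries."""
    people_per_tv, life_expectancy = [], []
    for _ in range(n):
        wealth = rng.random()  # lurking variable: 0 = poorest, 1 = wealthiest
        if randomize_tvs:
            # Experiment: the TV level is assigned at random, independent of wealth.
            ppt = rng.uniform(1.0, 200.0)
        else:
            # Observation: wealthier countries end up with fewer people per TV.
            ppt = max(1.0, 200.0 - 195.0 * wealth + rng.gauss(0.0, 15.0))
        # Assumed model: life expectancy depends on wealth only, never on TVs.
        life = 45.0 + 30.0 * wealth + rng.gauss(0.0, 3.0)
        people_per_tv.append(ppt)
        life_expectancy.append(life)
    return people_per_tv, life_expectancy


rng = random.Random(0)  # fixed seed so the run is repeatable
r_observed = pearson_r(*simulate(2000, False, rng))
r_randomized = pearson_r(*simulate(2000, True, rng))
print(f"observational r: {r_observed:.2f}")
print(f"randomized r:    {r_randomized:.2f}")
```

In the observational run the correlation comes out strongly negative even though TVs do nothing in this model. Once the TV level is assigned at random, wealth can no longer link the two variables, so the correlation should sit near zero.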
If we wanted to make a causal claim, it'd be important that we do an experiment, rather than an observational study. All right, so that has been the tutorial on correlation and causation. Thanks for watching.