Source: Thermometer, Public Domain: http://en.wikipedia.org/w/index.php?title=File:Thermometer_CF.svg&page=1 Ice Cream, Public Domain http://commons.wikimedia.org/wiki/File:RaspberrySherbet.jpg Gender Symbols created by Joseph Gearin American Flag, Public Domain: http://en.wikipedia.org/w/index.php?title=File:Flag_of_the_United_States.svg&page=1
In this tutorial, you're going to learn about variables. Now, you've probably heard the term "variable" before. But here it doesn't mean something like x or y, like it does in algebra. In statistics, a variable is any attribute that we can measure about the population, and we're going to use variables in the study. And it's very important when we create a study that we carefully define the variables that we want measured.
So all of these are things that we could find out about people-- the state they live in, their ethnicity, their zip code, whether or not they smoke. All sorts of things are variables here. Now, depending on what we want to know, we might not need to know all of these things. We might only want to know some of these things. So for instance, if I was doing a political poll, I wouldn't really necessarily need to know if they were a smoker or even the number of times they eat out per week.
I might want to know only the circled ones-- their age, their gender, their state, political affiliations, zip code, ethnicity, and city. Because all of those potentially have some bearing on a political poll, whereas if I was doing some kind of a weight loss study, I might not need political affiliation. But I might want their favorite food. So I might use these variables if I was doing some kind of a weight loss study.
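To make the idea of picking variables of interest concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the respondent's values are invented, and the names `respondent`, `POLITICAL_POLL_VARIABLES`, and `select_variables` are hypothetical. Only the list of circled variables for the political poll comes from the tutorial.

```python
# One respondent's record. Every value here is invented for illustration.
respondent = {
    "age": 34,
    "gender": "female",
    "state": "Ohio",
    "city": "Columbus",
    "zip_code": "00000",  # placeholder, not a real zip code
    "ethnicity": "Hispanic",
    "political_affiliation": "independent",
    "smoker": False,
    "meals_out_per_week": 3,
    "favorite_food": "pasta",
}

# Variables of interest for the political poll (the circled ones).
POLITICAL_POLL_VARIABLES = [
    "age", "gender", "state", "political_affiliation",
    "zip_code", "ethnicity", "city",
]


def select_variables(record, variables_of_interest):
    """Keep only the variables that a given study cares about."""
    return {name: record[name] for name in variables_of_interest}


poll_record = select_variables(respondent, POLITICAL_POLL_VARIABLES)
print(poll_record)
```

A weight loss study would pass a different list to the same function, for example one that drops political affiliation and adds favorite food.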
Some studies try to determine a cause-and-effect relationship between two variables, in that one variable causes the other: an increase in one corresponds to an increase or decrease in the other. In those cases, we define the one that causes the other as the explanatory variable, and you can have more than one. The variables that are the result are called response variables.
So an example would be the number of hours you study and your grade on the exam. We would hypothesize that as the number of hours that you study increases, your grade on the exam will increase as well. So the hours studied help to explain your grade. Another example of explanatory and response variables would be the average monthly temperature and ice cream sales.
We would assume that as the temperatures get warmer, that ice cream sales would go up in kind. Something that's a little bit less obvious is whether or not gender, which is a categorical variable, plays a role in which political party people will choose. Are males more likely to be Republican? Or are women more likely to be independent voters? We don't know. But that would be an interesting question to investigate.
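Going back to the study-hours example, here is a small Python sketch of what it looks like to treat hours studied as the explanatory variable and exam grade as the response variable. The numbers are made up purely for illustration, and the helper `least_squares_line` is a hypothetical name, not something from the tutorial. It fits a straight line so that the slope estimates how much the response changes per unit of the explanatory variable.

```python
# Hypothetical data, invented for illustration only.
hours_studied = [1, 2, 3, 4, 5, 6]        # explanatory variable
exam_grade = [62, 68, 71, 79, 84, 90]     # response variable


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-deviations and sum of squared deviations in x.
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = least_squares_line(hours_studied, exam_grade)
print(f"Each extra hour of study goes with about {slope:.1f} more points.")
```

A positive slope here only says that the two variables move together in this made-up data. Whether studying actually causes the higher grade is a separate question that the study design has to answer.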
So to recap, variables are what we choose to measure in a study. And the variables of interest to us, the ones that were circled in green on the previous slides, depend on the questions that we're trying to answer. We don't need every measurable thing, just the ones that we're interested in.
And if a cause-and-effect relationship is thought to exist, we can break the variables down even further into explanatory and response variables. The terms that we used in this tutorial are variables, variables of interest, and explanatory and response variables. Good luck, and we'll see you next time.
This tutorial is going to teach you about confounding variables. Now, you may have heard the word "confounding" before, and you may think that it means confusing or something of that nature. To see what we really mean when we say confounding, suppose that a researcher wishes to know whether a high-protein diet will help lab rats gain more weight than a low-protein diet.
So she's got 26 lab rats, and she selects the 13 smallest to receive the low-protein diet and the 13 largest to receive the high-protein diet. And then she weighs the rats at the end to determine their weight gain. And she finds that the rats on the high-protein diet gain more weight. So can you think of anything that she did wrong in this study? I'm going to ask you to pause the video, scribble down a response, and then hit pause again.
So, hopefully, you came up with something the researcher did wrong here, and this would be an example of confounding. Confounding is when two variables get mixed up with one another, and so you can't tell the effect of one variable from the effect of the other variable. So what does that mean in this case? Well, in this case, the researcher was trying to figure out whether or not the high-protein diet caused the rats to gain more weight. But the effects of the diets were confounded by the fact that she put the heaviest rats on the high-protein diet.
We don't know if they gained weight because the high-protein diets were effective or because maybe there was just something about them being heavy that caused them to gain more weight anyway. We can't tell what exactly was the case here. So these are the two variables of interest in our study here, and the high-protein diet was supposed to be the explanatory variable. The weight gain was supposed to be the response variable. And the researcher was going to try to figure out a link between the two.
But what happened was because of the way she assigned the rats, the fact that the heaviest rats received the high-protein food limited the conclusion that she could draw here. She wasn't able to draw the direct conclusion that she was hoping for. And that's confounding. We would like to limit confounding in our experiments, if we can.
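One way to see the problem with the rat study is a small simulation. This is only a sketch under loud assumptions: the starting weights, the rule that heavier rats gain more, and the noise level are all invented, and in this simulated world the diet has no effect at all. Random assignment is used only as a point of comparison; it is not part of the researcher's study as described. The function names are hypothetical.

```python
import random


def simulate_study(assign_by_size, seed, n_rats=26):
    """Return mean gain of the 'high-protein' group minus the 'low-protein' group.

    Assumption: weight gain depends only on starting weight plus noise,
    so the diet itself has NO effect in this simulated world.
    """
    rng = random.Random(seed)
    # Assumed starting weights in grams (invented range).
    start = [rng.uniform(200, 400) for _ in range(n_rats)]
    # Assumed rule: heavier rats gain more, regardless of diet.
    gain = [0.10 * weight + rng.gauss(0, 2) for weight in start]

    order = list(range(n_rats))
    if assign_by_size:
        order.sort(key=lambda i: start[i])  # smallest rats first
    else:
        rng.shuffle(order)                  # random assignment

    half = n_rats // 2
    low_protein, high_protein = order[:half], order[half:]

    def mean_gain(group):
        return sum(gain[i] for i in group) / len(group)

    return mean_gain(high_protein) - mean_gain(low_protein)


def average_difference(assign_by_size, n_studies=200):
    """Average the group difference over many simulated studies."""
    diffs = [simulate_study(assign_by_size, seed) for seed in range(n_studies)]
    return sum(diffs) / len(diffs)


confounded = average_difference(assign_by_size=True)
randomized = average_difference(assign_by_size=False)
print(f"Largest rats on high-protein: average difference {confounded:.2f} g")
print(f"Random assignment:            average difference {randomized:.2f} g")
```

Under these assumptions, the size-based assignment shows the "high-protein" group gaining more even though the diet does nothing, because starting weight is mixed up with diet. With random assignment the average difference stays close to zero.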
So here's another example. A high school math teacher, hoping to have his students do well on the final, offers an optional review session. And he says, well, no one who's attended the review sessions has ever scored less than a B. So hit pause again and scribble down: what is the teacher trying to imply, and why can't he imply it? So go ahead, hit pause, and then we'll come back.
What you should have come up with is that he's trying to imply that the review sessions will cause the students to do better. Now, that may be true. However, we don't know if that's the case. Because it's optional, it could be that only his best and brightest students are attending the review session. And these are students that may have done well on the final exam anyway. So the effects, if any, of the review session are confounded by the intrinsic motivation of students to show up to the session.
Now, you don't have to say it exactly this way. But if you can identify why it might not be that the review session causes the higher grades, then you're doing a pretty good job. And so, to recap, confounding occurs when we have a variable that we're trying to treat as an explanatory variable in an experiment, but we're not able to say that it causes something else to happen, because some other variable gets in the way. That limits the conclusions that we can draw from our supposed explanatory variable.
So the confounding variable inhibits a cause-and-effect conclusion. And often, it's one that we didn't think to measure, and that's really problematic. So the terms that we used were confounding and confounding variable, which is the variable that inhibits your conclusions. Good luck. We'll see you next time.