Source: All images created by Dan Laub
[MUSIC PLAYING] Hi, Dan Laub here. And in this lesson, we want to discuss representing how two data sets are related. But before we get started, let's talk about the objective for this lesson. By the end of the lesson, we want to be able to identify and interpret the slope in a straight line model. So let's get started.
Recall that one of the goals of using the correlation coefficient is to determine if and how closely two variables are related. The two simplest ways in which two variables can be related are that as one of them increases, the other increases, or that as one increases, the other decreases.
When viewing a scatter plot, such as this, there are two axes: the horizontal axis and the vertical axis. We call the horizontal axis the x-axis. And the values presented along this axis we call the x values of the points given in the scatter plot. If the horizontal axis is the x-axis, then the vertical axis is called the y-axis. And the values presented along this axis we call the y values of the points given in the scatter plot.
If the data in a scatter plot closely follows a straight line, it is important to be able to describe how one variable changes due to changes in another variable. Being able to describe this relationship gives us some insight regarding the prediction of trends in the data. In the event that the data is generally spread out, our prediction will not be accurate. But if the data is clustered closely, our prediction could be plausible.
The steepness of the line is described by what is known as the line's slope. If the slope of a line is positive, the line rises from left to right. If the slope is negative, the line falls from left to right.
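To make the link between the sign of the slope and the direction of the line concrete, here is a minimal sketch in Python. The function names `slope` and `direction` and the sample points are assumptions chosen for illustration; they are not part of the lesson.

```python
def slope(x1, y1, x2, y2):
    """Slope of the straight line through (x1, y1) and (x2, y2): rise over run."""
    if x2 == x1:
        raise ValueError("a vertical line has no defined slope")
    return (y2 - y1) / (x2 - x1)


def direction(m):
    """Describe how a line with slope m moves when read from left to right."""
    if m > 0:
        return "rises"
    if m < 0:
        return "falls"
    return "is flat"


# Illustrative points only: the first pair goes up, the second pair goes down.
print(direction(slope(0, 1, 2, 5)))
print(direction(slope(0, 5, 2, 1)))
```

The first call describes a line with a positive slope and the second a line with a negative slope, matching the rise and fall described above.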
Recall from previous lessons that there are independent and dependent variables. The slope of a line describes how much the y variable, or the dependent variable, changes if the x variable, which is the independent variable, increases. If the slope is positive, the y variable increases as the x variable increases. If the slope is negative, on the other hand, the y variable decreases as the x variable increases.
So let's look at an example here of the annual income of an individual and the value of their home. Looking at this, we see the scatter plot right here in front of you. And the line that we see is the line of best fit based upon the observations on the scatter plot. In this case, the slope is positive. And we can see that based upon the fact that it goes upward from left to right. What does a positive slope mean in this particular case? Well, it's pretty clear. The greater one's income, the higher the value of their home tends to be.
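One way to read this: in a straight line model, the slope is the amount y changes each time x goes up by one unit. The sketch below assumes the usual form y = slope * x + intercept; the names `predict` and `change_in_y` and the numbers used are hypothetical and only illustrate the idea.

```python
def predict(x, slope, intercept):
    """Value of y on the straight line y = slope * x + intercept."""
    return slope * x + intercept


def change_in_y(x, slope, intercept, step=1.0):
    """How much y changes when x increases from x to x + step."""
    return predict(x + step, slope, intercept) - predict(x, slope, intercept)


# Hypothetical lines: one with a positive slope, one with a negative slope.
print(change_in_y(4, slope=3, intercept=2))      # y goes up when x goes up
print(change_in_y(4, slope=-0.5, intercept=10))  # y goes down when x goes up
```

For a straight line the result does not depend on the starting x, which is why a single number, the slope, can describe the whole relationship.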
Another example we can look at here is a scatter plot that illustrates the amount of time an individual spends exercising per day and their body fat percentage. And so in this particular instance, notice that there's a downward sloping line. And intuitively speaking, we would think this makes sense, because the more somebody exercises, in all likelihood, they have a lower level of body fat.
Now, that's not necessarily the only thing that could reduce body fat, but it does make sense that there would be a relationship here. And there's a downward sloping line, meaning it starts out high on the left-hand side and slopes downward to the right. As one exercises more, the body fat percentage tends to decrease.
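The two examples can be sketched in Python by computing the slope of a least-squares line of best fit. The lesson does not say how its lines were fitted, so least squares is an assumption here, and every number below is made up for illustration; none of it comes from the scatter plots in the lesson.

```python
def least_squares_slope(xs, ys):
    """Slope of the least-squares line of best fit through the points (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations, divided by the sum of squared x deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical data: annual income and home value, both in thousands of dollars.
income = [40, 60, 80, 100, 120]
home_value = [150, 210, 260, 340, 390]

# Hypothetical data: minutes of exercise per day and body fat percentage.
exercise_minutes = [10, 20, 30, 40, 50]
body_fat_percent = [30, 27, 25, 21, 19]

income_slope = least_squares_slope(income, home_value)
exercise_slope = least_squares_slope(exercise_minutes, body_fat_percent)

print("income vs. home value slope is positive:", income_slope > 0)
print("exercise vs. body fat slope is negative:", exercise_slope < 0)
```

With this made-up data, the first slope comes out positive and the second negative, which mirrors the upward and downward sloping lines in the two examples.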
So let's revisit the objective of this lesson, just to make sure we covered it. We wanted to be able to identify and interpret the slope in a straight line model, which we did. We talked about how upward sloping lines and downward sloping lines, or positive slopes and negative slopes, could tell us different things about the data of the two variables that we were looking at in a scatter plot. So again, my name is Dan Laub. And hopefully, you got some value from this lesson.
Slope: How much the y-variable changes if the x-variable increases.