This tutorial covers how to interpret the slope and y-intercept of a least-squares regression line, with a focus on what each value means in context.
A slope is a rate of change. You've talked about rates of change a lot in everyday life.
Slope of Regression Line
The amount y changes (on average) for a one unit increase in x.
In linear regression, slope is the average rate of change. You know that there are actual data points on the plot, so you could calculate the rate of change between any two of them. But you're only interested in the average over all the points: the average increase or decrease in the response variable that corresponds to an increase of one in the explanatory variable.
Suppose you had something like distance hat equals 15 times time, where time is in hours and distance is in miles.
That 15 would be 15 miles per hour. But because this is a regression equation, that 15 is an average speed. There's no guarantee that you go 15 miles every single hour. But what this indicates is that for each additional hour in time, the distance increases by 15 miles on average.
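As a minimal sketch of this idea, the Python snippet below evaluates the distance equation at consecutive hours and shows that each additional hour adds 15 miles to the prediction. The function name `predicted_distance` is only an illustrative choice, not something from the original tutorial.

```python
def predicted_distance(hours):
    """Predicted distance in miles from distance-hat = 15 * time (time in hours)."""
    return 15 * hours

# Each additional hour raises the *predicted* distance by the slope, 15 miles.
for hours in range(1, 4):
    change = predicted_distance(hours + 1) - predicted_distance(hours)
    print(f"{hours} -> {hours + 1} hours: predicted distance rises by {change} miles")
```

The actual distance covered in any given hour can differ; the 15 only describes the average change.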
Consider a data set with the miles from Minneapolis-Saint Paul airport to several destinations and the airfare to each of those destinations.
The equation of the regression line was airfare hat (predicted airfare) equals 113.11 plus 0.137 times miles. What do those numbers mean? The 0.137, the number multiplying miles, is the slope.
That's the rate of change. It's how quickly the airfare changes if you increase the miles by one. So for each additional mile, the airfare is predicted to increase by 0.137 dollars, or 13.7 cents.
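To make the "one additional mile" reading concrete, here is a small Python sketch that plugs two distances one mile apart into the regression equation. The 800-mile starting point is an arbitrary value chosen for illustration, and the names `INTERCEPT`, `SLOPE`, and `predicted_airfare` are hypothetical.

```python
INTERCEPT = 113.11  # dollars, from the regression equation
SLOPE = 0.137       # dollars per additional mile

def predicted_airfare(miles):
    """Predicted airfare in dollars: airfare-hat = 113.11 + 0.137 * miles."""
    return INTERCEPT + SLOPE * miles

# Compare two flights that differ by exactly one mile (800 is an arbitrary choice).
increase = predicted_airfare(801) - predicted_airfare(800)
print(f"Predicted increase for one additional mile: ${increase:.3f}")
```

Whatever starting distance you pick, the difference between the two predictions is the slope.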
A couple of important ideas as you interpret these values:
1. It's for every additional mile. You can't leave this word out. You can't say "for every mile," because the line doesn't start at zero miles costing zero dollars.
2. You have to say it's a predicted increase. Airfare hat is a prediction, so we're not figuring out actual airfares. Remember this is an average. We're using it to predict the additional airfare for each mile. It's not a hard and fast rule, and that's why it can be used to predict airfares but not to actually assign them.
3. You have to use units, here miles and dollars. You can't say airfare increases by 0.137. You must specify 0.137 whats? 0.137 dollars. For each additional mile, the predicted airfare increases by this many dollars.
Now look at the y-intercept, which here is 113.11. The y-intercept is the value of y (the response, in this case airfare) when the value of x (the explanatory variable, in this case miles) is zero. So when a flight is zero miles, the airfare is predicted to be $113.11.
Intercept of Regression Line
The expected y value when x = 0
Two things to remember. First, you're talking about an ordered pair: zero miles and $113.11. You need to have both those numbers in there, because the intercept corresponds to a point on the graph. Second, just like the slope was a prediction, this is also a prediction.
Now, it's not a meaningful prediction, and we'll get to that in a minute. But it is a prediction, because this line is a prediction line. It's a best-fit line; it's not actually finding airfares for us. And just as with the slope, make sure you include units.
So here, the y-intercept didn't make a whole lot of contextual sense. You wouldn't pay $113.11 for a ticket just to go nowhere. The reason has to do with the range of miles values for which this line gives an appropriate airfare guess. The data here have miles values from 407 up to almost 1,300 miles.
That means we can use this line within the range of 407 to 1,294 miles to make reasonable predictions of airfare. So if we wanted to go to, say, San Antonio, we could certainly use this line to do that, because the distance from Minneapolis to San Antonio is within this range.
Anything outside of that range might not be reasonable. So we'll sometimes just acknowledge the fact that the y-intercept isn't part of that reasonable prediction range, and so it may not make a whole lot of contextual sense. But it's a good idea to know how to interpret it anyway.
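One way to keep track of this is to have the prediction report whether the input falls inside the range of the data. The sketch below assumes the 407 to 1,294 mile range quoted above; the function name `predict_airfare` and the 1,000-mile test distance are hypothetical choices for illustration.

```python
MIN_MILES, MAX_MILES = 407, 1294  # range of distances in the data, per the text

def predict_airfare(miles):
    """Return (predicted airfare in dollars, True if miles is inside the data range)."""
    fare = 113.11 + 0.137 * miles
    within_range = MIN_MILES <= miles <= MAX_MILES
    return fare, within_range

# 0 miles is the y-intercept; 1000 miles is an arbitrary in-range distance.
for miles in (0, 1000):
    fare, ok = predict_airfare(miles)
    note = "within the data range" if ok else "outside the data range (extrapolation)"
    print(f"{miles} miles: predicted ${fare:.2f}, {note}")
```

The line still produces a number at zero miles, but the flag is a reminder that the number is an extrapolation rather than a reasonable prediction.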
So do this one on your own. Interpret the slope and y-intercept of a regression line relating sodium content to calories for certain hot dogs: sodium hat equals 160 plus 2.5 times calories, where sodium is in milligrams. Interpret what those values mean.
You should have identified the 160 as the y-intercept: if a hot dog had zero calories, the predicted sodium would be 160 milligrams. And the 2.5 is the slope: for each additional calorie a hot dog has, the sodium content is predicted to increase by 2.5 milligrams. And remember that the 2.5 milligrams per calorie increase is an average. This is not a hard and fast rule.
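If you want to check your interpretation numerically, the sketch below uses the equation implied by the answer above (intercept 160, slope 2.5). The function name `predicted_sodium` and the 150-calorie comparison point are hypothetical choices for illustration.

```python
def predicted_sodium(calories):
    """Predicted sodium in milligrams: sodium-hat = 160 + 2.5 * calories."""
    return 160 + 2.5 * calories

# y-intercept: the predicted sodium for a hot dog with zero calories.
print(f"Predicted sodium at 0 calories: {predicted_sodium(0)} mg")

# Slope: the predicted change for one additional calorie (150 is an arbitrary start).
change = predicted_sodium(151) - predicted_sodium(150)
print(f"Predicted increase per additional calorie: {change} mg")
```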
The slope is a rate of change, and it explains how a one-unit increase in the explanatory variable affects the predicted response, on average. The y-intercept shows you what's predicted for the response when the explanatory value is zero, and sometimes it doesn't have a meaningful interpretation. That doesn't mean you shouldn't know how to interpret it; it's just that sometimes it doesn't make sense in context, because it falls outside the window of reasonable predictions.
Source: This work adapted from Sophia Author Jonathan Osters.
Slope of Regression Line
The amount by which the response variable increases or decreases, on average, when the explanatory variable increases by one.
Intercept of Regression Line
The predicted value of the response variable when the explanatory variable is zero.