Source: Table by Katherine Williams
This is your tutorial on residuals. A residual is the value of the observed y minus the value of the predicted y. So you're doing a subtraction there, and you can calculate one for each of your data points. If your residual value is positive, that means your observed y is higher than your predicted y. If your residual value is negative, that means your observed y is less than your predicted y. If your residual value is 0, that means your observed y is exactly the same as your predicted y.
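Written as a formula, that definition might look like the following. The symbols are notation chosen here for illustration: e stands for the residual, y for the observed value, and y with a hat for the predicted value.

```latex
\[
e = y - \hat{y}
\qquad\text{where}\qquad
\begin{cases}
e > 0 & \text{if } y > \hat{y} \\
e < 0 & \text{if } y < \hat{y} \\
e = 0 & \text{if } y = \hat{y}
\end{cases}
\]
```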
Here's an example. In this set, we're looking at how age relates to glucose level. We have the observed glucose levels, which are what we actually found from patients. And then we have the predicted glucose levels, which are what we'd expect the glucose level to be at each age, based on a set of data that we've collected and analyzed. So then we're going to find the residuals.
Residuals are the observed minus the predicted. We call the observed value y. The predicted value you'll often see written as a y with a little caret on top. It's called y hat. So our residuals would be y minus y hat. So 99 minus 84 would give us 15. 65 minus 71.8 is negative 6.8. 79 minus 74 is 5. 75 minus 83.4 is negative 8.4. And 87 minus 91.8 is negative 4.8.
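If you'd like to check those subtractions yourself, here is a minimal Python sketch of the same arithmetic. The observed and predicted values are the ones from the example. The names `observed`, `predicted`, `residual`, and `describe` are made up for this illustration, and the rounding to one decimal place is only there to tidy up floating-point noise.

```python
# Observed and predicted glucose levels from the example above.
observed = [99, 65, 79, 75, 87]
predicted = [84, 71.8, 74, 83.4, 91.8]


def residual(y, y_hat):
    """Residual = observed y minus predicted y (y minus y hat)."""
    # Round to one decimal place to hide floating-point noise
    # (for example, 65 - 71.8 is not stored as exactly -6.8).
    return round(y - y_hat, 1)


def describe(r):
    """Say what the sign of a residual tells us."""
    if r > 0:
        return "observed is higher than predicted"
    if r < 0:
        return "observed is less than predicted"
    return "observed exactly matches predicted"


residuals = [residual(y, y_hat) for y, y_hat in zip(observed, predicted)]
# residuals is [15, -6.8, 5, -8.4, -4.8]

for y, y_hat, r in zip(observed, predicted, residuals):
    print(f"{y} - {y_hat} = {r}: {describe(r)}")
```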
And like we talked about before, some of our residuals are positive, some are negative. That's OK. None of them are zero. We don't have any cases where the observed exactly matched the predicted.
Now with your residuals, you can make a plot of them, and that plot can help you evaluate the regression line. If you notice that the residuals are all positive, or that there's some other sort of pattern to them, then there's some sort of issue with your regression line, because there shouldn't be a pattern to the residuals. We haven't really looked at how you calculate regression lines yet. That will be covered in a later tutorial. But it's good to know that residuals help us evaluate the line.
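As a rough illustration of that idea, here is a small Python sketch that checks for the simplest pattern mentioned, residuals that all have the same sign, and draws a crude text version of a residual plot. It is only a sketch: the function names `all_same_sign` and `text_plot` are made up here, the second list of residuals is hypothetical, and a real residual plot would usually put the x values (age, in this example) along one axis.

```python
def all_same_sign(residuals):
    """Return True if every residual is positive, or every one is negative."""
    if not residuals:
        return False
    return all(r > 0 for r in residuals) or all(r < 0 for r in residuals)


def text_plot(residuals, width=20):
    """Build one row per data point; '|' marks zero, stars show the size.

    Negative residuals go to the left of the bar, positive to the right.
    """
    rows = []
    for r in residuals:
        stars = "*" * round(abs(r))
        left = stars.rjust(width) if r < 0 else " " * width
        right = stars if r > 0 else ""
        rows.append(f"{left}|{right}")
    return rows


# Residuals from the glucose example: a mix of positive and negative values.
example_residuals = [15, -6.8, 5, -8.4, -4.8]

# Hypothetical residuals (not from the example) that are all positive,
# which is the kind of pattern that points to an issue with the line.
hypothetical_residuals = [2.0, 3.5, 1.2, 4.1]

for name, values in [("example", example_residuals),
                     ("hypothetical", hypothetical_residuals)]:
    print(name, "all same sign?", all_same_sign(values))
    print("\n".join(text_plot(values)))
```

Checking for a single sign only catches the most obvious pattern. Other patterns, like a curve, are easier to spot by looking at the plot itself.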
This has been your tutorial on residuals. They're calculated by taking the observed y and subtracting the predicted y, y minus y hat. Thank you for watching.