Source: Graphs created by the author
This tutorial is going to teach you about lines of best fit. Now the idea of a line of best fit is that a single line can roughly approximate what's going on with the data. So we're not going to talk about how, exactly, to calculate a line of best fit, but we're going to understand what it's used for.
And so you can sort of imagine a line going through the pack of points like this. And that line is going to be called a best fit line, or a trend line, or a regression line. So one easy visual way to do it is to place an oval over the top of your points. And the oval is symmetric along what, in math, we call the minor axis, which is essentially cutting it the hamburger way.
Or you can cut it along the longer, major axis, which is typically called the hot dog way. So you're going to essentially cut it the hot dog way. And that's a pretty good approximation of a line of best fit.
Now it has a couple of different properties to it. Roughly half the points fall above the line, and roughly half fall below it. In this particular example, about five of them fall pretty near the line, three are substantially below, and three are substantially above.
Now, just having points above and below the line isn't good enough. This is a poor choice of a trend line. Not only does it not cut the oval the hot dog way, but there's a pattern to how the points are above or below the line. If I know that a point is above the line, I know that it's on the right. And if I know that a point is below the line, I know that it's on the left.
So I don't want it to look like that. This is a better trend line, because these points that are above are sort of peppered throughout. And the ones that are below are sort of peppered throughout. So you don't want a pattern to how the points are off from the line.
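To make that check concrete, here is a minimal Python sketch. The data points, the two candidate lines, and the function name `split_by_side` are all made up for illustration; they are not the values from the graphs in this tutorial. The sketch lists which x values sit above and below each line, so you can see whether they are peppered throughout or bunched on one side.

```python
# Hypothetical (x, y) data, invented for illustration only.
points = [(1, 120), (2, 160), (3, 330), (4, 350),
          (5, 520), (6, 540), (7, 720), (8, 740)]


def split_by_side(points, slope, intercept):
    """Return (above, below): the x values of the points lying above
    and below the line y = slope * x + intercept."""
    above, below = [], []
    for x, y in points:
        difference = y - (slope * x + intercept)  # actual minus predicted
        if difference > 0:
            above.append(x)
        elif difference < 0:
            below.append(x)
    return above, below


# A line that is too flat (hypothetical): half the points are above and
# half below, but every point above is on the right and every point
# below is on the left.
print(split_by_side(points, 50, 210))    # ([5, 6, 7, 8], [1, 2, 3, 4])

# A better line (hypothetical): the points above and below alternate
# across the whole range of x.
print(split_by_side(points, 100, -25))   # ([1, 3, 5, 7], [2, 4, 6, 8])
```

Both lines split the points four and four, so counting alone doesn't tell them apart. It's where the points above and below fall along x that gives away the poor line.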
So what is a trend line used for? A line of best fit is used to give approximations to values of y for given values of x, even in places where there is an existing value of y. So, for instance, at x equals 7, there's a difference between the actual value of y at 7 and what the line predicts as the value of y at 7.
So what does this line predict if x is 6 and 1/2? Well, we just go up from 6 and 1/2 until we get to the line, and figure out how high it is at that point. It's at about 625. And so we can say that the prediction for x being 6 and 1/2 is y being about 625.
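Reading a prediction off the line is the same as plugging x into the line's equation. Here is a small Python sketch of that. The slope and intercept are assumptions, chosen only so that the line gives 625 at x = 6 and 1/2 like the reading above; the tutorial never states the actual equation. The actual y value at 7 is also made up for illustration.

```python
def predict(x, slope, intercept):
    """Height of the line y = slope * x + intercept at the given x."""
    return slope * x + intercept


# Hypothetical line, picked only to match the reading of about 625 at x = 6.5.
SLOPE, INTERCEPT = 100.0, -25.0

prediction = predict(6.5, SLOPE, INTERCEPT)
print(prediction)        # 625.0

# Where a data point already exists, the line's prediction and the actual
# value can differ. The actual value here is hypothetical.
actual_at_7 = 720.0
difference_at_7 = actual_at_7 - predict(7, SLOPE, INTERCEPT)
print(difference_at_7)   # 45.0, so this point sits above the line
```

A positive difference means the actual point is above the line, and a negative one means it's below.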
And so, to recap: a line of best fit can help us understand the general trend of what's going on, how the y values relate to the x values. A good trend line will cut down the middle and have sort of a peppering of points above and below it. Random scatter, as opposed to some systematic pattern in how the points miss the line.
So we talked about a line of best fit, which is also called a regression line or a trend line. Good luck, and we'll see you next time.
Line of best fit: A line that closely approximates the response values for given explanatory values when the form of the scatterplot is linear.