This tutorial covers the least-squares line. The least-squares line is the most common way to find the line of best fit. Now, it's calculated by minimizing the sum of the squares of the vertical distances from the line of best fit to each point.
So now, the reason we are squaring the distances is to make sure that the positive and negative differences don't just cancel each other out. And for the least-squares line, we're looking at those vertical distances. So let's see a visual example of how that would work.
This blue line here is our least-squares line for these 1, 2, 3, 4, 5 points. Now, the vertical distances are from each point to that line, so these lines that I'm adding on now are showing the vertical differences. And a program like a spreadsheet program will be able to do all those calculations for you in order to find out which line gives the smallest sum of the squared differences.
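As a rough sketch of what the spreadsheet is doing here, the Python below adds up the squared vertical distances for two candidate lines. The five points and the function name `sum_squared_vertical_distances` are made up for illustration; they are not the points on the screen.

```python
# Hypothetical points, chosen only for illustration (not the tutorial's data).
points = [(1, 2), (2, 3), (3, 5), (4, 4), (5, 6)]

def sum_squared_vertical_distances(slope, intercept, pts):
    """Add up the squared vertical gaps between each point and the line."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in pts)

# Line A is the least-squares line for these five points.
sse_a = sum_squared_vertical_distances(0.9, 1.3, points)
# Line B passes exactly through three of the points, but misses the other two.
sse_b = sum_squared_vertical_distances(1.0, 1.0, points)

print(round(sse_a, 4))  # 1.9
print(round(sse_b, 4))  # 2.0
```

Line B looks tempting because it hits three points exactly, but line A still has the smaller sum of squared distances, and that is the quantity the least-squares line minimizes.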
Now, we've already talked about this. It's the most common way to find the line of best fit, and it minimizes the sum of the squares of those vertical distances from the line of best fit to each point. Now, the line goes through the point that is the mean of the x's and the mean of the y's, and that's always going to be true of the least-squares line.
Additionally, the slope of our least-squares line is r, the correlation coefficient, times the standard deviation of y divided by the standard deviation of x. So let's look at an example. Before we look at our example, just to refresh, when we talk about x bar, that's the mean of the x's, and we're using x with a bar on top. And this is for a sample.
And y bar is going to be the mean of the y's for a sample, and it's the y with the bar on top. So here's our example. And in our example, we have five subjects, their ages, and their current glucose levels. And because we're doing a sample, our slope formula is going to be right here.
And we need to find our mean of the x's and mean of the y's in order to get a point on our least-squares line, and then we can calculate our slope using r and the two standard deviations. So I used my spreadsheet program to calculate the standard deviations, and I found that the standard deviation of x was 14.7 and the standard deviation of y was 12.8.
And then similarly, I previously calculated r and found that it was 0.637. In order to find the mean of the x's, I added up all the x's and divided by 5, and I got 37.6. And I did the same thing in order to find the mean of the y's. I added up the five y values and then divided by 5, and ended up with 81.
Now, I'm going to move the data out of the way so that we can find the equation for our least-squares line. So first, I'm going to calculate the slope. So for the slope, we have r times the standard deviation of y, so 0.637 times 12.8. And then we're going to divide by the standard deviation of x, so divided by 14.7. And that is for finding our slope.
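If you'd rather check that slope calculation in code than on a calculator, here's a minimal sketch in Python. The numbers are the rounded values quoted in this tutorial; the variable names are just my own labels.

```python
# Values quoted in the tutorial (already rounded by the spreadsheet).
r = 0.637    # correlation coefficient
s_y = 12.8   # standard deviation of y (glucose level)
s_x = 14.7   # standard deviation of x (age)

# Slope of the least-squares line: r times s_y, divided by s_x.
slope = r * s_y / s_x
print(round(slope, 4))  # 0.5547
```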
Now, when I type that into a calculator, I get 0.5547. And this is rounded, but that's OK. Now that I have a slope and I have a point on the line, we can solve to find out what our y-intercept is going to be. So we're going to start off with y equals 0.5547x plus b, where 0.5547 is our slope.
For the x, I'm going to use the x bar value. For the y, I'm going to use the y bar value. So we're going to have 81 equals 0.5547 times 37.6 plus b. Now, when I multiply those two values together, I get 20.8567.
And then I'm going to subtract that value from both sides, and that is going to give me my b. It's going to let me know what the y-intercept is for this least-squares line. And we find out that 60.1433 is the b value.
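The same substitution can be sketched in a few lines of Python. This uses the rounded slope and the two means from the tutorial; the variable names are my own labels, not anything from the spreadsheet.

```python
slope = 0.5547  # rounded slope from the tutorial
x_bar = 37.6    # mean of the x's (ages)
y_bar = 81      # mean of the y's (glucose levels)

# The line passes through (x_bar, y_bar), so y_bar = slope * x_bar + b.
# Subtracting slope * x_bar from both sides gives b.
b = y_bar - slope * x_bar

print(round(slope * x_bar, 4))  # 20.8567
print(round(b, 4))              # 60.1433
```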
Now, in order to combine the b and the slope into my least-squares line, we're just going to write out y equals 0.5547x plus 60.1433. And your spreadsheet program can calculate this line for you, but this is how you would do that at least partially by hand. So this has been your tutorial on the least-squares line and how to find it.
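For anyone who wants to see the whole process start to finish from raw data, here's a minimal sketch in Python using only the standard library. The five data points are hypothetical, since the ages and glucose levels from the example aren't reproduced here, and computing r from the sample standard deviations is one common approach rather than a description of what any particular spreadsheet does internally.

```python
from statistics import mean, stdev

# Hypothetical sample, for illustration only (not the tutorial's age/glucose data).
xs = [1, 2, 3, 4, 5]
ys = [2, 3, 5, 4, 6]

x_bar, y_bar = mean(xs), mean(ys)
s_x, s_y = stdev(xs), stdev(ys)  # sample standard deviations

# Sample correlation coefficient r.
n = len(xs)
r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)

# Same two steps as in the tutorial: the slope first, then the y-intercept.
slope = r * s_y / s_x
intercept = y_bar - slope * x_bar

print(f"y = {slope:.4f}x + {intercept:.4f}")  # y = 0.9000x + 1.3000
```

Because the intercept is solved from the point made up of the two means, the resulting line goes through that point by construction.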