This tutorial is going to teach you about linear equations.
When lines are fit to data, figuring out the equations of those lines is the focus of this tutorial.
You may recall the following equation: y = mx + b. Here, y and x are variables. x is the explanatory variable and y is the response variable. The other two values, m and b, are numbers, and each of them represents something.
The value of m is called the slope. The slope is a rate of change. You may have heard of several rates of change, like miles per hour, meters per second, or miles per gallon in a car. What it means is that an increase of 1 in x corresponds to an increase of m in y.
If the slope were 30 miles per gallon, an increase of one gallon would correspond to an increase of 30 miles that you could travel.
On the other hand, the b in the equation is called the y-intercept. It's the value of y when x is 0. So the line will pass through (0, b), whatever the number b is. And again, x is the explanatory variable and y is the response variable.
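As a small illustration of the slope and y-intercept described above, here is a Python sketch. The function name `line` and the intercept value of 0 are assumptions chosen for illustration; only the 30 miles per gallon figure comes from the example above.

```python
def line(x, m, b):
    """Return y = m*x + b for slope m and y-intercept b."""
    return m * x + b

# Assumed values for illustration: a slope of 30 miles per gallon
# and a y-intercept of 0 miles.
m, b = 30, 0

# At x = 0 the line sits at the y-intercept, so it passes through (0, b).
at_zero = line(0, m, b)

# Increasing x by 1 (one more gallon) increases y by m (30 more miles).
increase = line(6, m, b) - line(5, m, b)

print(at_zero, increase)
```

The starting x value of 5 is arbitrary; any increase of 1 in x gives the same increase of m in y.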
On a graph, what we have to do is find points that are on the line, and figure out how much the line went up vertically between x = 1 and x = 2.
That was an increase of 1 in x, which means there is an increase of m in the y direction, an increase equal to the value of the slope. So maybe it's 5 or 20 or one-fifth.
The line also crosses the y-axis where x is 0, because we're not going to the right or left, and at that point it is up b.
In statistics, we do change it up a little bit.
Instead of y, it is called y hat. Y hat is the notation for the prediction. There are values of y that are not predictions; they're actual data points. But because we're doing a best fit, y hat is our best guess at the value of y: the prediction. Anything with a hat is a prediction.
For some reason, the letters are switched up here. The slope becomes b1 and the intercept becomes b0, so the equation reads y hat = b0 + b1x. You might also see a + bx or ax + b. If you're using a calculator, you might even see both of them. They all describe the same kind of line; only the letters differ.
Again, the slope is b1. The y-intercept is b0. Suppose that you have a trend line and the equation is y hat equals 12 plus 0.5x. So what's the predicted y-coordinate? What is the y hat when x equals 20?
You could draw this on a graph and figure it out. You can also do this algebraically. You know that the predicted y is 12 more than half of x, and you know the value of x, so figure it out directly from there: 12 more than half of 20 is 12 more than 10, which is 22.
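The same prediction can be checked with a few lines of Python. This is only a sketch; the function name `predict` is an assumption, while the intercept 12 and slope 0.5 come from the trend line above.

```python
def predict(x, b0=12, b1=0.5):
    """Predicted response: y-hat = b0 + b1 * x."""
    return b0 + b1 * x

# Predicted y when x equals 20: 12 more than half of 20.
y_hat = predict(20)
print(y_hat)  # 22.0
```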
How about this one? We don't know the equation of this trend line, but we know that it passes through (4, 500) and (12, 900).
So what's the equation of that line? Two pieces of information are needed: the slope and the y-intercept. First, find the slope. You can see visually that from (4, 500) to (12, 900), it went up 8 in the x direction and 400 in the y direction. A change of 400 in the y direction divided by a change of 8 in the x direction means that for every increase of 1 in the x direction, it went up 50 in the y direction. And so the slope is 50.
To figure out the y-intercept, take an x,y pair such as (12, 900). Any pair on the line will give you the same answer.
Put 12 temporarily in for x and 900 temporarily in for y hat, which gives 900 = b0 + 50(12), and solve algebraically for b0. 50 times 12 is 600. Subtract 600 from both sides and you get b0 = 300. Put it all together: b1 is 50 and b0 is 300. The equation is y hat equals 300 plus 50x.
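The two steps above, slope first and then intercept, can be sketched in Python. The helper name `line_through` is an assumption for illustration; the two points are the ones from the example.

```python
def line_through(p, q):
    """Return (b0, b1) for the line y-hat = b0 + b1*x through points p and q.

    Assumes the two points have different x values; otherwise the slope
    is undefined and the division raises ZeroDivisionError.
    """
    (x1, y1), (x2, y2) = p, q
    b1 = (y2 - y1) / (x2 - x1)  # change in y divided by change in x
    b0 = y1 - b1 * x1           # substitute one point and solve for b0
    return b0, b1

b0, b1 = line_through((4, 500), (12, 900))
print(b0, b1)  # 300.0 50.0

# Either point gives the same intercept.
assert 900 - b1 * 12 == b0
```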
One thing that's important to note is that the best fit line will change if you switch the explanatory and response variables. That's why it's important to choose at the beginning which one is the explanatory variable versus which one is the response variable.
If you take a look below, slope is a rate of change. So miles per gallon, for instance, would be the rate of change in the example on the right. If you switched to put gallons on the y-axis and miles on the x-axis, the rate of change here would actually be measuring gallons per mile, which is a different number.
If a car is getting an average of 20 miles per gallon, that is only one-twentieth of a gallon per mile. So it's a different line. One thing that's important to note, though, is that the value of the correlation coefficient is going to be the same for each of these two graphs. The line itself is different, and that's why we need to choose which one is the explanatory variable and which one is the response.
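To see this with numbers, here is a minimal Python sketch. It assumes the best fit line is the ordinary least-squares line, and the gallons and miles values are made-up data for illustration only, not measurements from this tutorial. The helper names `fit` and `correlation` are also assumptions.

```python
from math import sqrt

def mean(values):
    return sum(values) / len(values)

def fit(x, y):
    """Least-squares line y-hat = b0 + b1*x; returns (b0, b1)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    b1 = sxy / sxx
    return my - b1 * mx, b1

def correlation(x, y):
    """Correlation coefficient r between x and y."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (c - my) for a, c in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((c - my) ** 2 for c in y)
    return sxy / sqrt(sxx * syy)

# Hypothetical data: gallons used and miles traveled on five trips.
gallons = [1, 2, 3, 4, 5]
miles = [22, 38, 63, 79, 98]

# Gallons as explanatory: the slope is in miles per gallon.
_, miles_per_gallon = fit(gallons, miles)
# Miles as explanatory: the slope is in gallons per mile.
_, gallons_per_mile = fit(miles, gallons)

r_one_way = correlation(gallons, miles)
r_other_way = correlation(miles, gallons)

print(miles_per_gallon, gallons_per_mile)  # two different slopes
print(r_one_way, r_other_way)              # the same correlation
```

With data that falls exactly on a line, the two slopes are exact reciprocals, as in the 20 miles per gallon example. With scattered data like this, they are close to reciprocals but not exactly, while the correlation coefficient is identical either way.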
Fitting a line to data points on a scatter plot requires a bit of algebraic savvy. Working with linear equations here means figuring out the equations of those lines. There are two parts to the equation of a line, the slope and the y-intercept.
1. The slope is a rate of change. It's how quickly the response variable y changes when the explanatory variable x increases by 1.
2. The y-intercept is the value of y when x equals 0.
In actual practice, we're going to want to attach variable names and units to these, not just x's and y's.
Source: This work adapted from Sophia Author Jonathan Osters.
Slope: The rate of change relating the increase or decrease in y to an increase of 1 in x.
y-intercept: The value of y when x = 0.