In this tutorial, you're going to explore positive correlation and negative correlation.
Correlation is going to allow you to observe the strength and direction of a linear association between two quantitative variables. It's a number between negative 1 and positive 1. Any correlation coefficient between negative 0.5 and positive 0.5 is considered a weak association between the two quantitative variables.
Anything with an absolute value between 0.5 and 0.8, so that's positive 0.5 to positive 0.8, or negative 0.5 to negative 0.8, is considered a moderately strong correlation.
A very strong correlation would be anything nearer to 1, so positive 0.8 to positive 1, or negative 0.8 to negative 1.
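To make those cutoffs concrete, here is a minimal Python sketch that turns a correlation coefficient into one of the three labels. The function name `classify_strength` is made up for illustration, and the way it treats values sitting exactly on 0.5 or 0.8 is an assumption, since the cutoffs are only rough guidelines.

```python
def classify_strength(r):
    """Label a correlation coefficient using this tutorial's rough cutoffs."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    size = abs(r)  # strength ignores the sign; the sign only gives direction
    if size < 0.5:
        return "weak"
    if size < 0.8:
        # Assumption: exactly 0.5 counts as moderate, exactly 0.8 as strong.
        return "moderate"
    return "strong"


for r in (-0.99, -0.3, 0.0, 0.7, 0.9):
    print(r, classify_strength(r))
```

Notice that the label depends only on the absolute value, so negative 0.99 and positive 0.99 are equally strong.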
A positive correlation is going to be a tendency of the response variable to increase in response to an increase in the explanatory variable, while a negative correlation is going to be the tendency of the response variable to decrease in response to an increase in the explanatory variable.
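If you want to see the sign of the correlation come out of actual numbers, here is a small sketch that computes the usual Pearson correlation coefficient by hand. The data values and the names `pearson_r`, `rising`, and `falling` are invented for illustration and are not from the tutorial.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation: co-variation of x and y, scaled to lie in [-1, 1]."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: one response tends to rise with x, the other tends to fall.
explanatory = [1, 2, 3, 4, 5]
rising = [2, 4, 5, 4, 6]
falling = [6, 4, 5, 4, 2]

print(pearson_r(explanatory, rising))   # positive
print(pearson_r(explanatory, falling))  # negative
```

The second response is just the first one in reverse order, so the two coefficients have the same size and opposite signs.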
Here is what this might look like. This is a correlation coefficient r of negative 0.99.
That means it's almost a perfectly straight linear relationship. You can see that as the explanatory variable on the x-axis increases, the response variable on the y-axis has a tendency to fall.
Here's a negative trend. It's negative, but it's not terribly strong, so this is an r of negative 0.5.
This is a correlation that's close to zero. When the correlation is close to zero, the points will appear to be a cloud. There's no discernible association between the explanatory and the response variables. Another case with no linear association is when all the response values are the same.
If all the points lined up in a straight horizontal line, the response would have no tendency to rise or fall. That is often described as a correlation of zero, although strictly speaking the coefficient is undefined there, because the response has no variability at all.
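One caveat about the horizontal-line case is worth checking with numbers. In the usual Pearson formula, the spread of the response appears in the denominator, so a constant response leads to a division by zero instead of a clean 0. Here is a minimal sketch, with a made-up function name `pearson_r_or_none` and made-up data, that reports `None` in that situation.

```python
from math import sqrt


def pearson_r_or_none(xs, ys):
    """Return Pearson's r, or None when either variable has no spread."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        # The formula would divide by zero here, so r is not defined.
        return None
    return sxy / sqrt(sxx * syy)


x = [1, 2, 3, 4, 5]
flat_y = [3, 3, 3, 3, 3]  # hypothetical horizontal line of points

result = pearson_r_or_none(x, flat_y)
print(result)  # None
```

Either way, the practical reading is the same: a flat line of points shows no tendency for the response to rise or fall.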
Here's a moderately strong positive association of 0.7.
Here's a strong positive association with a correlation coefficient of 0.9.
Notice that there's a big difference in strength between 0.99 and 0.9, but this is still a strong positive correlation that we're seeing.
And a weak positive association, you can sort of see the points rising as you go to the right, but it's not very strong. This is a weak positive association.
One thing that's worth noting is that numbers like correlation very rarely tell the entire story. If you take a look at these two tables, the correlation coefficient for each of them is 0.82. Based on that, you might think that they look similar when they're graphed.
The first one looks like this.
You can see it's a fairly strong positive association, just as you would expect.
The other one looks like this.
It's a strong association, but it's not linear. This follows a non-linear form.
A line isn't going to model this relationship between x and y accurately at all. Even though the two data sets have the same correlation coefficient, a line is an appropriate model for one of them and not for the other.
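The two tables aren't reproduced in this transcript, so as a stand-in, here is a sketch that assumes the first two data sets of Anscombe's quartet, a widely published example with this same behavior. Whether those are the exact tables from the lesson is an assumption. The names `pearson_r`, `y_linear`, and `y_curved` are made up for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Assumed stand-in data: Anscombe's quartet, sets I and II (they share x).
x = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
y_linear = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y_curved = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)
print(round(r_linear, 2), round(r_curved, 2))  # both round to 0.82
```

The two coefficients agree to two decimal places, yet only the first set scatters around a line, while the second bends along a curve. The number alone can't tell you which is which.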
Consider this data set.
If you see that the correlation is very low, near zero, you might assume there's no relationship between x and y in this case. You would be wrong. If you graph the data, you can see there's a clear nonlinear trend in the data set.
The correlation coefficient only measures the strength of a linear relationship between x and y.
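Here is a small sketch of that situation, using made-up data where y is exactly x squared. The relationship could hardly be stronger, but because it is a symmetric curve and not a line, the correlation coefficient comes out at zero. The function name `pearson_r` is chosen for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient computed from sums of deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


# Hypothetical data: y is completely determined by x, but along a parabola.
x = [-3, -2, -1, 0, 1, 2, 3]
y = [value * value for value in x]

r = pearson_r(x, y)
print(r)  # essentially zero, despite the perfect nonlinear relationship
```

The falling left half of the parabola and the rising right half cancel each other out in the calculation, which is why the coefficient can't see the pattern.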
It's important to always, always graph your data.
Correlation is a way to quantify the strength and the direction of a linear association, or a linear relationship, between two quantitative variables shown on a scatter plot. A strong linear association will have a correlation coefficient near positive 1 or negative 1. There are also moderate correlation coefficients and weak correlation coefficients. Weak linear associations will have a correlation coefficient near zero.
Also, a set of data might have a low correlation but a strong nonlinear association. Always plot your data, and then you'll see the association firsthand.
Good luck.
Source: This work adapted from Sophia Author Jonathan Osters.
The type of correlation present when two variables have a correlation coefficient generally less than or equal to -0.5.
Associations between two variables that can be modeled better with a curve than a line.
The type of correlation present when two variables have a correlation coefficient generally greater than or equal to 0.5.
The type of correlation present when two variables have a correlation coefficient generally between -0.5 and 0.5.