Source: Scatterplots by Katherine Williams
This tutorial talks about measures of variation. Now, another name for measures of variation is measures of spread. Measures of variation, or spread, give us important information about the distribution of data. Using only a measure of center is not enough. If we only look at where the center of our data is, and don't also look at the measures of variation or spread, we won't get a complete picture of how our data is distributed.
Examples of measures of variation or spread include range, standard deviation, and variance. Each of these three measures will be covered in another tutorial that goes through how to calculate them and how to interpret the value that you obtain.
One thing to note is that if you have a high measure of variation, then the data is very spread out from the center. You have a lot of values that are very low or very high, far away from the center. If you have a low measure of variation, the data is bunched up around the center. So let's look at an example.
Let's pretend we're looking at a very simplified example of the temperatures on three days in two different cities. In one city, the first day it's 0 degrees, the second day it's 25 degrees, and the third day it's 50 degrees. Now, I know that's not very likely. But it's going to help to illustrate our point.
And in the other city, we had temperatures of 24 degrees, 25 degrees, and 26 degrees. Now, in both cities, the middle value, the center, the median, is 25. So they both have the same center. But if we look at the spread, in the first city the data is a lot more spread out. It goes from a minimum of 0 to a maximum of 50. So this is highly varied and the data is very spread out from the center.
In the second city, the data is much less spread out. It goes only from 24 to 26. So this city has a very low measure of spread, very low variation. This is why it's important to look at both center and spread. If you're only looking at the center, you can have two very different data sets and not know the difference. So this has been your tutorial on the measures of variation.
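As a rough sketch of the two-city comparison, the short Python snippet below computes the median along with the range, variance, and standard deviation for each set of temperatures. The names `city_a`, `city_b`, and `summarize` are made up for illustration, and the choice of the population versions of variance and standard deviation is an assumption, since the tutorial does not say which version it has in mind.

```python
import statistics

# Temperatures from the example; the variable names are hypothetical.
city_a = [0, 25, 50]   # first city: very spread out
city_b = [24, 25, 26]  # second city: bunched up around the center


def summarize(temps):
    """Return one measure of center and three measures of spread."""
    return {
        "median": statistics.median(temps),
        "range": max(temps) - min(temps),
        # Assumption: population variance and standard deviation.
        # Swap in statistics.variance / statistics.stdev for the sample versions.
        "variance": statistics.pvariance(temps),
        "std_dev": statistics.pstdev(temps),
    }


summary_a = summarize(city_a)
summary_b = summarize(city_b)

for name, summary in (("first city", summary_a), ("second city", summary_b)):
    print(name, summary)
```

Both cities come out with a median of 25, but the first city has a range of 50 while the second has a range of only 2, and the variance and standard deviation are likewise much larger for the first city. That is the same point the example makes: the centers match, and only the measures of spread reveal the difference.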