Source: Image created by the author
This tutorial is going to explain different measures of variation and why they're necessary. It's not good enough to report just an average, or some other measure of center, when you're talking about a data set. Suppose I was going to compare and contrast the January high, low, and average temperatures for Buffalo Grove, Illinois versus Valdez, Alaska. Well, the average for both of these cities is 21 degrees in January.
However, if you look at the typical high temperature in Buffalo Grove, it's a little higher than the typical high temperature in Valdez. And the low temperature in Buffalo Grove is a little bit lower than the low temperature in Valdez. Buffalo Grove's temperatures, although they average the same as Valdez, are a little bit more variable. They're spread out a little bit more. It gets a little colder at night and a little warmer in the day. Valdez's temperatures seem a little bit more consistent. And so it would be inappropriate to just compare them based on their averages.
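To make that point concrete, here is a minimal sketch in Python. The numbers are made up for illustration and are not real temperature readings for either city; they are only chosen so that both lists average 21 while one is more spread out than the other. The names `variable_city` and `consistent_city` are hypothetical.

```python
from statistics import mean

# Hypothetical values in degrees, invented for illustration only.
# They are NOT actual climate data for Buffalo Grove or Valdez.
variable_city = [13, 17, 21, 25, 29]    # stands in for the more variable city
consistent_city = [19, 20, 21, 22, 23]  # stands in for the more consistent city

# Measure of center: the two averages come out the same.
mean_variable = mean(variable_city)
mean_consistent = mean(consistent_city)

# Simplest measure of spread: the range (largest value minus smallest value).
range_variable = max(variable_city) - min(variable_city)
range_consistent = max(consistent_city) - min(consistent_city)

print(mean_variable, mean_consistent)    # 21 21
print(range_variable, range_consistent)  # 16 4
```

The averages alone can't tell these two lists apart, but the ranges can.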
So we need to understand how variable the values are around whatever measure of center we're choosing to use. And just like measures of center, there are several measures of spread. There's the range, standard deviation, and something called the interquartile range. All of these are covered in more detail in another tutorial. But in any case, whatever measure of variation you use, a high value means that the data set is not consistent, that it's more spread out.
Whereas a low value indicates that the values are not very spread out, that they're tightly clustered together. And when the data does deviate from the center, it's not by very much. And you can have measures of spread or measures of variation that are zero, which would indicate that all the data values are, in fact, the same.
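If you'd like to see that rule in action, the following is a rough sketch using Python's standard `statistics` module. The three data sets are invented for illustration, and the helper names `value_range`, `interquartile_range`, and `spread_summary` are hypothetical. It also assumes the population standard deviation and Python's default way of computing quartiles; other conventions would give slightly different numbers, but the same overall pattern.

```python
from statistics import pstdev, quantiles


def value_range(data):
    """Range: largest value minus smallest value."""
    return max(data) - min(data)


def interquartile_range(data):
    """IQR: third quartile minus first quartile.

    Uses the default quartile method of statistics.quantiles;
    other quartile conventions can give slightly different results.
    """
    q1, _, q3 = quantiles(data, n=4)
    return q3 - q1


def spread_summary(data):
    """Collect the three measures of spread named in this tutorial."""
    return {
        "range": value_range(data),
        "std_dev": pstdev(data),  # population standard deviation (an assumption)
        "iqr": interquartile_range(data),
    }


# Made-up data sets that all have the same mean of 9.
spread_out = [2, 4, 6, 8, 10, 12, 14, 16]
clustered = [7, 8, 8, 9, 9, 10, 10, 11]
identical = [9, 9, 9, 9, 9, 9, 9, 9]

for name, data in [("spread_out", spread_out),
                   ("clustered", clustered),
                   ("identical", identical)]:
    print(name, spread_summary(data))
```

Whichever of the three measures you look at, the spread-out data set gets the largest value, the clustered one gets a smaller value, and the data set where every value is the same gets zero.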
So to recap, variation indicates the extent to which the data set values are spread out or close together. There are many ways to measure variation, and all of them follow a simple rule: a high value means that the data are more varied, and a low value means that the data are less varied. So we talked about measures of variation, which are the same thing as measures of spread. Variation and spread are synonyms that we'll be using fairly extensively throughout these tutorials. Good luck. We'll see you next time.