This tutorial will discuss the five number summary of a data set. You will learn about:
The five number summary takes larger data sets and makes them more manageable and easier to understand.
By reducing a large data set from lots of numbers to just five, this method helps summarize its center and variability.
Two of the numbers in the five number summary are the smallest and largest, the minimum and the maximum.
Suppose you have a list of the heights of the Chicago Bulls basketball team:
It's easy to see that the shortest person on the team is 71 inches tall and the tallest person on the team is 84 inches tall. Those are two of the numbers in the five number summary. The three remaining numbers will be based on the median.
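As a minimal sketch of this first step, the snippet below finds the minimum and maximum of a list of heights. The values in heights are hypothetical placeholders, not the actual roster; they were chosen only so that the results match the 71 and 84 inches quoted above.

```python
# Hypothetical heights in inches (NOT the real roster). They are illustrative
# values picked so the extremes match the 71 and 84 quoted in the text.
heights = [79, 84, 73, 81, 71, 78, 74, 83, 79, 75, 80, 73, 81, 77, 84]

smallest = min(heights)  # first number of the five number summary
largest = max(heights)   # last number of the five number summary

print("minimum:", smallest)
print("maximum:", largest)
```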
The median is a measure of the center of a data set: it's the middle value of an ordered set of data. Currently, the list is alphabetical by last name, so we should rearrange it into height order.
The number in the middle is the median, 79. Dividing at that point, you are left with two groups: a low group and a high group.
Then, take the median of each of those data sets. Now you have 74 in the low group, 81 in the high group, and 79 in the middle.
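The steps above (sort, find the median, split into a low and a high group, then take the median of each group) can be sketched as follows. The heights are again hypothetical placeholder values, picked so that the results agree with the 74, 79, and 81 in the text. Leaving the median out of both groups when the count is odd is one common convention and is an assumption here; other textbooks and software use slightly different rules.

```python
from statistics import median

# Hypothetical heights in inches (NOT the real roster).
heights = [79, 84, 73, 81, 71, 78, 74, 83, 79, 75, 80, 73, 81, 77, 84]

ordered = sorted(heights)  # height order instead of alphabetical order
n = len(ordered)
half = n // 2

# Split around the middle. With an odd count the median itself is left out
# of both groups (an assumed convention, not the only one in use).
low_group = ordered[:half]
high_group = ordered[half + 1:] if n % 2 else ordered[half:]

middle = median(ordered)            # median of the whole data set
low_median = median(low_group)      # median of the low group
high_median = median(high_group)    # median of the high group

print(low_median, middle, high_median)
```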
Those three numbers are called quartiles. They divide the ordered data set into four roughly equal parts. The first quartile is 74, the second quartile is the median, 79, and the third quartile is 81.
What you'll notice is that about 25% of the data falls at or below the first quartile, half of the data falls at or below the median, and about 75% of the data falls at or below the third quartile.
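A quick way to check these shares is to count how many values sit at or below each quartile. This is only a sketch on hypothetical placeholder heights (not the real roster); share_at_or_below is a helper name introduced here for illustration.

```python
# Hypothetical heights in inches (NOT the real roster).
heights = [79, 84, 73, 81, 71, 78, 74, 83, 79, 75, 80, 73, 81, 77, 84]
n = len(heights)


def share_at_or_below(cutoff):
    """Return the fraction of heights that are less than or equal to cutoff."""
    return sum(1 for h in heights if h <= cutoff) / n


# Quartile values taken from the text: 74, 79, and 81.
for label, cutoff in [("first quartile", 74), ("median", 79), ("third quartile", 81)]:
    print(f"{label} ({cutoff}): {share_at_or_below(cutoff):.0%} at or below")
```

With only 15 values and several ties, the shares come out near 25%, 50%, and 75% rather than exactly on them, which is why these percentages are best read as approximate for small data sets.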
So the five number summary consists of the minimum, the first quartile, the median, the third quartile, and the maximum. The benefit of this particular summary is that about 25% of the data falls within each of the four bands between those numbers, so you can see where most of the data values lie.
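Putting the pieces together, a five number summary can be sketched as one small function. The name five_number_summary and the sample heights are assumptions made for illustration (the heights are placeholders, not the real roster), and the quartiles use the median-of-halves approach described in this tutorial; statistical software may apply other quartile rules and return slightly different values.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, first quartile, median, third quartile, maximum).

    Quartiles are the medians of the low and high halves. When the count is
    odd, the overall median is excluded from both halves (assumed convention).
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two values")
    half = n // 2
    low_group = ordered[:half]
    high_group = ordered[half + 1:] if n % 2 else ordered[half:]
    return (
        ordered[0],
        median(low_group),
        median(ordered),
        median(high_group),
        ordered[-1],
    )


# Hypothetical heights in inches (NOT the real roster).
heights = [79, 84, 73, 81, 71, 78, 74, 83, 79, 75, 80, 73, 81, 77, 84]
summary = five_number_summary(heights)
print(summary)
```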
For instance, there are as many data values between 79 and 81 as there are between 74 and 79, but they fall in a narrower range. So you can tell the data are more clustered together between the median and the third quartile than they are between the first quartile and the median.
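The comparison above amounts to measuring the width of each band between neighboring summary numbers. A minimal sketch, using the five values from this tutorial (71, 74, 79, 81, 84); the dictionary layout and variable names are illustrative choices:

```python
# Five number summary from the text, in inches.
summary = {"min": 71, "Q1": 74, "median": 79, "Q3": 81, "max": 84}

labels = list(summary)
values = list(summary.values())

# Width of each band between neighboring summary numbers. Each band holds
# about a quarter of the data, so a narrower band means tighter clustering.
widths = {
    f"{labels[i]} to {labels[i + 1]}": values[i + 1] - values[i]
    for i in range(len(values) - 1)
}

for band, width in widths.items():
    print(f"{band}: {width} inches wide")

narrowest = min(widths, key=widths.get)
print("most tightly clustered band:", narrowest)
```

The band from the median to the third quartile is 2 inches wide, against 5 inches for the band from the first quartile to the median, which matches the clustering described above.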
To recap, the five number summary consists of the minimum, the first quartile, the median (second quartile), the third quartile, and the maximum. It allows us to understand where clusters of data points might be and where the data might be more spread out.
Thank you and good luck!
Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS