
Five Number Summary

Author: Sophia

What's Covered

This tutorial will discuss the five number summary of a data set. You will learn about:

  1. Five number summary
  2. Minimum and Maximum

1. Five Number Summary

The five number summary takes larger data sets and makes them more manageable and easier to understand.

Term to Know

  • Five Number Summary
  • A brief overview of a data set consisting of the minimum, the first quartile, the median, the third quartile, and the maximum.

By reducing a large data set from many numbers to just five, this method summarizes both its center and its variability.


2. Minimum and Maximum

Two of the numbers in the five number summary are the smallest and largest values in the data set: the minimum and the maximum.

Suppose you have a list of the heights of the Chicago Bulls basketball team:


It's easy to see that the shortest person on the team is 71 inches tall and the tallest person on the team is 84 inches tall. Those are two of the numbers in the five number summary. The three remaining numbers will be based on the median.
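As a minimal sketch of this step, the snippet below finds the minimum and maximum of a list of heights. The `heights` values are hypothetical (they are not the actual roster); they were chosen only so that the smallest and largest values match the 71 and 84 inches quoted above.

```python
# Hypothetical heights in inches (NOT the actual roster), chosen so that
# the smallest and largest values match the 71 and 84 quoted in the text.
heights = [79, 84, 71, 80, 74, 81, 76, 83, 72, 80, 75, 82, 73, 81, 77]

minimum = min(heights)  # smallest value in the data set
maximum = max(heights)  # largest value in the data set

print(minimum, maximum)  # 71 84
```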

The median is a measure of the center of a data set: it's the middle value of an ordered set of data. The list is currently alphabetical by last name, so it must first be rearranged into height order.


The number in the middle is the median, 79. Dividing at that point, you are left with two groups: a low group and a high group.

Then, take the median of each of those two groups. Now you have 74 as the median of the low group, 81 as the median of the high group, and 79 in the middle.
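The procedure just described (sort, take the median, split into a low and a high group, take the median of each group) can be sketched in Python as follows. The function name `five_number_summary` and the `heights` list are assumptions made for illustration; the data are hypothetical values picked to reproduce the numbers in this tutorial. Other textbooks and software use slightly different quartile conventions, so results may differ from tool to tool.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the median-of-halves method.

    Assumes at least two values. When the count is odd, the middle value
    is left out of both halves, as in the walkthrough above.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    low = ordered[:half]            # values before the middle position
    high = ordered[half + n % 2:]   # values after it (skips the middle if n is odd)
    return (ordered[0], median(low), median(ordered), median(high), ordered[-1])

# Hypothetical heights in inches (NOT the actual roster).
heights = [79, 84, 71, 80, 74, 81, 76, 83, 72, 80, 75, 82, 73, 81, 77]

print(five_number_summary(heights))  # (71, 74, 79, 81, 84)
```

With 15 values, the median is the 8th value in height order, and each group of 7 has its own median at its 4th value.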

The three numbers 74, 79, and 81 are called quartiles. They divide the ordered data set into four parts of roughly equal size: the first quartile (74), the second quartile, which is the median (79), and the third quartile (81).

Terms to Know

  • First/Lower Quartile
  • The number at which approximately 25% of the data set falls at or below that value.
  • Second Quartile/Middle Quartile/Median
  • The number at which approximately 50% of the data set falls at or below that value.
  • Third/Upper Quartile
  • The number at which approximately 75% of the data set falls at or below that value.
  • Quartiles
  • The values that divide the data set into four equal partitions.

What you'll notice is that 25% of the data falls at or below the first quartile. Half the data set falls at or below the median. And 75% of the data falls at or below the third quartile.

So the five number summary consists of the minimum, the first quartile, the median, the third quartile, and the maximum. The benefit of this particular summary is that about 25% of the data falls within each of the four bands between consecutive numbers. So it tells you where many of the data values lie.

For instance, there are about as many data values between 79 and 81 as there are between 74 and 79, but they fall in a narrower range. So you can tell the data are more clustered together between 79 and 81 than they are between 74 and 79.
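To make the comparison of ranges concrete, here is a small sketch that computes the width of each band between consecutive numbers of the summary quoted in this tutorial (71, 74, 79, 81, 84). The variable names and labels are illustrative only.

```python
# Five number summary from the tutorial: min, Q1, median, Q3, max (inches).
summary = (71, 74, 79, 81, 84)
labels = ["min to Q1", "Q1 to median", "median to Q3", "Q3 to max"]

# Width of each band: the difference between consecutive summary values.
widths = [upper - lower for lower, upper in zip(summary, summary[1:])]

for label, width in zip(labels, widths):
    print(f"{label}: {width} inches wide")
```

Each band holds roughly a quarter of the data, so the narrowest band (median to Q3, 2 inches wide) is where the values are packed most tightly, and the widest band (Q1 to median, 5 inches wide) is where they are most spread out.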

The five number summary makes it fairly clear that:

  • 25% of the data falls at or below the first quartile
  • 50% falls at or below the median
  • 75% falls at or below the third quartile
  • all the data falls at or below the maximum
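These percentages are approximate for small data sets, which a quick check can show. The sketch below counts the share of values at or below each cutoff. The helper `share_at_or_below` and the `heights` list are hypothetical, introduced only for illustration; the data are not the actual roster.

```python
# Hypothetical heights in inches (NOT the actual roster).
heights = [79, 84, 71, 80, 74, 81, 76, 83, 72, 80, 75, 82, 73, 81, 77]

def share_at_or_below(values, cutoff):
    """Fraction of the values that are less than or equal to cutoff."""
    return sum(v <= cutoff for v in values) / len(values)

for name, cutoff in [("Q1", 74), ("median", 79), ("Q3", 81), ("max", 84)]:
    print(f"{name} ({cutoff}): {share_at_or_below(heights, cutoff):.0%}")
```

For this 15-value example, 4 of 15 values are at or below the first quartile, 8 of 15 are at or below the median, and 12 of 15 are at or below the third quartile, so the shares are close to, but not exactly, 25%, 50%, and 75%.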

Summary

The five number summary consists of the minimum, the first quartile, the median (second quartile), the third quartile, and the maximum. It allows us to understand where clusters of data points might be and where the data might be more spread out.

Thank you and good luck!

Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS
