Common Core: 6.SP.4 S.ID.1

# Histograms

Author: Sophia Tutorial
##### Description:

This lesson will explain histograms.



Tutorial

This tutorial is going to teach you about histograms and binning. You will learn about:

1. Histograms
2. Binning
3. Histograms and Bar Graphs

## 1. Histograms

A histogram is a way to display the distribution of quantitative data.

Histogram

A distribution of data that shows the frequency of different ranges of values. Each frequency is the height of a bar.

When you have a quantitative data set, the values are often spread out over a large range.

Suppose an elementary school class in Muncie, Indiana, chooses to keep track of the high temperature on each of the 180 school days. In Indiana, the temperature can get low in the winter, down to 0℉, and may reach nearly 90℉ at the beginning or end of the school year. To understand the overall trend of the data, you might not be interested in every individual temperature. Instead, you might care about how many days were in the 20s (20℉ up to, but not including, 30℉), in the 30s (30℉ up to, but not including, 40℉), in the 40s, et cetera.

The idea that we can break those temperatures that occur over a wide range into more manageable intervals and categorize them that way is called binning.

Binning

The method of deciding what widths of categories should be used on a histogram

Binning allows us to make a frequency table out of those categories.

Suppose the Muncie School District recorded the temperature on every school day and then categorized each reading by whether it was in the 0s, 10s, 20s, 30s, 40s, 50s, 60s, 70s, 80s, or 90s (℉).

In the resulting frequency table, for example, one day of the school year hit the 90s and seven days were in the 80s. Once you decide on bin widths, you can create a frequency table and then a histogram.
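To make the step from raw readings to a frequency table concrete, here is a minimal Python sketch. The temperature list and the `frequency_table` helper are illustrative assumptions, not the class's actual data or method; the sketch only shows one way to assign each value to a bin that runs from its start value up to, but not including, the next start value.

```python
from collections import Counter

# Hypothetical high temperatures in °F (made up for illustration only).
temps = [3, 7, 12, 15, 18, 24, 27, 31, 35, 38,
         42, 45, 51, 58, 63, 67, 72, 84, 95]

def frequency_table(values, width):
    """Count how many values fall in each bin [start, start + width)."""
    # Floor-divide by the bin width to find each value's bin start.
    counts = Counter(int(v // width) * width for v in values)
    # Sort by bin start so the table reads from low to high.
    return dict(sorted(counts.items()))

table = frequency_table(temps, 10)
for start, count in table.items():
    print(f"{start} up to {start + 10}: {count} day(s)")
```

With a bin width of 10, a reading of 29.999 lands in the 20s bin and a reading of 30 lands in the 30s bin, which matches the "20℉ up to 29.999℉" convention described earlier.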

A histogram is somewhat similar to a bar graph: the temperatures, which are now our categories, go on the horizontal axis. One difference is that these categories are numbers, so it makes sense to put 0 first, 10 second, and 20 third. Our bins will go from 0 to 10, 10 to 20, 20 to 30, 30 to 40, et cetera. Our frequencies, just like in a bar graph, go up the vertical axis.

In this histogram, the first bin goes from 0 degrees to 10 degrees, and there are 10 days that fall into that category. The second bin goes from 10 degrees to 20 degrees, and because there are 16 days there, that bar goes all the way up to 16. The height of every bar comes from the corresponding count in the frequency table.
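As a rough illustration of how bar heights follow from the frequency table, the sketch below draws a sideways text histogram. The `text_histogram` helper is a hypothetical name introduced here, and only the two counts stated in the text (10 days and 16 days) are used; the remaining bins are left out because their counts are not given.

```python
def text_histogram(table, width):
    """Render one row per bin, with one '#' per day in that bin."""
    lines = []
    for start in sorted(table):  # bins must appear in numeric order
        label = f"{start:>3} to {start + width:<3}"
        lines.append(f"{label} | {'#' * table[start]}")
    return "\n".join(lines)

# Only the first two bins, whose counts are stated in the text.
first_two_bins = {0: 10, 10: 16}
chart = text_histogram(first_two_bins, 10)
print(chart)
```

Each row plays the role of one bar: the 0 to 10 row has 10 marks and the 10 to 20 row has 16.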

## 2. Binning

How does the way we bin data change the histogram?

In the original histogram, the data was classified by 10s. But what if you chose to classify it by 5-degree intervals instead? Instead of one bin from 0 to 10, what if those 10 days were split between 0 to 4 and 5 to 9, and the next bin were split between 10 to 14 and 15 to 19? Then the bins might look different.

Suppose four of the days in the 0s were between 0 and 4 degrees, and the other six were between 5 and 9 degrees. We took one bin and split it into two. If we do that with every bin, we end up with twice as many bins and twice as many bars on our histogram.

In this new histogram, the bars are not as tall as they were before, but they still give the same overall shape. However, there are not very many bars overall. There's a lot of data in one part of the graph and not a lot in the other parts. You'll note that in the 90 to 94 bin, there's no bar. The reason is that when we broke up the 90s bar, the one data value in the 90s was actually in the 95 to 99 bin. When there's no data in a particular bin, no bar extends up from the x-axis.
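The effect of halving the bin width can be sketched in a few lines of Python. The readings below are hypothetical, chosen only so that the 0s bin splits into 4 and 6 days and the single day in the 90s falls in the upper half, as in the example above; `bin_counts` is an illustrative helper, not part of the original lesson.

```python
from collections import Counter

def bin_counts(values, width, low, high):
    """Count values per bin [start, start + width), keeping empty bins as 0."""
    counts = Counter(int(v // width) * width for v in values)
    return {start: counts.get(start, 0) for start in range(low, high, width)}

# Hypothetical readings: ten days in the 0s and one day in the 90s.
temps = [0, 1, 3, 4, 5, 6, 6, 7, 8, 9, 97]

wide = bin_counts(temps, 10, 0, 100)    # bins of width 10
narrow = bin_counts(temps, 5, 0, 100)   # bins of width 5

print(len(wide), len(narrow))           # number of bins at each width
print(wide[0], narrow[0], narrow[5])    # one wide bin vs. its two halves
print(narrow[90], narrow[95])           # lower half of the 90s vs. upper half
```

The two narrow bins always add up to the wide bin they came from, and a narrow bin with a count of zero is the case where no bar is drawn.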

So binning is important. You can have problems if you make the bins too narrow or too wide. In the previous examples, we had bins of width 10 degrees and bins of width 5 degrees. You could have used bins of width 1 or 2 degrees and still had a legitimate histogram, but you might not have seen the same overall shape of the distribution.

There are two main problems you may have with binning: the pancake effect and the skyscraper effect:

• Bins that are too narrow can create the pancake effect: too many bins with almost nothing in them. You don't really get to see the overall shape of the data.
• Bins that are too wide can create the skyscraper effect: too few bins with lots of data in each. Suppose that your bins go from 0 degrees to 50 degrees and then 50 degrees to 100 degrees. You still don't see what the shape of the distribution looks like. You know that most of the data is in one bin and not the other, but you don't know where within that bin the data falls. The bins were too wide, so you don't get the overall understanding.
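Both effects can be seen by summarizing the same data at several bin widths. The sketch below uses made-up temperatures and a hypothetical `summarize` helper; it reports how many bins there are, how many are empty, and how tall the tallest bar is.

```python
from collections import Counter

def summarize(values, width, low=0, high=100):
    """Report bin count, empty-bin count, and tallest bar for a bin width."""
    counts = Counter(int(v // width) * width for v in values)
    heights = [counts.get(start, 0) for start in range(low, high, width)]
    return {"bins": len(heights),
            "empty": heights.count(0),
            "tallest": max(heights)}

# Hypothetical high temperatures in °F (made up for illustration only).
temps = [12, 18, 23, 25, 31, 33, 34, 36, 41,
         42, 44, 47, 52, 55, 58, 63, 67, 74]

for width in (1, 10, 50):
    print(width, summarize(temps, width))
```

With width 1, most of the 100 bins are empty and no bar is taller than 1 (the pancake effect). With width 50, there are only two bins and one of them holds most of the data (the skyscraper effect). Width 10 sits in between and shows the shape.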

## 3. Histograms and Bar Graphs

You might confuse a bar graph with a histogram from time to time. But there are a couple of differences between the two kinds of graphs.

There are two key differences:

1. In a histogram, the boxes touch. This makes sense because the bins run one into the other, as in the temperature example. The information goes right from the 0s into the 10s, so it makes sense to have the boxes right next to each other. In a bar graph, bars don't have to do that.
2. In a histogram, the order of the boxes matters. In a bar graph, typically, there's no reason to believe that one category has a higher value than another. Suppose that you have a bar graph of the number of students enrolled in different college majors. There's no reason to put economics further to the right than chemistry, because one is not numerically greater than the other. However, in a histogram the values further to the right are, in fact, numerically greater than the values to the left. Because histograms deal with higher numbers and lower numbers, the order of the boxes matters.

Histograms display distributions of quantitative data, typically data that is spread out over a wide range. You use binning to group the spread-out data and create bars using the frequencies in those bins. Histograms can look like bar graphs, but they are different. It's important to bin the data appropriately so that you don't get the pancake effect or the opposite problem, the skyscraper effect.

Thank you and good luck!

Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS

Terms to Know
Binning

The method of deciding what widths of categories should be used on a histogram

Histogram

A distribution of data that shows the frequency of different ranges of values. Each frequency is the height of a bar.