In this tutorial, you're going to learn about stem-and-leaf plots. This tutorial will discuss:
While these graphs have a funny name, they actually serve a really good purpose and are very versatile.
Stem-and-Leaf Plot/Stemplot
A distribution of quantitative data that shows natural numerical breaks in the data as categories called "stems" and individual values as "leaves."
Many quantitative data sets can be displayed in stem-and-leaf plots, and this is one of them.
This data set represents the 50 states in the United States, and the numbers are the percent of each state's college students who are enrolled in public colleges. So in one state, 95% of college students are in public schools, whereas in another state, only 52% are.
To create a stem-and-leaf plot, you’ll want to follow these steps:
1. Decide on a natural classification. Here, groups of 10 seem like an obvious choice. Those are going to be our bins. You should base your bins on some digit. In this case, go by the tens digit. If these numbers were in the hundreds, maybe you’d want to go by the hundreds digit, but you could still go by the tens if you wanted to.
2. Create “stems.” These are based on the bins you selected. A stem of 9 represents states where 90-something percent of college students are at public schools; a stem of 8 represents states where the number is in the 80s. Write the stems in order, least to greatest or greatest to least, up or down the page (the direction doesn’t really matter), to the left of a vertical line.
3. List the values by their ones digit ascending away from those stems. Those are considered the leaves. When it's done, it will look like this.
Notice that 43% of the students in one state were at a public school, etc. Further, notice that, if a value appears more than once, you will list it more than once. 80 appears three times, for example. And notice that those numbers are ascending away from the stem.
There's one more important feature of a stem-and-leaf plot. You need to be able to tell someone who's looking at this what they're looking at. You know that, in our graph, the “6 bar 2” means that there's a state that has 62% of its students going to public colleges. So tell the reader that by saying, in a key, “4 bar 3 means 43%.”
We're telling our reader how these numbers should be interpreted.
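The steps above, including the key, can be sketched in a few lines of Python. This is only an illustration: the function name `stem_and_leaf` and the sample percentages are hypothetical (the full 50-state data set is not reproduced here), and the sketch assumes two-digit whole-number values binned by the tens digit.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return the text lines of a stem-and-leaf plot for two-digit whole numbers."""
    leaves_by_stem = defaultdict(list)
    # Sorting first means each stem's leaves end up ascending away from the stem.
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # tens digit is the stem, ones digit is the leaf
        leaves_by_stem[stem].append(leaf)

    lines = []
    # Include every stem between the smallest and largest, even empty ones,
    # so that gaps in the data stay visible.
    for stem in range(min(leaves_by_stem), max(leaves_by_stem) + 1):
        leaves = " ".join(str(leaf) for leaf in leaves_by_stem.get(stem, []))
        lines.append(f"{stem} | {leaves}".rstrip())

    # The key tells the reader how to interpret a stem and a leaf together.
    key_stem, key_leaf = divmod(min(values), 10)
    lines.append(f"Key: {key_stem} | {key_leaf} means {key_stem}{key_leaf}%")
    return lines


# Hypothetical sample values, not the tutorial's actual data set.
sample_percents = [80, 43, 95, 62, 80, 71, 52, 87, 68, 80, 75, 84]
lines = stem_and_leaf(sample_percents)
print("\n".join(lines))
```

Repeated values, such as the three 80s in the sample, each get their own leaf, just as in the plot described above.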
In our previous graph, the 80s had more values than any other grouping. In fact, they had more than twice as many as any other single grouping, which looked a little strange.
Is there anything we can do about that? Suppose that you decided that tens was too wide of a bin. What you could do is break it down by fives, and then write two 8s, a low 8 and a high 8: 80 to 84 for the low, 85 to 89 for the high. If you’re going to split one bucket, you need to split them all.
If you split the stems into lows and highs, the graph will look like the one above. Because this separates the stems so that no single stem has so much more data than any other, this is a somewhat more appropriate visual than the first one.
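Splitting every stem into a low half (leaves 0 to 4) and a high half (leaves 5 to 9) can be sketched as follows. The function name `split_stem_plot` and the sample values are hypothetical, and the sketch again assumes two-digit whole numbers.

```python
def split_stem_plot(values):
    """Return stem-and-leaf lines with each stem split into low (0-4) and high (5-9)."""
    rows = {}
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf < 5 else 1  # 0 = low half, 1 = high half
        rows.setdefault((stem, half), []).append(leaf)

    stems = [stem for stem, _ in rows]
    lines = []
    # If one stem is split, all of them are: every stem gets a low and a high row.
    for stem in range(min(stems), max(stems) + 1):
        for half in (0, 1):
            leaves = " ".join(str(leaf) for leaf in rows.get((stem, half), []))
            lines.append(f"{stem} | {leaves}".rstrip())
    return lines


# Hypothetical sample values, not the tutorial's actual data set.
sample_percents = [80, 43, 95, 62, 80, 71, 52, 87, 68, 80, 75, 84]
split_lines = split_stem_plot(sample_percents)
print("\n".join(split_lines))
```

Each stem now appears twice, so a crowded row such as the 80s is spread over two shorter rows.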
Take a look at this set of high school GPAs for these students. Make a stem-and-leaf plot of these GPAs.
There are two main ways that this graph could be shown.
Suppose you are interested in the differences between the girls' GPAs (Amy, Holly, Jenny, Katherine, and so on) and the boys' GPAs. You could compare them by putting one group's leaves to the right of the stems and the other group's leaves to the left.
A back-to-back stem-and-leaf plot would look like this:
Back-to-Back Stem-and-Leaf Plot
Two stem-and-leaf plots on the same set of stems. This allows us to compare the distributions of two different categories.
This graph rounds the numbers again, saying that “3 bar 1 means the GPA rounds to 3.1.” Here, the girls' GPAs are on the left. The boys' GPAs are on the right. This allows you to compare the distributions of boys' GPAs to girls' GPAs, which shows that the girls' GPAs are typically a little bit higher.
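A back-to-back plot like the one described above can be sketched as follows. The function name `back_to_back` and both GPA lists are hypothetical (the tutorial's actual student GPAs are not reproduced here). The sketch rounds each GPA to one decimal place, uses the ones digit as the stem and the tenths digit as the leaf, and keeps leaves ascending away from the stems on both sides.

```python
def back_to_back(left_values, right_values):
    """Return lines of a back-to-back stem-and-leaf plot for values such as GPAs."""

    def leaves_by_stem(values):
        table = {}
        for value in sorted(values):
            # Round to tenths: 3.12 -> 31 -> stem 3, leaf 1.
            stem, leaf = divmod(round(value * 10), 10)
            table.setdefault(stem, []).append(leaf)
        return table

    left = leaves_by_stem(left_values)
    right = leaves_by_stem(right_values)
    stems = range(min(min(left), min(right)), max(max(left), max(right)) + 1)

    # Left-hand leaves are written in reverse so the smallest sits next to the stem.
    left_text = {
        stem: " ".join(str(leaf) for leaf in reversed(left.get(stem, [])))
        for stem in stems
    }
    width = max(len(text) for text in left_text.values())

    lines = []
    for stem in stems:
        right_text = " ".join(str(leaf) for leaf in right.get(stem, []))
        lines.append(f"{left_text[stem]:>{width}} | {stem} | {right_text}".rstrip())
    return lines


# Hypothetical GPAs for illustration only.
girls = [3.82, 3.41, 3.67, 2.93, 3.18]
boys = [2.48, 3.12, 2.77, 3.39, 1.96]
gpa_lines = back_to_back(girls, boys)
print("\n".join(gpa_lines))
print("Key: 3 | 1 means the GPA rounds to 3.1")
```

Because both groups share one column of stems, the two distributions can be compared row by row.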
Why use a stem-and-leaf plot instead of other graphical displays, like histograms or dot plots?
Stem-and-leaf plots have a couple of advantages:
The drawback to stem-and-leaf plots is that they become difficult to create and read if the data set is too big. For example, the graph at the beginning had 50 data values, and it might be very difficult to see all the data values at once. So a data set with 50 values is at the edge of what a stem-and-leaf plot can usefully display.
Stem-and-leaf plots are very useful displays of quantitative data. The reason to use a stem-and-leaf plot is that it is very versatile. To make one, start by creating bins from natural numerical breaks so that the reader can identify the numbers, and then make a key. To make the plot clearer, you can split stems, round the values, create leaves with double digits, or compare across categories using a back-to-back plot.
Thank you and good luck!
Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS