Source: Graphs created by author
In this tutorial, you're going to learn about the different shapes that distributions can take. Now, a distribution is a way to visually show how many times a variable takes a certain value. So it's the values the variable takes and how often.
Now, shape describes the data points as a whole. And what we're going to do is use qualifying descriptors to identify how the distribution of a data set looks when it's graphed. So let's run through a couple of examples.
One word that we can use to describe a distribution is the word symmetric. A symmetric distribution will have the same mean as it does its median. And if plotted, it will look like two mirror images on the same plot. So this distribution is symmetric. And this line here is both the mean and the median of this distribution.
This distribution is also symmetric. If you drew the mirror line, the line of symmetry, you could place it here. This one is also symmetric. The line of symmetry is right here. Now, this doesn't happen terribly often. Only a few distributions are actually truly symmetric.
Often we get distributions that look something like this. And they're close. But they're not exactly symmetric. And when you say the word symmetric, you mean exactly.
And so what happens is we end up needing to use qualifiers, like approximately symmetric or roughly symmetric or nearly symmetric, to make it clear that we don't mean that a histogram like this is exactly symmetric because it's not. If you take a look, this tail is just a little bit longer than this tail. So it's approximately symmetric. It's roughly symmetric. But it's not exactly symmetric.
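To make the difference between exactly and approximately symmetric concrete, here is a minimal sketch in Python. Both data sets are made up for illustration, and the helper name `is_mirror_symmetric` is just a name chosen here, not a standard function.

```python
from statistics import mean, median

# Made-up data sets for illustration only.
exactly_symmetric = [3, 4, 4, 5, 5, 5, 6, 6, 7]
roughly_symmetric = [3, 4, 4, 5, 5, 5, 6, 6, 8]  # right tail is a little longer


def is_mirror_symmetric(data):
    """Return True if the sorted values mirror each other around the midpoint."""
    ordered = sorted(data)
    center_sum = ordered[0] + ordered[-1]
    # Pair the smallest with the largest, the second smallest with the
    # second largest, and so on. Every pair must add up to the same total.
    return all(lo + hi == center_sum for lo, hi in zip(ordered, reversed(ordered)))


for name, data in [("exact", exactly_symmetric), ("rough", roughly_symmetric)]:
    print(name, mean(data), median(data), is_mirror_symmetric(data))
```

In the first data set the mean and the median are the same number and the mirror check passes. In the second, the mean sits slightly above the median and the mirror check fails, which is the situation where you would say approximately symmetric.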
Now, certain distributions aren't even close to symmetric. They're not approximately symmetric. Many asymmetric distributions are called skewed distributions. And they're characterized by a hump, which is sort of a dense grouping with lots of points at certain values, like in this bar, and some values that only have a few occurrences, like in these four bars here.
This is visually what we call a tail. And the tail occurs to one side of the median of the distribution. If the tail is on the right side of the median, we're going to call it skewed to the right, or positively skewed. And if the tail is to the left of the median, we're going to say the distribution is skewed to the left, or negatively skewed. We say positive and negative because right is more positive on the number line and left is more negative on the number line.
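One way to check the direction of a skew numerically is with a skewness coefficient, which is a different tool from looking at the tail on a graph. The sketch below uses the average cubed standardized deviation: a positive result suggests a longer right tail, and a negative result a longer left tail. The data, the function names, and the `tolerance` cutoff are all assumptions made for this example.

```python
from statistics import mean, pstdev


def skewness(data):
    """Average cubed standardized deviation (population version)."""
    m = mean(data)
    s = pstdev(data)
    return sum(((x - m) / s) ** 3 for x in data) / len(data)


def describe_skew(data, tolerance=0.1):
    """Label the shape. The tolerance is an arbitrary cutoff for this sketch."""
    g = skewness(data)
    if g > tolerance:
        return "skewed right (positively skewed)"
    if g < -tolerance:
        return "skewed left (negatively skewed)"
    return "approximately symmetric"


# Made-up data: a hump at low values with a tail stretching to the right.
right_tail = [1, 1, 1, 2, 2, 3, 4, 6, 9]
# Mirror image of the same data, so the tail points left instead.
left_tail = [10 - x for x in right_tail]

print(describe_skew(right_tail))
print(describe_skew(left_tail))
```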
One more term that we should know is a uniform distribution. Uniform distributions are a certain kind of symmetric distribution. Imagine you put a line of symmetry here. They are symmetric. And it's a distribution where every value occurs equally often, so something like if I rolled a die six times and got one 6, one 5, one 4, one 3, one 2, and one 1.
Now, here's the thing. We can use the same qualifiers that we were talking about with symmetric on a uniform distribution as well. So suppose that I rolled the die 600 times. I would expect about 100 of each. But maybe I only got 95 1s and I got 102 2s. The distribution will look almost uniform, so we can use words like approximately, nearly, or almost uniform in place of exactly uniform.
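The die example can be sketched in a few lines of Python. The 95 ones and 102 twos come from the example above; the other four counts are made up so that the total is 600. The function names and the 10 percent `tolerance` are also assumptions for this sketch, not a standard rule.

```python
# Counts from 600 rolls. Only the counts for 1 and 2 come from the example;
# the rest are made-up values chosen so the total is 600.
counts = {1: 95, 2: 102, 3: 101, 4: 99, 5: 104, 6: 99}

total = sum(counts.values())
expected = total / len(counts)  # about 100 of each face


def max_relative_deviation(face_counts):
    """Largest gap between a count and the expected count, as a fraction."""
    n = sum(face_counts.values())
    exp = n / len(face_counts)
    return max(abs(c - exp) / exp for c in face_counts.values())


def describe_uniformity(face_counts, tolerance=0.10):
    """Label the shape. The tolerance is an arbitrary cutoff for this sketch."""
    deviation = max_relative_deviation(face_counts)
    if deviation == 0:
        return "exactly uniform"
    if deviation <= tolerance:
        return "approximately uniform"
    return "not uniform"


print(total, expected, describe_uniformity(counts))
```

Rolling the die six times and getting one of each face would come back as exactly uniform, while the 600-roll counts come back as approximately uniform.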
And so to recap, distributions when graphed have many descriptors that we can use to describe their shape. One was symmetric. And symmetric distributions visually have mirror halves. And mathematically, a symmetric distribution has the same mean and median. They are the same number.
Uniform distributions are a specific type of a symmetric distribution that are visually very flat. And skewed distributions have a hump on one side of the median and a tail on the other side of the median. If the tail is on the right side of the median, we're going to call it skewed to the right, or positively skewed. And if the tail is to the left of the median, we're going to say the distribution is skewed to the left, or negatively skewed.
So we talked about distributions that may have been symmetric, or skewed to the right or positively skewed, skewed to the left or negatively skewed. And we talked about uniform distributions as well. Good luck. And we'll see you next time.
Source: Bimodal height distribution; Creative Commons: HTTP://EN.WIKIPEDIA.ORG/WIKI/FILE:BIMODALANTS.PNG. Other graphs created by the author.
In this tutorial, you're going to learn about unimodal distributions versus bimodal distributions. Uni means one. Bi means two. Modal refers to modes, so the names tell you how many modes each distribution has.
So oftentimes distributions will have a clear peak to their shape, and they won't peak anywhere but just one place on the distribution. So for instance, this distribution has a peak right here. This distribution peaks further to the right. And the next distribution peaks further to the left. But they all have a clear peak.
All of these are called unimodal distributions. And the tallest bar is called the mode. So this is the mode here. This is the mode there. And this is the mode here.
You might, though, have a distribution that will have two distinct regions with lots of data points and a gap in the middle. When this happens, two peaks form on the distribution, and those are both called modes. So a distribution like this is called bimodal. So there's a peak here and another peak over here.
Now, technically there's only one bin here that's the mode. It's the tallest one. But these are very tall relative to the others around them, so they're sort of local modes. And there are two of them: a peak, a gap, and a peak.
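The difference between the one tallest bin and the local modes can be sketched in Python. The bin counts below are made up, and `local_modes` is a name chosen for this example. It simply flags any bin that is taller than both of its neighbors, so on a noisy real histogram it would also flag small bumps that you would not call peaks.

```python
def local_modes(bin_counts):
    """Return the positions of bins that are taller than both neighbors."""
    peaks = []
    last = len(bin_counts) - 1
    for i, count in enumerate(bin_counts):
        # Treat the space beyond either end as shorter than any bin.
        left = bin_counts[i - 1] if i > 0 else -1
        right = bin_counts[i + 1] if i < last else -1
        if count > left and count > right:
            peaks.append(i)
    return peaks


# Made-up histogram heights: a peak, a gap, and a second peak.
bin_counts = [2, 5, 9, 4, 1, 3, 7, 6, 2]

# The single tallest bin is the mode in the strict sense.
global_mode = bin_counts.index(max(bin_counts))

print(global_mode)              # position of the tallest bin
print(local_modes(bin_counts))  # positions of both peaks
```

Note that a flat stretch of equally tall bins is not flagged by this simple comparison, which is one more reason to treat it only as a rough check.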
Now, sometimes you have a distribution that appears bimodal, one here and then a gap and then another one here. It appears to be bimodal. But upon further examination of heights, it's possible that you have two different distributions that happened to be graphed on the same set of axes.
So there might be some variable, some hidden variable, that causes the bimodality. And when viewed separately, you end up with two unimodal distributions. They just happened to be graphed on the same set of axes.
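Here is a small sketch of that idea in Python. The records are made up: each one pairs a group label, standing in for the hidden variable, with a measurement. The labels "A" and "B" and the helper `text_histogram` are assumptions for this example.

```python
from collections import Counter, defaultdict

# Made-up (group, measurement) records. The group label plays the role
# of the hidden variable.
records = (
    [("A", v) for v in [3, 4, 4, 5, 5, 5, 6, 6, 7]]
    + [("B", v) for v in [9, 10, 10, 11, 11, 11, 12, 12, 13]]
)


def text_histogram(values):
    """Draw one row of '#' marks per value, including empty values in between."""
    counts = Counter(values)
    rows = []
    for v in range(min(counts), max(counts) + 1):
        rows.append(f"{v:>3} {'#' * counts[v]}")
    return "\n".join(rows)


# All measurements on the same set of axes: two peaks with a gap between them.
combined = [v for _, v in records]
print("combined")
print(text_histogram(combined))

# The same measurements viewed separately by group: one peak each.
by_group = defaultdict(list)
for group, v in records:
    by_group[group].append(v)

for group, values in sorted(by_group.items()):
    print(f"group {group}")
    print(text_histogram(values))
```

The combined picture shows a peak at 5, an empty row at 8, and another peak at 11. Viewed one group at a time, each picture has a single peak.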
You might also encounter something like this if there were something like test scores with students who didn't study versus students who did study. You might find that there are two kind of extremes for that distribution of test scores.
And any distribution with more than two peaks is called multimodal. This distribution, for instance, has four peaks. But you can have the same issues with these as you did with the bimodal distribution, in that it may be multiple distributions graphed on the same set of axes.
And so to recap, some distributions are unimodal. Those are single-peaked distributions. Others are bimodal, which are clearly double-peaked, and some are even multimodal.
Now, sometimes a bimodal distribution is simply two unimodal distributions graphed together. And oftentimes there's a reason for the bimodality. So we talked about unimodal/single-peaked, bimodal/double-peaked, and multimodal/multiple peaks. Good luck, and we'll see you next time.