Source: Graphs created by author
In this tutorial, you're going to learn about the different shapes that distributions can take. Now, a distribution is a way to visually show how many times a variable takes a certain value. So it's the values the variable takes and how often.
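To make "the values the variable takes and how often" concrete, here is a minimal sketch in Python. The list `values` is a made-up sample, not data from the tutorial's graphs. It counts how many times each value occurs and prints one row of marks per value, which is a rough text version of the kind of graph being described.

```python
from collections import Counter

# Hypothetical sample: the values a variable took across 12 observations.
values = [2, 3, 3, 4, 4, 4, 4, 5, 5, 5, 6, 7]

# Count how many times each value occurs.
frequencies = Counter(values)

# Print a simple text histogram, one row per value, one mark per occurrence.
for value in sorted(frequencies):
    print(value, "#" * frequencies[value])
```

The longest row marks the value that occurs most often, and the short rows at either end are the values that occur only rarely.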
Now, shape describes the data points as a whole. And what we're going to do is use qualifying descriptors to identify how the distribution of a data set looks when it's graphed. So let's run through a couple of examples.
One word that we can use to describe a distribution is the word symmetric. A symmetric distribution has the same mean and median. And if plotted, its two halves will look like mirror images of each other. So this distribution is symmetric. And this line here is both the mean and the median of this distribution.
This distribution is also symmetric. If there were a mirror line, the line of symmetry, you could place it here. This one is also symmetric. The line of symmetry is right here. Now, this doesn't happen terribly often. Only a few distributions are actually truly symmetric.
Often we get distributions that look something like this. And they're close. But they're not exactly symmetric. And when you say the word symmetric, you mean exactly.
And so what happens is we end up needing to use qualifiers, like approximately symmetric or roughly symmetric or nearly symmetric, to make it clear that we don't mean that a histogram like this is exactly symmetric because it's not. If you take a look, this tail is just a little bit longer than this tail. So it's approximately symmetric. It's roughly symmetric. But it's not exactly symmetric.
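The difference between exactly symmetric and approximately symmetric can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the two lists are made up, the helper `is_exactly_symmetric` is a hypothetical name, and the check assumes whole-number values so that mirrored values can be compared exactly.

```python
from collections import Counter
from statistics import mean, median

def is_exactly_symmetric(data):
    """Return True if the counts mirror each other around the midpoint.

    Hypothetical helper; assumes integer values.
    """
    counts = Counter(data)
    # A value v has its mirror image at (min + max) - v.
    center_sum = min(data) + max(data)
    return all(counts[v] == counts[center_sum - v] for v in counts)

# Made-up data: the first list mirrors around 3, the second has one
# extra point that makes the right tail a little longer.
exact = [1, 2, 2, 3, 3, 3, 4, 4, 5]
nearly = [1, 2, 2, 3, 3, 3, 4, 4, 5, 6]

print(is_exactly_symmetric(exact), mean(exact), median(exact))
print(is_exactly_symmetric(nearly), mean(nearly), median(nearly))
```

In the first list the mean and the median land on the same number. In the second, the slightly longer right tail pulls the mean a little above the median, so approximately symmetric is the honest description.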
Now, certain distributions aren't even close to symmetric. They're not approximately symmetric. Many asymmetric distributions are called skewed distributions. And they're characterized by a hump, which is sort of a dense grouping with lots of points at certain values, like in this bar, and some values that only have a few occurrences, like in these four bars here.
This is visually what we call a tail. And the tail occurs to one side of the median of the distribution. If the tail is on the right side of the median, we're going to call it skewed to the right, or positively skewed. And if the tail is to the left of the median, we're going to say the distribution is skewed to the left, or negatively skewed. We say positive and negative because right is more positive on the number line and left is more negative on the number line.
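The positive and negative labels can also be tied to a number. The sketch below uses the moment coefficient of skewness, which is one common measure and is not something this tutorial defines, so treat it as an assumption added for illustration. The data lists and the function names `skewness` and `describe` are made up. The sign of the result matches the direction of the tail: positive for a right tail, negative for a left tail.

```python
from statistics import mean

def skewness(data):
    """Moment coefficient of skewness (population form).

    Assumed measure, not taken from the tutorial: the average cubed
    deviation from the mean, divided by the variance to the power 1.5.
    """
    m = mean(data)
    n = len(data)
    variance = sum((x - m) ** 2 for x in data) / n
    third_moment = sum((x - m) ** 3 for x in data) / n
    return third_moment / variance ** 1.5

def describe(data):
    """Turn the sign of the skewness into the tutorial's wording."""
    g = skewness(data)
    if g > 0:
        return "skewed to the right (positively skewed)"
    if g < 0:
        return "skewed to the left (negatively skewed)"
    return "no skew by this measure"

# Made-up data: a hump at low values with a tail of a few high values,
# and its mirror image with the tail on the low side.
right_tail = [1, 1, 1, 2, 2, 3, 4, 6, 9]
left_tail = [10 - x for x in right_tail]

print(describe(right_tail))
print(describe(left_tail))
```

The few large values in `right_tail` sit far above the mean, and cubing those distances keeps their sign, which is why the long tail decides whether the result is positive or negative.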
One more term that we should know is a uniform distribution. Uniform distributions are a certain kind of symmetric distribution. Imagine you put a line of symmetry here. They are symmetric. And it's a distribution where every value occurs equally often, so something like if I rolled a die six times and got one 6, one 5, one 4, one 3, one 2, and one 1.
Now, here's the thing. We can also use the same qualifiers as we were talking about with symmetric on a uniform distribution as well. So suppose that I rolled the die 600 times. I would expect about 100 of each. But maybe I only got 95 1s and I got 102 2s. The distribution will look almost uniform, so we can use the words approximately, nearly, or almost uniform in place of exactly uniform.
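The 600-roll experiment is easy to try as a simulation. Here is a minimal sketch in Python, assuming a fair six-sided die modeled with the standard `random` module. The seed value is an arbitrary choice so that the run is repeatable, and the counts it produces are simulated, not the 95 and 102 mentioned above.

```python
import random
from collections import Counter

# Fixed seed so the run is repeatable (arbitrary choice, an assumption).
rng = random.Random(0)

# Simulate 600 rolls of a fair six-sided die.
rolls = [rng.randint(1, 6) for _ in range(600)]
counts = Counter(rolls)

# Show how many times each face came up.
for face in range(1, 7):
    print(face, counts[face])

# Largest gap between any face's count and the expected 100.
largest_gap = max(abs(counts[face] - 100) for face in range(1, 7))
print("largest gap from 100:", largest_gap)
```

Each count should land near 100 without usually hitting it exactly, which is the situation where approximately uniform is the right description.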
And so to recap, distributions when graphed have many descriptors that we can use to describe their shape. One was symmetric. Symmetric distributions visually have mirror halves. And mathematically, that means they have the same mean and median. Their mean and median are the same number.
Uniform distributions are a specific type of symmetric distribution that is visually very flat. And skewed distributions have a hump on one side of the median and a tail on the other side of the median. If the tail is on the right side of the median, we're going to call it skewed to the right, or positively skewed. And if the tail is to the left of the median, we're going to say the distribution is skewed to the left, or negatively skewed.
So we talked about distributions that may have been symmetric, or skewed to the right or positively skewed, skewed to the left or negatively skewed. And we talked about uniform distributions as well. Good luck. And we'll see you next time.
A display of data that shows the values the data take and how often those values occur.
A distribution whose left and right halves mirror each other around a "mirror line" at the median of the distribution. Its mean and median are the same.
A distribution where the majority of values are on one side of the distribution, and there are only a few values on the other.
A distribution where the majority of values are low, and there are only a few high values that form a "tail" to the right of the median.
A distribution where the majority of values are high, and there are only a few low values that form a "tail" to the left of the median.
A distribution where all values are equally likely.