In this lesson, students will learn how to represent the distribution of skewed data.
If you recall from a previous lesson, normal distributions have density curves that are symmetric and bell shaped. The mean, median, and mode of the normal distribution are all the same and equal to the center value of the density curve. However, there are situations where, unlike normal distributions, a distribution may not be symmetrical. We call these distributions skewed distributions.
The first distribution that you see below is right skewed. It reflects a situation in which a lot of values are concentrated toward the lower end of the distribution relative to the higher end. This is typically how home prices are distributed. The second distribution is left skewed, and it reflects a situation in which a lot of values are concentrated toward the higher end of the distribution relative to the lower end. A good example of this distribution would be the mileage on the odometers of used cars.
Refer back to the home values graph. On the horizontal axis, you have the price listed in thousands of dollars. The range is from around $100,000 up to almost $800,000. The majority of the values are concentrated right around the $300,000 mark. This is going to be a right skewed, or positively skewed, distribution curve.
Why is it right skewed? Because most of the observations sit right around that $300,000 point, but there are also some homes that are priced much higher. Those high prices pull the mean up while leaving the median, or the middle value, relatively low.
In this case, reading the graph along the horizontal axis from left to right, the mode would be the smallest value, then the median, and then the mean.
When you look at a right skewed distribution, the tail on the far right-hand side points to the right. Think of the tail as an arrow pointing to the right to remember it's a right skewed distribution. The direction in which the tail points gives us a sense of what kind of distribution we're dealing with. In the home values example, the mean would be $400,000, which would be higher than the median at $325,000 and higher than the mode of $300,000. A mean that is higher than the median and the mode is a key indicator of a right skewed, or positively skewed, distribution.
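As a quick check of that ordering, here is a minimal Python sketch. The eight home prices (in thousands of dollars) are made up for illustration and are not read off the graph; they were chosen so that the mean, median, and mode match the $400,000, $325,000, and $300,000 figures quoted in the lesson.

```python
import statistics

# Hypothetical home prices in thousands of dollars (illustrative only,
# not data from the lesson's graph).
home_prices = [250, 300, 300, 300, 350, 400, 550, 750]

mean_price = statistics.mean(home_prices)      # sum of values / count
median_price = statistics.median(home_prices)  # middle of the sorted values
mode_price = statistics.mode(home_prices)      # most frequent value

print("mean:", mean_price)
print("median:", median_price)
print("mode:", mode_price)

# In a right skewed sample the order is mode < median < mean.
print("right skewed ordering:", mode_price < median_price < mean_price)
```

The two most expensive homes, at 550 and 750, are what lift the mean above the median here; the most common price stays at 300.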
An example of a left skewed distribution, on the other hand, would be the mileage on used cars. If you look at the distribution of the mileage of used cars, you notice that there are similarities to the home values graph. However, it's not going to look exactly the same, because the tail is on the other side. This type of distribution is called a left skewed distribution or a negatively skewed distribution. In this case, the mode is greater than the median and the median is greater than the mean.
You see relatively few cars at the lower end of the mileage range. This is because people might wait a long time before trading in their vehicles, so most used cars have already accumulated higher mileage. The result is a left skewed distribution. Notice that the tail points to the left.
So how do we identify differences between distributions? Look at this graph that shows a normal distribution. It's symmetrical. The mean, the median, and the mode would be identical, and they'd be the value right there in the middle.
Bank account balances would be an example of a skewed distribution, and it would be right skewed. Some people have quite a bit of money in their bank accounts, but many don't. In this situation the mode and the median are going to be relatively low, and the mean is going to be higher. That's simply because a few large balances pull the mean up.
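To see how a single large balance pulls the mean up while leaving the median alone, consider the following small Python sketch. The balances are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import statistics

# Hypothetical account balances in dollars (illustrative only).
balances = [200, 350, 500, 500, 800, 1200]

mean_before = statistics.mean(balances)
median_before = statistics.median(balances)

# Add one account with a very large balance.
balances_with_outlier = balances + [50000]

mean_after = statistics.mean(balances_with_outlier)
median_after = statistics.median(balances_with_outlier)

print("mean before:", mean_before, "mean after:", mean_after)
print("median before:", median_before, "median after:", median_after)
```

With these made-up numbers the mean rises from under $600 to $7,650, while the median stays at $500. That gap between the mean and the median is what a long right tail produces.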
For an example of a left skewed distribution, look at the heart rates of a sample of people. You're going to see the heart rate measured in beats per minute, and the mean is probably somewhere between 70 and 80 beats per minute. You will see some heart rates that are higher than that and some that are lower. However, there will be very few people with a very low heart rate. Those few low values form a tail on the left side, which gives you a left skewed distribution.
If the mean is greater than the median and the mode, it is a right skewed distribution. If the mean is equal to the median and the mode, it's going to be a symmetric distribution, such as the normal distribution. If the mean is less than the median and the mode, it is a left skewed distribution.
Source: This work is adapted from Sophia author Dan Laub.