Source: Graphs created by the author
In this tutorial, you're going to learn about something called the 68-95-99.7 rule. The name sounds strange, but it'll make a whole lot more sense in a little bit.
This is the normal distribution. The normal distribution is single peaked, and symmetric. And so what we have is a distribution that can be described exclusively by its mean and standard deviation. All the normal curves that we're going to deal with, all the normal distributions look exactly the same. They all look exactly like this.
The only thing that might make them a little different is some of them have a wider standard deviation, and some of them are centered at different places. But since that's the only difference, we can describe any normal distribution by writing N(mean, standard deviation).
When we first calculated standard deviations, we stated that a good amount of the data points were going to fall within one standard deviation of the mean. And that wasn't super well-defined, but in a normal distribution it is well defined. In a normal distribution you get a good amount of the data points in the first standard deviation.
Now how much of that? It's 68%. 68% of all the data falls within the first standard deviation of the mean. From one standard deviation below the mean, here, to one standard deviation above the mean, here.
If we go out yet another standard deviation, two standard deviations below the mean, all the way up to two standard deviations above the mean, we get 95% of the data points. And if we go all the way out to three standard deviations below the mean, and three standard deviations above, we get almost the entire set of data. 99.7%.
This is why it's called the 68-95-99.7 rule. Let's see it again. 68. 95. 99.7 rule. One, two, and three standard deviations out.
And so here's the rule written out. About 68% of the data values are within one standard deviation of the mean. 95% are within two standard deviations of the mean, that means above or below. And about 99.7% of the data, almost all the data, fall within three standard deviations of the mean.
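As a quick check on where those three numbers come from, here is a minimal sketch in Python. It assumes the standard result that the fraction of a normal distribution within k standard deviations of the mean equals erf(k / sqrt(2)). The function name `fraction_within` is just a label chosen for this illustration.

```python
import math

def fraction_within(k: float) -> float:
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Compare the computed values with the rounded 68, 95, and 99.7 from the rule.
for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {fraction_within(k) * 100:.2f}%")
```

The printed values land slightly above 68, 95, and 99.7, which is why the rule is stated with the word "about."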
Because of the symmetry of the normal distribution, we can actually examine this rule further. We can say that 68% of the data falls within one standard deviation. But because of the symmetry, that means that 34% falls between one standard deviation below the mean and the mean. And another 34% falls between the mean and one standard deviation above the mean.
We can continue on this logic to say that each of these green bars, between one and two standard deviations from the mean, contains 13 and 1/2%. The bar on the left holds 13 and 1/2%, and because of the symmetry it's the same on the right side. We obtained 13.5% by saying that between two standard deviations below the mean and two standard deviations above the mean, that's 95%.
68% is within here, and so the remaining 27, 95 minus 68, fall within the two green bars. And because they have the same area it's 13 and 1/2%. Using that same logic again we can understand that about 2.35% of the data points fall within the blue bars.
And way, way out in the tails. You get almost none of it, but it's worth talking about anyway because it makes the full 100%. Another 0.15% falls within each of those tails further out than three standard deviations away.
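The symmetry arithmetic above can be sketched in a few lines of Python. This is only an illustration that starts from the rounded 68, 95, and 99.7 values. The names `WITHIN` and `band_percents` are made up for this sketch.

```python
# Rounded percents within 1, 2, and 3 standard deviations, taken from the rule.
WITHIN = {1: 68.0, 2: 95.0, 3: 99.7}

def band_percents():
    """Split the rule into one-sided bands using symmetry.

    Returns the percent in: mean to 1 SD, 1 to 2 SD, 2 to 3 SD, and beyond 3 SD,
    each for ONE side of the mean.
    """
    inner = WITHIN[1] / 2                     # half of 68
    middle = (WITHIN[2] - WITHIN[1]) / 2      # half of 95 - 68
    outer = (WITHIN[3] - WITHIN[2]) / 2       # half of 99.7 - 95
    tail = (100 - WITHIN[3]) / 2              # half of 100 - 99.7
    # Round to two decimals to hide floating-point noise.
    return tuple(round(p, 2) for p in (inner, middle, outer, tail))

inner, middle, outer, tail = band_percents()
print(inner, middle, outer, tail)

# Both sides together should account for the full 100%.
print(round(2 * (inner + middle + outer + tail), 2))
```

Doubling the four bands gives back 100%, which matches the point that the tiny tails are what complete the total.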
So let's do an example. A particular battery from a company has a lifetime that is normally distributed with a mean of 500 hours, and standard deviation of 18 hours. What percent of batteries last between 482 and 518 hours? Pause the video and then come back to it.
What you should have come up with is it's 68%. Between one standard deviation below, and one standard deviation above. That was a softball question. Try this one. What percent of batteries last between 446 hours, and 536 hours?
What you should have come up with this time was-- we can take a look at it this way. 446 is three standard deviations below the mean. 536 is two standard deviations above the mean. We could calculate it as the full 95% for this green area, plus one of these 2.35% blue bars.
Or we could have calculated it as 99.7%, which is all of this blue area, but we don't want this part, which is 2.35%. Either way, we end up with 97.35%.
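Here is one way the battery calculation could be sketched in Python, adding up the bands between the two endpoints. The mean of 500 and standard deviation of 18 come from the example. The table `BANDS` and the function `percent_between` are hypothetical names for this illustration, and the sketch only accepts endpoints that sit on whole numbers of standard deviations from -3 to 3.

```python
MEAN, SD = 500, 18  # battery lifetime in hours, from the example

# Approximate percent of data in each one-standard-deviation-wide band,
# keyed by the band's lower edge (in standard deviations from the mean).
BANDS = {-3: 2.35, -2: 13.5, -1: 34.0, 0: 34.0, 1: 13.5, 2: 2.35}

def percent_between(low_hours, high_hours):
    """Approximate percent of batteries lasting between the two lifetimes."""
    z_low = (low_hours - MEAN) / SD
    z_high = (high_hours - MEAN) / SD
    for z in (z_low, z_high):
        # The rule only applies at whole numbers of standard deviations.
        if not z.is_integer() or abs(z) > 3:
            raise ValueError("endpoints must be -3 to 3 whole standard deviations from the mean")
    total = sum(BANDS[z] for z in range(int(z_low), int(z_high)))
    return round(total, 2)

print(percent_between(482, 518))  # one below to one above
print(percent_between(446, 536))  # three below to two above
```

The two calls reproduce the 68% and 97.35% answers from the examples.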
Try one more. What percent of batteries last longer than 464 hours? Pause the video one more time.
What you should have come up with is-- there's a couple of different ways to do this. 464 is two standard deviations below the mean. You could have just started adding things up. 13 and 1/2, 34, 34, 13 and 1/2, 2.35, and 0.15. That would work.
Another way to do it would be to say, oh. 95% plus this remaining 2 and 1/2% in this entire tail. Or you could have said, well, take a look at this entire curve, that's 100%, and the only part we don't want is this 2 and 1/2% on the left.
In whatever way that you do it, you should end up with the same answer, 97 and 1/2% of batteries last longer than 464 hours.
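The "just start adding things up" approach can be sketched in Python like this. The list `BANDS` and the function `percent_above` are names invented for this illustration. The mean and standard deviation are the ones from the battery example, and the sketch assumes the cutoff sits on a whole number of standard deviations from -3 to 3.

```python
MEAN, SD = 500, 18  # battery lifetime in hours, from the example

# Approximate percent in each region, left to right:
# left tail, -3 to -2, -2 to -1, -1 to 0, 0 to 1, 1 to 2, 2 to 3, right tail.
BANDS = [0.15, 2.35, 13.5, 34.0, 34.0, 13.5, 2.35, 0.15]

def percent_above(hours):
    """Approximate percent of batteries lasting longer than the given lifetime."""
    z = (hours - MEAN) / SD
    if not z.is_integer() or abs(z) > 3:
        raise ValueError("cutoff must be -3 to 3 whole standard deviations from the mean")
    first_band = int(z) + 4  # index of the first region to the right of the cutoff
    return round(sum(BANDS[first_band:]), 2)

print(percent_above(464))  # two standard deviations below the mean
```

Subtracting the unwanted left-hand part from 100% gives the same number, so either route can be used as a check on the other.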
So to recap. The 68-95-99.7 Rule is a way to generate approximate percents of values that will fall within a particular interval of the normal distribution, as long as the endpoints are integer numbers of standard deviations away from the mean. These are very close to the correct percents, but they're approximate values.
We can also use the symmetry of the normal distribution to find more percents than just those 68, 95, and 99.7, like we did in a couple of those last examples. This rule does not work if the values are not on integer standard deviations. That means whole numbers of standard deviations away from the mean. One, two, three, negative 1, negative 2, and negative 3.
So we talked about the 68-95-99.7 Rule, and all of its different permutations. Good luck, and we'll see you next time.