Source: All graphs created by Dan Laub;Image of bottled water, PD, http://www.clker.com/clipart-water-13.html; Image of smart owl, PD, http://www.clker.com/clipart-smart-owl-2.html; Image of Statue of Liberty, PD, http://www.clker.com/clipart-11386.html; Image of rain, PD, http://www.clker.com/clipart-6488.html; Image of speed limit sign, PD, http://www.clker.com/clipart-13015.html
Hi, Dan Laub here. And in this lesson, we're going to discuss normal distributions and probability. But before we do so, let's cover the objective for this lesson. By the end of this lesson, you should be able to understand the relationship between the probability of events occurring and the area located under a normal distribution curve. So let's get started.
Remember from previous lessons that a density curve in a normal distribution has a bell-shaped curve with the mean located in the middle of the curve, as pictured here. The idea behind using experimental probability is to determine the likelihood of an event occurring. And normal distributions help us do this, since they represent a wide variety of real world situations for which we may be interested in determining such likelihoods.
Typically, when large enough samples are drawn from a population that we are interested in, such data takes on a normal distribution. This is a topic that we will explore further in this lesson. For example, consider the distribution of a volume of bottled water.
While a bottling company may attempt to put 500 milliliters of water in each bottle, the equipment, people, and processes used to bottle the water are not perfect. And as a result, it is possible that some bottles will contain more or less than the stated 500 milliliters. If enough random samples of bottles are taken, we will start to get a sense for how much the volumes vary from 500 milliliters. And we'll likely see a pattern of distribution that resembles a normal distribution, in the sense that most values are clustered around the mean, with very few values being far from it.
Another example of a distribution that resembles a normal distribution when a large number of random samples are drawn is intelligence quotient, or IQ scores. Obviously, a small random sample of individuals would not provide us with a very representative distribution of the population as a whole. But with enough samples, we will start to recognize a normal distribution.
When looking at a normal distribution, the total area under the bell-shaped curve is equal to 1% or 100% since all possible outcomes of an event are going to fall somewhere along this curve. Generally speaking, the area that we see under the curve that exists between two points that represent specific outcomes, is equivalent to the probability of any value falling between those two values occur.
For example, let's consider the average rainfall for New York City over the course of a year. Suppose that we are interested in the probability that a given year has above average rainfall, and want to use a graph of a normal distribution to estimate that probability. Well, as you can see here, since the mean or average rainfall is indicated by the midpoint of the normal distribution curve, the probability that a given year would have a greater than average amount of rainfall is indicated by the area of the graph under the curve and to the right of the mean. Since this represents 50% of the area under the curve, the probability is equal to 0.50.
In this example, as well as others, the probability of a certain range of outcomes equals the area under the normal curve located between those outcomes. As you may recall, a standard deviation provides one with a sense of how closely data happens to be centered around the mean of a data set. For normal distributions, the mean is the center of the distribution. While the standard deviation is represented by how far away from the mean it falls on the horizontal axis.
Let's consider the example of the speed of drivers on a particular section of a major highway, assuming that it follows a normal distribution. Let's further assume that the mean speed of all drivers over a given period time is 74 miles per hour with a standard deviation of five miles per hour. As you can see in the graph, the mean of 74 miles per hour is located in the middle of the distribution, with a number of standard deviations away from the mean indicated on the horizontal axis.
Suppose that we are interested in determining the probability that a given driver is driving between 75 and 80 miles per hour. By looking at the area indicated here on the graph, we can see that that probability of a given driver driving between these two speeds is approximately 0.306. Now, what if we were interested in the probability of a driver going faster than 84 miles per hour? Well, in this case, you can see the area highlighted on the graph, which translates to a probability of approximately 0.023.
In another situation, suppose that we wanted to know the probability that someone was driving slower than 70 miles per hour. As you can see on the graph here, the area highlighted suggests that this probability is roughly equal 2.212. So let's go back to our objective just to make sure we cover what we said we would.
We wanted to be able to understand the relationship between the probability of events occurring and the area located under a normal distribution curve, which we did. We went through several different examples and illustrated how the area under the curve is representative of the probability of an event occurring. So again, my name is Dan Laub. And hopefully you got some value from this lesson.