Source: All graphs created by Dan Laub; Image of Facebook icon, PD, http://bit.ly/1DWn3hK; Image of shoe, PD, http://www.clker.com/clipart-high-heels-red-shoe.html; Image of z table, PD, http://bit.ly/1NS7pI5
Hi, Dan Laub here. In this lesson, we're going to discuss determining the likelihood of a mean. But before we get started, let's cover the objective for this lesson. By the end of this lesson, you should be able to determine the likelihood of a mean occurring by using a sampling distribution. So let's get started.
Remember that when taking random samples, the sample means will generally follow a normal distribution if the population is normally distributed or the sample size is sufficiently large. Knowing this, we are focused on the distribution of the sample means and not concerned with the individual values of a variable. Recall that if we are given a sample mean, we can determine a z-score.
Sample means that have large positive z-scores or large negative z-scores are quite unlikely to occur. In fact, we can think of those with large positive z-scores as unusually large sample means, while those that have large negative z-scores would be considered unusually small sample means.
When looking at a normally distributed variable, approximately 68% of observations will fall within one standard deviation of the mean, being either greater than or less than the mean value. Additionally, 95% of observations will fall within two standard deviations of the mean. And 99.7% of observations will fall within three standard deviations of the mean.
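As a quick check of these percentages, here is a minimal Python sketch. It is not part of the lesson. It assumes Python's standard-library `statistics.NormalDist`, and the helper name `share_within` is made up for illustration.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)


def share_within(k):
    """Proportion of observations within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(f"within {k} standard deviation(s): {share_within(k):.4f}")
```

The three printed values should come out close to 0.68, 0.95, and 0.997, matching the percentages above.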
This being the case, 95% of sample means will have a z-score that falls between negative 2 and 2. Sample means in this range indicate that they are generally likely to occur within a particular sample.
When we know the z-score of a sample mean, we can determine the likelihood of its occurrence by looking up the area corresponding to the specific z-score in a table. The area, which extends from the z-score moving away from the mean, is called a one-sided tail area. If we double this one-sided tail area, we get the two-sided tail area.
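Instead of a printed table, the same areas can be computed directly. The sketch below is one possible way to do it, assuming Python's standard-library `statistics.NormalDist`. The function name `tail_areas` and the example z-score of 2.0 are illustrative choices, not something taken from the lesson.

```python
from statistics import NormalDist


def tail_areas(z):
    """Return (one_sided, two_sided) tail areas for a z-score."""
    # Area under the standard normal curve from |z| outward,
    # that is, moving away from the mean.
    one_sided = 1.0 - NormalDist().cdf(abs(z))
    # The two-sided tail area is simply double the one-sided area.
    return one_sided, 2.0 * one_sided


one_sided, two_sided = tail_areas(2.0)
print(f"z = 2.0: one-sided {one_sided:.4f}, two-sided {two_sided:.4f}")
```

Using the absolute value of z means the function gives the same tail areas for a z-score of negative 2 as for a z-score of 2, since the curve is symmetric.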
Consider the example of the number of friends that one may have on Facebook. If the population mean is 175 and the standard deviation is 40, the large majority of observations, roughly 95% of them, will fall between z-scores of negative 2 and 2, or between 95 and 255 friends.
In this distribution, it would be quite unlikely that the number of Facebook friends of a randomly chosen person would be a very high number, such as 500. It would also be quite unlikely that it would be very low, such as 20. On the other hand, some very likely numbers of friends for a randomly selected person would be 150 or 220. In other words, values that fall within a couple of standard deviations of the mean number of 175.
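The Facebook numbers can be reproduced with a few lines of arithmetic. This is only a sketch in Python. The mean of 175 and standard deviation of 40 come from the example, while the variable names and the cutoff of two standard deviations as the test for "likely" are assumptions made for illustration.

```python
population_mean = 175  # mean number of friends, from the example
population_sd = 40     # standard deviation, from the example

# Range of values covered by z-scores between -2 and 2.
low = population_mean - 2 * population_sd
high = population_mean + 2 * population_sd
print(f"roughly 95% of people: {low} to {high} friends")

# z-score for each number of friends mentioned in the example.
z_scores = {
    friends: (friends - population_mean) / population_sd
    for friends in (500, 20, 150, 220)
}

for friends, z in z_scores.items():
    verdict = "within" if abs(z) <= 2 else "outside"
    print(f"{friends} friends: z = {z:.2f} ({verdict} two standard deviations)")
```

The values 500 and 20 land far outside two standard deviations, while 150 and 220 land inside, which is why the first pair is unlikely and the second pair is likely.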
Suppose that we were interested in the distribution of women's shoe sizes, and knew that the mean of the population was a size 8.4 and the population standard deviation was 1.3. How can we determine the likelihood of the sample mean being equal to a particular shoe size? Well, first of all, the normal distribution should be changed to a standard normal distribution by using the z-score equation: the z-score is equal to the value minus the mean, divided by the standard deviation.
Next, let's assume that the sample size is 100. The standard deviation of the sample means is then the population standard deviation divided by the square root of the sample size, or 1.3 divided by 10, which is 0.13. If the sample mean is a size 8.21, the z-score would be equal to 8.21 minus 8.4, with the difference divided by 0.13, or approximately negative 1.46. If we were to consider a sample mean of 8.54 instead, the z-score would be equal to 8.54 minus 8.4 divided by 0.13, or approximately 1.08.
When considering a sample mean of let's say, size 8.45, we would determine the z-score by using this formula-- 8.45 minus 8.4 divided by 0.13. In this case, the z-score would be equal to approximately 0.38. Since this is less than one standard deviation above the population mean, it is very likely that a sample mean could be equal to 8.45.
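The three z-scores in the shoe-size example can be checked with a short Python sketch. It assumes that the 0.13 used in the lesson is the population standard deviation of 1.3 divided by the square root of the sample size of 100. The names `standard_error` and `z_score` are illustrative.

```python
import math

population_mean = 8.4  # mean shoe size, from the example
population_sd = 1.3    # population standard deviation, from the example
sample_size = 100      # sample size, from the example

# Assumed: standard deviation of the sample means = sd / sqrt(n) = 0.13.
standard_error = population_sd / math.sqrt(sample_size)


def z_score(sample_mean):
    """z-score of a sample mean within the sampling distribution."""
    return (sample_mean - population_mean) / standard_error


for sample_mean in (8.21, 8.54, 8.45):
    print(f"sample mean {sample_mean}: z = {z_score(sample_mean):.2f}")
```

Rounded to two decimal places, the results agree with the values of negative 1.46, 1.08, and 0.38 worked out above.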
If we look at the area under the graph, as indicated right here, we see the one-sided tail area. When we double this, as we do right here, we see the two-sided tail area. Since these areas are quite large, there is a strong likelihood that a sample mean of less than 8.45 would occur, as large areas such as this indicate a high probability of the mean being less than the selected value. In this case, the z-score of 0.38 translates to a probability of approximately 0.65 that the sample mean would be less than or equal to 8.45.
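The probability of about 0.65 can be reproduced without a z table. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` and the rounded z-score of 0.38 from the lesson. The variable names are illustrative.

```python
from statistics import NormalDist

z = 0.38  # rounded z-score for a sample mean of 8.45, from the lesson

# Area to the left of z: probability of a sample mean at or below 8.45.
prob_at_or_below = NormalDist().cdf(z)

# Area to the right of z, moving away from the mean, and its double.
one_sided_tail = 1.0 - prob_at_or_below
two_sided_tail = 2.0 * one_sided_tail

print(f"P(sample mean <= 8.45) is about {prob_at_or_below:.2f}")
print(f"one-sided tail area: {one_sided_tail:.2f}")
print(f"two-sided tail area: {two_sided_tail:.2f}")
```

All three areas are large, which is consistent with 8.45 being an unremarkable sample mean.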
Let's now consider an unusually high value of a size 8.8 and see how likely a sample mean size greater than that value is to occur. As you can see in the graph, this value falls on the far right side of the distribution. The z-score for this value is determined as follows-- 8.8 minus 8.4 divided by 0.13, which is approximately equal to 3.08.
This corresponds to a one-tailed area of 1 minus 0.999 or approximately 0.001. If we double the area to calculate the two-tailed area, we arrive at a probability of approximately 0.002. This is a very small area and therefore indicates that the mean would be very unlikely to be this large. In general, small areas mean that there is a low probability of a value occurring.
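The same tail area can also be found without standardizing first, by describing the distribution of sample means directly. This is a sketch under the assumption that the sample means follow a normal distribution with mean 8.4 and standard deviation 0.13, as in the example. It uses Python's standard-library `statistics.NormalDist`, and the variable names are illustrative.

```python
from statistics import NormalDist

# Assumed distribution of sample means: mean 8.4, standard deviation 0.13.
sampling_distribution = NormalDist(mu=8.4, sigma=0.13)

# One-tailed area: probability of a sample mean greater than 8.8.
one_tailed = 1.0 - sampling_distribution.cdf(8.8)

# Two-tailed area: double the one-tailed area.
two_tailed = 2.0 * one_tailed

print(f"one-tailed area: {one_tailed:.4f}")
print(f"two-tailed area: {two_tailed:.4f}")
```

Because this uses the unrounded z-score rather than 3.08, the results may differ slightly in the later decimal places from the table lookup, but they round to the same 0.001 and 0.002.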
By the end of this lesson, we wanted to be able to determine the likelihood of a mean occurring by using a sampling distribution, which we did. And we went through a couple of different examples to illustrate that point. So again, my name is Dan Laub. And hopefully, you got some value from this lesson.
(0:00 - 0:32) Introduction
(0:33 - 2:07) Random Sampling
(2:08 - 2:55) Example 1
(2:56 - 5:41) Determining Likelihood of Mean
(5:42 - 5:56) Conclusion
Table for looking up the area starting at z=0 for a positive z-score.