Source: All images created by Dan Laub
[MUSIC PLAYING] Hi. Dan Laub here. And in this lesson, we're going to discuss significance levels. But before we do so, let's cover the objective for this lesson. By the end of this lesson, you should be able to identify the significance level in a graph of a sampling distribution. So let's get started.
Remember from previous lessons that the benefit of taking random samples from a population is that it enables one to get a more representative estimate of what the population actually looks like. When one draws a lot of random samples, the result is having a sample mean for each sample, or a lot of sample means. In the event that the sample sizes are large enough, the sample means will be approximately normally distributed and centered on the mean of the population.
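To see this idea in action, here is a minimal simulation sketch. The population (10,000 uniform values between 0 and 100), the sample size of 50, and the 2,000 repetitions are all made-up assumptions for illustration, not figures from the lesson.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 values spread uniformly between 0 and 100.
population = [random.uniform(0, 100) for _ in range(10_000)]
population_mean = statistics.mean(population)

# Draw many random samples and record the mean of each one.
sample_size = 50        # assumed "large enough" sample size
number_of_samples = 2_000
sample_means = [
    statistics.mean(random.sample(population, sample_size))
    for _ in range(number_of_samples)
]

# The sample means should cluster around the population mean,
# with far less spread than the individual population values.
mean_of_sample_means = statistics.mean(sample_means)
spread_of_sample_means = statistics.stdev(sample_means)
spread_of_population = statistics.stdev(population)

print("population mean:       ", round(population_mean, 2))
print("mean of sample means:  ", round(mean_of_sample_means, 2))
print("spread of sample means:", round(spread_of_sample_means, 2))
print("spread of population:  ", round(spread_of_population, 2))
```

Even though the made-up population here is not normal, the sample means pile up around the population mean, which is the behavior the lesson relies on.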
When sampling data, it is helpful to know which sample means are very large and which are very small, which can be done by selecting a region on a normal distribution that falls far away from the mean in both directions. This area on the distribution is referred to as the region of rejection and represents that part of the distribution in which the results are not likely to be due to chance alone. The area of this region is known as the significance level, a level which is typically chosen to be 5%. Sample means which fall into this region of rejection are generally considered to be significantly different from the population mean and are unlikely to occur by chance as sample means from that population.
As you can see in this graph, the region of rejection lies to the left of about z equals negative 2 and to the right of about z equals 2. This is what is called a two-tailed area, as one tail lies to the far right of the distribution and the other to the far left, with each comprising 2.5% of the total area under the curve. More precisely, when considering a z-table, these 2.5% areas translate to z-scores of less than negative 1.96 and greater than 1.96. This 5% combined area of the region of rejection is the significance level of 5%.
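If you would rather compute the cutoff than look it up in a z-table, here is a short sketch using Python's standard library. The function name `two_tailed_critical_z` is just a label chosen for this illustration.

```python
from statistics import NormalDist

def two_tailed_critical_z(alpha: float = 0.05) -> float:
    """Return the positive z cutoff for a two-tailed region of rejection.

    Each tail holds alpha / 2 of the area, so the cutoff is the z-score
    with 1 - alpha / 2 of the standard normal curve to its left.
    """
    return NormalDist().inv_cdf(1 - alpha / 2)

z_crit = two_tailed_critical_z(0.05)

# Area in both tails combined, which should come back to the 5% level.
combined_tail_area = 2 * (1 - NormalDist().cdf(z_crit))

print(round(z_crit, 2))              # 1.96
print(round(combined_tail_area, 4))  # 0.05
```

The region of rejection is then everything below negative `z_crit` and everything above positive `z_crit`.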
Suppose we have an example of a normally distributed variable in which there are four observations that we are interested in looking at. If these four values translate to the z-scores you see here, how do we know which ones lie in the region of rejection? Well, the z-score of 2.81 would fall to the right of z equals 1.96 and would therefore fall in the region of rejection. However, the z-score of 1.54 would not fall to the right of z equals 1.96, and as a result, would not fall in the region of rejection. As for the other two z-scores, negative 2.05 would fall to the left of z equals negative 1.96, and as a result, would fall in the region of rejection, while negative 0.46 would not fall to the left of z equals negative 1.96 and would not fall in the region of rejection.
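The same comparison can be sketched in a few lines of code. The helper name `in_rejection_region` is made up for this illustration; the cutoff of 1.96 and the four z-scores are the ones from the example.

```python
Z_CRIT = 1.96  # two-tailed cutoff for a 5% significance level

def in_rejection_region(z: float, z_crit: float = Z_CRIT) -> bool:
    """Return True when z lies left of -z_crit or right of +z_crit."""
    return abs(z) > z_crit

for z in (2.81, 1.54, -2.05, -0.46):
    verdict = "in" if in_rejection_region(z) else "not in"
    print(f"z = {z}: {verdict} the region of rejection")
```

Taking the absolute value handles both tails at once, since a z-score is in the region of rejection whenever it is farther than 1.96 from zero in either direction.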
When a z-value lies in the region of rejection, it is understood that the corresponding sample mean varies significantly from the population mean. To continue with our example, z-scores of 1.90 and negative 1.23 would not fall in the region of rejection, which means that the sample means corresponding to these z-scores would not be rejected, because they do not vary significantly from the population mean.
So let's go back to our objective just to make sure we covered it. We wanted to be able to identify the significance level in a graph of a sampling distribution, which we did. Keep in mind that the z-scores of negative 1.96 and positive 1.96 are the points against which we compare the z-scores of the values we are actually looking at. So again, my name is Dan Laub, and hopefully, you got some value from this lesson.