This lesson will explain hypothesis testing when the population standard deviation (sigma) is known.
Source: Image created by Joseph Gearin
In this tutorial, you're going to learn how to perform a z-test for population means. I should note that this isn't done horribly often, because it requires that you know the population's standard deviation, but not the population mean. And that doesn't happen all that often. But let's take a look.
So according to their bags, the standard bag of M&Ms candies is 47.9 grams. And in fact, if you look down at the bottom, sure enough there it is. Suppose that we take 14 bags at random and weigh them, and these are the results. Assume that the distribution of bag weights is approximately normal and that the standard deviation of all M&M bags is 0.22 grams.
Is this sample evidence that bags don't contain the amount of candy that they say they do? Now, this could mean that it's either higher than they say or lower than they say. And if you take a look, some of these are fairly off. This is off by almost a full gram. So we're wondering if that's a big deal. And we also are assuming that we know the standard deviation of all M&M bags, which is not always a reasonable assumption, but we're going to go with it for this particular example.
So there are four parts to running any hypothesis test. This is regardless of the type of test that you use. The first step is stating the null and alternative hypotheses. Second, we should check the conditions necessary in order to actually perform the inference that we're trying to do. Third, we should calculate the test statistic, in this case, a z-statistic, and calculate the p-value based on the normal sampling distribution. And fourth, we should compare our test statistic to our chosen critical value or our p-value to our chosen significance level. Those are both OK approaches.
Then, based on how they compare, state a decision regarding the null hypothesis. So circle it back around to the null hypothesis and say, is the evidence consistent with the null hypothesis or does it refute it? So make a decision to either reject or fail to reject it based on your evidence. And it should also be in the context of the problem.
So step one-- for this problem, the null hypothesis is that the M&M bags are doing exactly what we thought they would do. The mean of all M&M bags is the 47.9 grams in weight that was claimed. The alternative hypothesis is that the mean is not that number. So this is going to be a two-sided test, based on this not-equal-to symbol. While we're here, we should also state what our alpha level, our significance level, is going to be.
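In symbols, the two hypotheses just described can be written as follows. The symbol \mu for the mean weight of all M&M bags is an assumption here, since the slide itself isn't reproduced in the transcript.

```latex
H_0: \mu = 47.9 \text{ grams} \qquad H_a: \mu \neq 47.9 \text{ grams}
```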
By stating that alpha equals 0.05, which is the most common significance level, we are saying if the p-value is less than 0.05, reject the null hypothesis. If it is above 0.05, we should fail to reject it. Second, let's look at the conditions necessary for inference on a population mean. So were the data collected in a random way, is one criterion. The second, is each observation independent of the other observations? And thirdly, is the sampling distribution that we're going to use approximately normal?
So I've broken it down by randomness, independence, and normality. The randomness, hopefully it says so somewhere in the problem. Think about the way the data was collected, and hopefully it should say so. Independence, we want to make sure that the population is at least 10 times as large as the sample size. This was our workaround for independence. If the population is sufficiently large then taking out the number of bags that we took doesn't make a huge difference.
And finally, normality, there are two ways to verify this, either the distribution of all bags has to be normal-- the parent distribution has to be normal-- or the central limit theorem is going to have to apply. And the central limit theorem says that for most distributions, when the sample size is large, and a common rule of thumb is at least 30, the sampling distribution will in fact be approximately normal.
So it says the bags were randomly selected in the problem. So thinking about the way that the data was collected in the problem is important. Assume there are at least 140 bags of M&Ms. That's a reasonable assumption. Why 140? Because there were 14 bags in our sample. So we're going to assume that the population of all bags of M&Ms is at least 10 times that size. And then finally, the distribution of bag weights is in fact approximately normal as stated in the problem.
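The three condition checks can be sketched as a small function. This is only an illustration: the function name and its parameters are made up for this example, and the population size of 140 is the minimum the lesson assumes, not a measured figure.

```python
def conditions_met(randomly_sampled, sample_size, population_size, parent_is_normal):
    """Check the three conditions for inference on a population mean."""
    # Independence workaround: population at least 10 times the sample size.
    independent = population_size >= 10 * sample_size
    # Normality: parent distribution is normal, or the sample is large
    # enough (rule of thumb: at least 30) for the central limit theorem.
    normal = parent_is_normal or sample_size >= 30
    return randomly_sampled and independent and normal

# Values from the lesson: 14 randomly selected bags, at least 140 bags assumed
# to exist, and bag weights stated to be approximately normal.
print(conditions_met(True, 14, 140, True))
```

With only 14 bags, the central limit theorem rule of thumb does not apply, so the check passes only because the problem states that bag weights are approximately normal.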
Part three, let's look at the test statistic. In this case, our test statistic is going to be a z-statistic. And we're going to do our sample mean minus the hypothesized population mean of 47.9 from the null hypothesis. And it's going to be over the standard error, which is the standard deviation of the population divided by the square root of the sample size. When we do all of that and punch in the numbers, we get a z-statistic of positive 2.89. And so we're going to look at where that lies on the normal distribution that we're using.
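Here is a minimal sketch of that calculation. The transcript doesn't list the 14 weights or the sample mean, so the value `example_mean` below is a hypothetical placeholder, chosen only so the arithmetic lands near the reported z-statistic. The function names are also made up for illustration.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sample mean when sigma is known."""
    return sigma / math.sqrt(n)

def z_statistic(sample_mean, mu_0, sigma, n):
    """z = (sample mean - hypothesized mean) / standard error."""
    return (sample_mean - mu_0) / standard_error(sigma, n)

# Values from the lesson: sigma = 0.22 grams, n = 14 bags, mu_0 = 47.9 grams.
se = standard_error(0.22, 14)

# HYPOTHETICAL sample mean: not given in the lesson, for illustration only.
example_mean = 48.07
z = z_statistic(example_mean, 47.9, 0.22, 14)
print(f"standard error = {se:.4f}, z = {z:.2f}")
```

The standard error comes out to roughly 0.059 grams, so even a sample mean a fraction of a gram away from 47.9 sits several standard errors out.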
A z-statistic of 2.89 here on the standard normal distribution centered at 0 is right here between two and three standard deviations above the mean. Because this is a two-sided test, we're going to find the probability that our z-statistic is above positive 2.89 and the probability that it's below negative 2.89. The probability in one tail, when doubled, gives us 0.0038. We can also find this probability using technology.
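As one way of finding this probability using technology, here is a short sketch with `NormalDist` from Python's standard library. The function name is made up for this example. The result should come out close to the 0.0038 quoted above, and any small difference comes from rounding in a printed z-table.

```python
from statistics import NormalDist

def two_sided_p_value(z):
    """Probability of a z-statistic at least this far from 0 in either tail."""
    upper_tail = 1 - NormalDist().cdf(abs(z))  # area above |z|
    return 2 * upper_tail                      # double it for both tails

p_value = two_sided_p_value(2.89)
print(f"p-value = {p_value:.4f}")
```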
Now let's finally compare our test statistic to our critical value or our p-value to our significance level. We're going to compare our p-value of 0.0038 to our significance level of 0.05. Notice, this actually contains three parts-- the comparison, the decision, and the conclusion. So since our p-value of 0.0038 is less than 0.05-- there's our comparison-- our decision is we reject the null hypothesis. And there is evidence to conclude that the M&M bags are not filled to a mean of 47.9 grams. There's our conclusion. So we have all three parts that we need to finish the problem.
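Both comparison approaches can be sketched side by side. The function name and the dictionary it returns are made up for this example. For a two-sided test, the two approaches always lead to the same decision.

```python
from statistics import NormalDist

def decide(z, alpha=0.05):
    """Two-sided z-test decision by both the critical value and the p-value."""
    std_normal = NormalDist()
    # Critical value: the z that leaves alpha / 2 in each tail.
    critical_value = std_normal.inv_cdf(1 - alpha / 2)
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    return {
        "critical_value": critical_value,
        "p_value": p_value,
        "reject_by_critical_value": abs(z) > critical_value,
        "reject_by_p_value": p_value < alpha,
    }

result = decide(2.89, alpha=0.05)
decision = "reject" if result["reject_by_p_value"] else "fail to reject"
print(f"Decision: {decision} the null hypothesis that the mean is 47.9 grams")
```

At alpha equals 0.05, the critical value is about 1.96, and 2.89 is beyond it, which matches the p-value comparison.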
And so to recap, the steps in any hypothesis test, not just a z-test for population means, but any hypothesis test, are the same. First, state the null and alternative hypotheses both in symbols and in words. Second, state and verify the conditions necessary for inference. Third, calculate the test statistic from the statistics that you have and calculate its p-value. And finally, compare your p-value to the alpha level that you've chosen or your test statistic to the critical value, and make a decision about the null hypothesis. And state your conclusion in the context of the problem.
In a z-test for population means, the population standard deviation must be known. And that's not very common. And we'll have other ways to deal with it when we don't know the population standard deviation. And the test statistic is a z. And so this has all been about the z-test for population means. Good luck, and we'll see you next time.
Z-test for population means: A hypothesis test that compares a hypothesized mean from the null hypothesis to a sample mean, when the population standard deviation is known.