Source: Shopping Cart; Creative Commons: http://www.jwmphotography.com/index.html Tables and graphs created by Jonathan Osters
In this tutorial, you're going to learn about a Hypothesis Test for Population Proportions. It's pretty analogous to any other hypothesis test that we've done. So let's take a look at a situation that would require proportions.
A popular consumer report reported that 80% of all supermarket prices end in the digits 9 or 5. So suppose you check a random sample of 115 items to just check it against the consumer report, and you find that only 88 end in 9 or 5. That's less than 80%. The question is, is that significantly less than 80%? Is this evidence that, in fact, less than 80% of all items at the supermarket have a price ending in 9 or 5?
By now you should be familiar with the process. When running a hypothesis test, same four parts every time-- null and alternative hypotheses; conditions; the test statistic and the p-value; and based on that p-value, state a decision about the null hypothesis and conclusion in the context of the problem. So in this problem, what are the null and alternative hypotheses?
Well, our null hypothesis is the "nothing's going on" hypothesis. In this case, it's the "we have no reason to disbelieve the consumer report" hypothesis, whereas the alternative hypothesis suspects that something's up. So we're going to write the null hypothesis as p equals 0.8: the true proportion of all prices ending in 9 or 5 is 80% at the supermarket. The alternative hypothesis is going to say that p, the true proportion of prices ending in 9 or 5, is below 80%.
In this problem, we'll choose a significance level of 0.10. The decision rule it carries is that if the p-value is less than 0.10, we're going to reject the null hypothesis in favor of the alternative.
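Written out in symbols, the setup described so far would look something like this. The notation H_0, H_a, and alpha is the usual textbook convention and is assumed here, since the tutorial only states the hypotheses in words.

```latex
H_0:\; p = 0.80 \qquad\qquad H_a:\; p < 0.80 \qquad\qquad \alpha = 0.10
```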
Part 2 is checking the conditions. So by now, you should be familiar with the conditions-- randomness, independence, and normality. We'll look at them one at a time.
Randomness. How was the sample obtained? For independence, we want to make sure that the population is at least 10 times the size of the sample because we're sampling without replacement. And for normality, this is where it's a little different.
Because we're using the sampling distribution of p-hat instead of x-bar, there are different conditions for normality. So we need to use the conditions np is at least 10 and nq is at least 10, where q is 1 minus p. We can't use the central limit theorem here because this is not the sampling distribution of x-bar. It's the sampling distribution of p-hat, sample proportions. Let's check these.
In the problem, it does say that the items were randomly selected, so the simple random sample condition is OK. We have to assume the independence piece. We have to assume that the population of all items at the grocery store is at least 1,150. That seems reasonable.
For normality, we know what n is, and we know what p is. p is the value from the null hypothesis. It's the 80% that we're believing is the center of the distribution. And n was the sample size, 115.
So we can multiply all that out and get 92 for n times p. That's greater than 10. And 23 for n times q, that's also greater than 10. And so the sampling distribution of sample proportions is going to be approximately normal. All three conditions have been checked, and we're good to go.
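The arithmetic behind the independence and normality checks can be sketched in a few lines of Python. This is only an illustration using the numbers from the example; the variable names are made up for this sketch, and the randomness condition still has to be judged from how the sample was collected.

```python
# Values taken from the supermarket example in this tutorial.
n = 115        # sample size
p0 = 0.80      # hypothesized proportion from the null hypothesis
q0 = 1 - p0    # complement of the hypothesized proportion

expected_successes = n * p0   # n times p
expected_failures = n * q0    # n times q

# Normality condition for p-hat: both expected counts are at least 10.
normality_ok = expected_successes >= 10 and expected_failures >= 10

# Independence condition: the population should be at least 10 times the sample.
min_population = 10 * n

print(f"n*p = {expected_successes:.0f}, n*q = {expected_failures:.0f}")
print(f"normality condition met: {normality_ok}")
print(f"population should be at least {min_population} items")
```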
Now that we've done that, we can calculate the test statistic. It's going to be statistic minus hypothesized parameter over standard error. And so the hypothesized center here of our sampling distribution was the 0.8. Our sample statistic was the 88 out of 115. Now we're just going to put all the numbers that we need into the equation.
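For reference, the equation being described is the standard one-proportion z statistic. The symbols below (p-hat for the sample proportion, p_0 for the hypothesized proportion) are assumed notation, since the on-screen formula is not part of this transcript.

```latex
z = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0\,(1 - p_0)}{n}}}
  = \frac{\dfrac{88}{115} - 0.80}{\sqrt{\dfrac{0.80\,(1 - 0.80)}{115}}}
```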
So that's 88 out of 115, minus 0.80, on top. And then 0.80 times one minus 0.80, over 115, under the square root down here. When we evaluate this fraction, we get negative 0.93, which we can then find on the normal distribution that is the sampling distribution for p-hat, and we can find the tail probability. So that's what we're doing here.
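As a rough check on that calculation, here is the same computation in Python, using only the standard library. The variable names are chosen for this sketch and are not from the tutorial.

```python
import math

successes = 88   # sampled prices ending in 9 or 5
n = 115          # sample size
p0 = 0.80        # hypothesized proportion under the null hypothesis

p_hat = successes / n                           # sample proportion
standard_error = math.sqrt(p0 * (1 - p0) / n)   # standard error, computed with p0
z = (p_hat - p0) / standard_error               # statistic minus parameter, over SE

print(f"p-hat = {p_hat:.4f}")
print(f"z = {z:.2f}")
```

Note that the standard error uses the hypothesized 0.80 rather than the sample proportion, because the test assumes the null hypothesis is true.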
The probability that our sample proportion would be less than the one that we got-- this is the 88 out of 115 number-- is the same as the probability that the z-statistic would be less than negative 0.93. We can find that area using the normal table, and we get 0.1762, or about 18%. This means that if the null hypothesis were true and this distribution were really centered at 0.8, meaning the true proportion of prices ending in 9 or 5 was 0.8, we would find something at least as low as we got about 18% of the time.
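The lower-tail area for a z-statistic can be looked up with Python's standard library instead of a printed table. This is one possible approach, assuming Python 3.8 or later, where `statistics.NormalDist` is available.

```python
from statistics import NormalDist

z = -0.93   # test statistic, rounded to two decimals as in a normal table

# Area to the left of z under the standard normal curve (mean 0, sd 1).
p_value = NormalDist().cdf(z)

print(f"P(Z < {z}) = {p_value:.4f}")
```

Because the alternative hypothesis is "less than," only the lower tail is used. A two-sided alternative would need a different calculation.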
We can also find this probability using technology. So now based on how our p-value compares to our chosen significance level, which you may recall was 0.10, we're going to make a decision about the null hypothesis and state the conclusion. Again, that's three parts, and we need all of them.
So in our case, 0.1762 is greater than 0.10. Our decision then is we fail to reject the null hypothesis. And our conclusion is that there's not sufficient evidence to conclude that less than 80% of supermarket prices end in 9 or 5. We don't have strong enough evidence to reject the claim of the consumer report.
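Putting the pieces together, the comparison against the significance level might be sketched like this. The function name `lower_tail_proportion_test` is hypothetical and written just for this illustration; it is not from any library.

```python
import math
from statistics import NormalDist


def lower_tail_proportion_test(successes, n, p0, alpha):
    """Hypothetical helper: one-sided (lower-tail) z-test for a proportion."""
    p_hat = successes / n
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_value = NormalDist().cdf(z)   # area below z on the standard normal
    reject = p_value < alpha        # decision rule: reject when p-value < alpha
    return z, p_value, reject


z, p_value, reject = lower_tail_proportion_test(88, 115, 0.80, 0.10)
decision = "reject" if reject else "fail to reject"
print(f"z = {z:.2f}, p-value = {p_value:.4f}")
print(f"Decision: {decision} the null hypothesis")
```

The p-value from this sketch comes out slightly below the table value of 0.1762 because z is not rounded to two decimals first. The decision is the same either way.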
And so to recap, the steps in any hypothesis test, they're always the same. Null and alternative hypotheses, this is where you would also state your alpha level. Then state and verify the conditions. Calculate the test statistic and the p-value. And then finally, based on your p-value, compare it to your alpha level and make a decision about the null hypothesis and state it in the context of the problem.
In this case, we did a z-test for population proportions, and it's analogous to any other hypothesis test that we do. The only thing that we switched up was how we verified the normality condition. We needed np to be at least 10 and nq to be at least 10. We couldn't use the central limit theorem because we weren't talking about means anymore.
Good luck, and we'll see you next time.
A type of hypothesis test used to test an assumed population proportion.
A hypothesis test where we compare to see if the sample proportion of "successes" differs significantly from a hypothesized value that we believe is the population proportion of "successes."