A confidence interval for population proportions is very similar to a confidence interval for population means. In general, a confidence interval is an estimate found by using a sample statistic and adding and subtracting an amount corresponding to how confident we are that the interval created captures the population parameter.
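For a confidence interval for a population proportion, the statistic is the sample proportion (p-hat) and the population parameter is the population proportion (p). The following formula can be used to calculate the confidence interval, where q-hat is 1 minus p-hat, n is the sample size, and z* is the critical value for the chosen confidence level:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}\,\hat{q}}{n}}$$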
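To construct a confidence interval for a population proportion, follow three steps: verify the conditions necessary for inference, calculate the confidence interval, and interpret it. The example below walks through each step.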
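EXAMPLE: Obecalp is a popular prescription drug, but it is thought to cause headaches as a side effect. In a random sample of 206 patients taking Obecalp, 23 experienced headaches. Construct a 95% confidence interval for the proportion of all Obecalp users who experience headaches.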
Step 1: Verify the conditions necessary for inference.
Stating the conditions isn't enough, and it's not just a formality--you must verify them. Recall the conditions needed:
|Condition|Requirement|
|---|---|
|Randomness|How was the sample obtained?|
|Independence|Population ≥ 10n|
|Normality|np ≥ 10 and nq ≥ 10|
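The checks in the table can be sketched in a few lines of Python for the Obecalp sample. This is only an illustration: the variable names are made up for the example, the observed counts stand in for np and nq because the true proportion is unknown, and whether at least 2,060 people actually take Obecalp is an assumption the reader has to judge (the example only says the drug is popular).

```python
# Check the three conditions for the Obecalp sample (206 patients, 23 with headaches).
n = 206
headaches = 23

# Randomness: the problem states this was a random sample, so there is nothing to compute.
random_sample = True

# Independence: the population should be at least 10 times the sample size.
min_population = 10 * n

# Normality: n * p-hat is just the observed number of successes,
# and n * q-hat is the observed number of failures.
successes = headaches
failures = n - headaches
normality_ok = successes >= 10 and failures >= 10

print(f"Population must be at least {min_population} patients")
print(f"successes = {successes}, failures = {failures}, normality ok: {normality_ok}")
```

With 23 successes and 183 failures, both counts are well above 10, so the normality condition holds for this sample.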
Step 2: Calculate the confidence interval.
To do this, we take the point estimate, p-hat, plus or minus the z* critical value times the standard error of p-hat, which is the square root of p-hat times q-hat over n. Because the population proportion is not known, p-hat is used in its place when computing the standard error.
First, let's find the corresponding z* critical value for a 95% confidence interval by using a z-table. For a confidence interval, we can follow the same steps as for a two-sided test. A 95% confidence level leaves 5% of the distribution outside the interval, which corresponds to a 5% significance level. That 5% is split between two tails, the lower and upper parts of the distribution, so each tail contains 2.5%.
We can use the upper limit to find the critical z-score. The total area under the distribution is 100%, so the area below the upper limit is 1 minus 0.025, which gives us 0.975. Now, we can look up 0.975 in a z-table.
[Table: Standard Normal Distribution]
In a z-table, the value 0.975 corresponds with a 1.9 in the left column and 0.06 in the top row. This tells us that the z-score is 1.96.
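The same lookup can be done in software instead of a printed table. The sketch below uses `NormalDist().inv_cdf` from Python's standard `statistics` module, which returns the z-score with a given area to its left; the variable names are chosen only for this illustration.

```python
from statistics import NormalDist

confidence = 0.95
tail_area = (1 - confidence) / 2   # 0.025 in each tail
upper_limit = 1 - tail_area        # 0.975 of the area lies below z*

# inv_cdf gives the z-score that has the requested area to its left.
z_star = NormalDist().inv_cdf(upper_limit)

print(round(z_star, 2))  # 1.96
```

Changing `confidence` to another level, such as 0.90 or 0.99, gives the critical value for that level in the same way.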
Another way is to use a t-table, which you will learn more about in a later lesson but is available to view at the end of this tutorial. We don't use the t-distribution for proportions; however, we can use the last row of this table to find z* critical values. The critical values for each z confidence level are found in the last row of the t-table, labeled with the infinity symbol or ">1000", because the normal distribution is essentially the t-distribution with infinite degrees of freedom. Looking in this row under the 95% confidence level gives the same z* critical value of 1.96 that we found before.
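Now that we have the corresponding z* critical value, we need p-hat, which is 23 out of 206; q-hat, which is the complement of p-hat, or 183 out of 206; and the sample size, n, which is 206. We put all of this information into the formula:
$$\hat{p} \pm z^* \sqrt{\frac{\hat{p}\,\hat{q}}{n}} = \frac{23}{206} \pm 1.96 \sqrt{\frac{\left(\frac{23}{206}\right)\left(\frac{183}{206}\right)}{206}}$$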
From this formula, we obtain 0.112, which was our p-hat, plus or minus 0.043, which is the margin of error. When we evaluate the interval, it's going to be from 0.069 all the way up to 0.155.
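To double-check the arithmetic, the calculation can be reproduced in Python. This is a minimal sketch: `proportion_ci` is a helper name invented for this illustration, not a library function, and it uses the exact critical value from `NormalDist().inv_cdf` rather than the rounded 1.96, which makes no difference at three decimal places here.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes, n, confidence=0.95):
    """Return (p_hat, margin, lower, upper) for a one-proportion z-interval."""
    p_hat = successes / n
    q_hat = 1 - p_hat
    # Critical value with (1 - confidence) / 2 of the area in each tail.
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error uses p-hat because the true proportion is unknown.
    standard_error = sqrt(p_hat * q_hat / n)
    margin = z_star * standard_error
    return p_hat, margin, p_hat - margin, p_hat + margin


p_hat, margin, lower, upper = proportion_ci(23, 206)
print(f"{p_hat:.3f} +/- {margin:.3f} -> ({lower:.3f}, {upper:.3f})")
# 0.112 +/- 0.043 -> (0.069, 0.155)
```

The printed values match the point estimate, margin of error, and interval endpoints obtained by hand.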
Step 3: Interpret the confidence interval.
The confidence interval of 0.069 to 0.155 means we are 95% confident that the true proportion of all Obecalp users who would experience headaches is between 6.9% and 15.5%. In other words, this is the proportion we would expect to see if everyone taking Obecalp had been in the study. We don't know exactly where in that range the true proportion falls, but it is probably somewhere within it.
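One way to see what "95% confident" describes is to simulate the procedure many times. The sketch below is an illustration only: it assumes a hypothetical true proportion of 0.11 (in practice the true value is unknown), draws many random samples of 206, builds an interval from each, and counts how often the interval captures the assumed true value.

```python
import random
from math import sqrt
from statistics import NormalDist

random.seed(1)  # fixed seed so the run is repeatable

TRUE_P = 0.11   # hypothetical true proportion, chosen only for this demo
N = 206         # same sample size as the example
TRIALS = 2000   # number of simulated samples
Z_STAR = NormalDist().inv_cdf(0.975)

covered = 0
for _ in range(TRIALS):
    # Simulate one sample: each patient has a headache with probability TRUE_P.
    successes = sum(random.random() < TRUE_P for _ in range(N))
    p_hat = successes / N
    margin = Z_STAR * sqrt(p_hat * (1 - p_hat) / N)
    # Count the sample if its interval contains the assumed true proportion.
    if p_hat - margin <= TRUE_P <= p_hat + margin:
        covered += 1

coverage = covered / TRIALS
print(f"{coverage:.1%} of the intervals captured the true proportion")
```

The share of intervals that capture the true value should land near 95%, though not exactly on it, because the normal approximation is only approximate and the simulation itself is random.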
Source: Adapted from Sophia tutorial by Jonathan Osters.