In this lesson, students will learn about the concept of confidence intervals in statistics.
When estimating a quantity that describes a population, such as its mean, you can use a single value as the estimate. Such a value is referred to as a point estimate. You can also provide a range of values as the estimate. This range is referred to as an interval estimate. If you could be 100% certain that the quantity lies in this range, you could say the range is a 100% confidence interval.
However, achieving a 100% confidence interval is generally impossible without measuring the entire population. In practice, you can only provide confidence intervals with levels of confidence lower than 100%. A 95% confidence interval is most commonly used. Informally, it means that you are 95% confident that the quantity you are estimating falls within this range. More precisely, if the sampling were repeated many times and a 95% confidence interval were calculated from each sample, about 95% of those intervals, or roughly 19 out of every 20 on average, would contain the true population value.
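The repeated-sampling interpretation can be checked with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with a hypothetical mean of 4.6 and a hypothetical standard deviation of 1.5, the standard deviation is treated as known, and the margin of error uses the standard normal value 1.96 that corresponds to 95% confidence. The function name coverage_rate and all of its parameters are made up for this example.

```python
import math
import random
import statistics


def coverage_rate(num_experiments=2000, sample_size=50,
                  pop_mean=4.6, pop_sd=1.5, seed=0):
    """Return the fraction of simulated 95% intervals that contain pop_mean.

    All default values are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Margin of error for a mean when the population sd is assumed known.
    margin = 1.96 * pop_sd / math.sqrt(sample_size)
    hits = 0
    for _ in range(num_experiments):
        # Draw one random sample and build an interval around its mean.
        sample = [rng.gauss(pop_mean, pop_sd) for _ in range(sample_size)]
        sample_mean = statistics.fmean(sample)
        if sample_mean - margin <= pop_mean <= sample_mean + margin:
            hits += 1
    return hits / num_experiments


rate = coverage_rate()
print(f"Share of intervals containing the population mean: {rate:.3f}")
```

The printed share should land close to 0.95, though not exactly on it, because any finite number of repetitions only approximates the long-run rate.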
Recall from previous lessons that a population is the entire group of observations of interest, while a sample is a smaller group of observations drawn from that population. It is desirable to select the sample randomly so that it is a reasonably representative portion of the population. This allows one to make more accurate predictions about the population as a whole.
One uses the mean of a data set to get an idea of the center of the data, regardless of whether the data set is a population or a sample. Typically, the mean of a randomly drawn sample will not be the same as the population mean. It is helpful to use a 95% confidence interval to estimate this population mean instead. Such a 95% confidence interval is equal to a sample mean plus or minus a margin of error.
An interval estimate is built around a single value. To construct one, you first consider a point estimate. Then you decide how far above and how far below that value the range should extend.
What if you were interested in estimating the mean hours of television that children watch per day? How would you go about estimating such a value by using sample data? You can estimate this by using a point estimate, which let's assume in this case is 4.6 hours per day. You could also use a range to estimate this value, or an interval estimate, which may be something along the lines of 3.8 hours per day to 5.4 hours per day.
The difference between the point estimate and the end values in the interval estimate helps you determine the margin of error you are working with. You could express the interval estimate as 4.6 hours per day, plus or minus 0.8 hours per day.
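The relationship between the interval's end values, the point estimate, and the margin of error can be sketched in a few lines of Python. This assumes the interval is symmetric around the point estimate, as it is in the television example; the function name interval_to_estimate is hypothetical.

```python
def interval_to_estimate(lower, upper):
    """Return (point_estimate, margin_of_error) for a symmetric interval."""
    point_estimate = (lower + upper) / 2   # midpoint of the range
    margin_of_error = (upper - lower) / 2  # half the width of the range
    return point_estimate, margin_of_error


# Television example from the lesson: 3.8 to 5.4 hours per day.
tv_point, tv_margin = interval_to_estimate(3.8, 5.4)
print(f"{tv_point:.1f} hours per day, plus or minus {tv_margin:.1f}")
# prints "4.6 hours per day, plus or minus 0.8"
```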
Consider estimating how much Americans spend on dining out per week. While you are interested in the overall population, you realize that asking everyone to detail their dining expenses is impossible. So you opt instead to select a random sample.
Suppose that you took such a sample of 500 people and found that the sample mean spent on dining out was $77.48 per week. This would be considered a point estimate for the mean of the overall population. Now suppose that there is a margin of error of $10.21.
The resulting 95% confidence interval is $77.48 plus or minus $10.21, or $67.27 to $87.69 per week. This interval provides you with a likely range for the population mean: you would be 95% confident that the actual mean of the population falls within it. If you were to continue repeating this experiment and calculating a 95% confidence interval each time, about 95% of those intervals would contain the actual population mean.
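The arithmetic behind the dining example can be written out as a short Python sketch. It simply adds and subtracts the margin of error given in the lesson; it does not show how that margin was calculated from the sample. The function name confidence_interval is hypothetical.

```python
def confidence_interval(point_estimate, margin_of_error):
    """Return (lower, upper) as point estimate minus/plus the margin."""
    lower = point_estimate - margin_of_error
    upper = point_estimate + margin_of_error
    return lower, upper


# Dining example from the lesson: mean of $77.48, margin of error of $10.21.
dining_lower, dining_upper = confidence_interval(77.48, 10.21)
print(f"${dining_lower:.2f} to ${dining_upper:.2f} per week")
# prints "$67.27 to $87.69 per week"
```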
Source: This work is adapted from Sophia author Dan Laub.