Recall from previous lessons that a population is the entire pool of observations, while a sample is a smaller group of observations drawn from that population. It is desirable to select the sample randomly so that it is reasonably representative of the population, which allows you to make more accurate predictions about the population as a whole.
The mean of a data set gives an idea of the center of the data, regardless of whether the data set is a population or a sample. The likely range for a population mean is typically reported as a 95% confidence interval. This means that if many random samples are drawn from the population and a confidence interval is computed from each one, approximately 95% of those intervals will contain the actual population mean. To put it another way, if you were to conduct 20 experiments, you would expect about 19 of the confidence intervals you obtain to contain the actual population mean. It is important to note that the population mean is a fixed value that does not change, and that you are simply trying to estimate it by using random samples.
Suppose you were to take a sample of 100 adults and measure their heights in an attempt to estimate the mean height of all adults. For this scenario, the sample mean height of the 100 adults is 67.5 inches, with a 95% confidence interval of approximately 59.1 inches to 75.9 inches.
Now suppose you drew another sample of 100 adults, and it had a sample mean of 68.3 inches with a 95% confidence interval of 59.9 inches to 76.7 inches. If you continued drawing random samples of 100 adults, you would most likely arrive at different sample means and 95% confidence intervals for each instance.
Maybe your third sample has a sample mean of 67.9 inches and a 95% confidence interval of 59.5 inches to 76.3 inches, and a fourth sample is drawn with a sample mean of 66.1 inches and a 95% confidence interval of 57.7 inches to 74.5 inches.
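The idea of drawing sample after sample can be tried out in a short simulation. The following is only a sketch under stated assumptions: the population is hypothetical (normally distributed heights with a mean of 68 inches and a standard deviation of 4 inches, values made up for illustration), and each interval uses the common normal approximation of the sample mean plus or minus 1.96 standard errors. Because the population is invented, the intervals it produces will not match the figures quoted above. The function names `confidence_interval_95` and `coverage_rate` are illustrative and do not come from the lesson.

```python
import random
import statistics


def confidence_interval_95(sample):
    """Return (low, high), an approximate 95% confidence interval for the mean."""
    n = len(sample)
    mean = statistics.fmean(sample)
    # Standard error of the mean: sample standard deviation divided by sqrt(n).
    standard_error = statistics.stdev(sample) / n ** 0.5
    # 1.96 is the normal-approximation multiplier for 95% confidence.
    margin = 1.96 * standard_error
    return mean - margin, mean + margin


def coverage_rate(true_mean, true_sd, sample_size, num_samples, seed=0):
    """Draw many random samples and return the share of intervals
    that contain the (fixed) population mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(num_samples):
        sample = [rng.gauss(true_mean, true_sd) for _ in range(sample_size)]
        low, high = confidence_interval_95(sample)
        if low <= true_mean <= high:
            hits += 1
    return hits / num_samples


# Hypothetical population: mean 68 inches, standard deviation 4 inches.
rate = coverage_rate(true_mean=68.0, true_sd=4.0, sample_size=100, num_samples=2000)
print(f"Share of intervals containing the population mean: {rate:.3f}")
```

Each simulated sample gives a slightly different mean and interval, while the population mean stays fixed. The printed share should land close to 0.95, though the exact value depends on the random seed.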
Here’s another example. Suppose you were interested in determining the mean number of cups of coffee that Americans drink per day, and you drew multiple random samples of 250 people in an effort to estimate this mean. Let’s suppose that the first sample has a sample mean of 2.1 cups per day with a 95% confidence interval of 1.8 cups to 2.4 cups per day.
A second sample of 250 people yields a sample mean of 1.8 cups per day and a 95% confidence interval of 1.5 cups to 2.1 cups per day. If you repeat this process to the point of drawing 60 samples, you would expect the actual population mean to fall within the 95% confidence intervals of about 95% of those samples, or 57 of them.
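The arithmetic behind "19 of 20" and "57 of 60" is simply the number of intervals multiplied by the confidence level. The sketch below shows that calculation. It also reports how much the actual count might typically vary, assuming the samples are independent so that the count follows a binomial distribution. The helper names `expected_covering` and `typical_spread` are illustrative and not part of the lesson.

```python
import math


def expected_covering(num_intervals, confidence_level=0.95):
    """Expected number of intervals that contain the population mean."""
    return num_intervals * confidence_level


def typical_spread(num_intervals, confidence_level=0.95):
    """Binomial standard deviation of that count (assumes independent samples)."""
    p = confidence_level
    return math.sqrt(num_intervals * p * (1 - p))


for n in (20, 60):
    print(
        f"{n} intervals: expect about {expected_covering(n):.0f} "
        f"to contain the mean, give or take {typical_spread(n):.1f}"
    )
```

The expected count is a long-run average rather than a guarantee, so in any particular set of 60 samples the number of intervals containing the population mean could be a little above or below 57.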
Source: This work is adapted from Sophia author Dan Laub.