When designing an experiment, it is often impossible to measure or gather data on every member of a group, called the population. Instead, we study a representative sample of the population and use this information to make inferences about the population.
Political polls are a common example of this strategy. Rather than asking every voter their opinion on a topic, researchers might ask 1000 random voters, and then use these results to infer the preferences of the population.
Remember, a population consists of all individuals who share a certain characteristic. A sample consists of individuals who are a part, or subset, of that larger population.
In the world of statistics, there are specific terms used when speaking about data for a population and samples. A parameter is a measurable quantity or characteristic of the population.
IN CONTEXT
Say you wanted to estimate the average value of houses across the entire United States. The population parameter, the piece of information about the population you’re interested in, is the average value of homes across the entire country.
Because it’s generally very difficult to obtain information about every individual in a population, the exact values of parameters are usually unknown. There are, however, some exceptions, and the United States census is one of them. The census is required by law to count the population of the United States every 10 years. The most recent census in the United States was conducted in 2020.
IN CONTEXT
What if you were interested in the total number of children under the age of 18 in the year 2020? According to the U.S. Census Bureau, there is actually a number for that, because everyone was counted.
The number that the bureau reports, according to its website, is 73,106,000. That is the actual number of children under the age of 18 in the year 2020. This result is a parameter because it comes from a count of the entire population.
Unlike the U.S. census, you likely won't have the resources or time to gather data on the entire population. What if a researcher only has data from a sample? Any measurable quantity or characteristic related to the sample is called a statistic.
IN CONTEXT
If a pollster contacts registered voters to ask how they plan on voting in an upcoming election, the results of the survey are statistics. The value of a statistic can be significantly different from that of the parameter it estimates, depending on how the sample was obtained and how many observations are in the sample.
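To see how the way a sample is obtained can affect a statistic, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the electorate, the two regions, the support rates of 65% and 40%, and the helper name support_rate are all invented, not real polling data.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is repeatable

# Hypothetical electorate of (region, supports_measure) pairs.
# The 65% and 40% support rates are invented for this example.
population = (
    [("urban", rng.random() < 0.65) for _ in range(30_000)]
    + [("rural", rng.random() < 0.40) for _ in range(30_000)]
)

def support_rate(voters):
    """Fraction of the given voters who support the measure."""
    return sum(supports for _, supports in voters) / len(voters)

# Parameter: computed from every member of the population.
parameter = support_rate(population)

# Statistic 1: a random sample drawn from the whole population.
random_sample = rng.sample(population, 1000)

# Statistic 2: a sample of the same size drawn from one region only.
urban_only = [voter for voter in population if voter[0] == "urban"]
one_region_sample = rng.sample(urban_only, 1000)

print(f"Parameter (all voters):        {parameter:.3f}")
print(f"Statistic (random sample):     {support_rate(random_sample):.3f}")
print(f"Statistic (one-region sample): {support_rate(one_region_sample):.3f}")
```

Both samples contain 1000 voters, but in this made-up population the one-region sample should land much farther from the parameter, because it leaves out a group whose opinions differ.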
A larger sample will generally give better information about a population than a smaller sample, assuming that both samples were collected in the same way. The larger the sample is, the more likely it is that a statistic calculated from it will be close to the actual population parameter.
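The effect of sample size can be checked with a small simulation. The sketch below assumes an invented population of 20,000 values and a hypothetical helper, average_error, that draws many random samples of a given size and reports how far the sample mean falls from the population mean on average.

```python
import random

def average_error(population, sample_size, trials=200, seed=1):
    """Average distance between the sample mean and the population mean
    across repeated simple random samples of the given size."""
    rng = random.Random(seed)
    parameter = sum(population) / len(population)
    total_error = 0.0
    for _ in range(trials):
        sample = rng.sample(population, sample_size)
        statistic = sum(sample) / sample_size
        total_error += abs(statistic - parameter)
    return total_error / trials

# Hypothetical population: 20,000 values between 0 and 40 (invented data).
data_rng = random.Random(42)
population = [data_rng.uniform(0, 40) for _ in range(20_000)]

for n in (10, 100, 1000):
    print(f"sample size {n:>4}: average error {average_error(population, n):.3f}")
```

Every sample here is drawn the same way, so the only thing that changes is the number of observations. The average error should shrink as the sample size grows.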
IN CONTEXT
Say you were interested in determining the average number of hours of television that American adults watch per week. Now, that’s a situation where the parameter would be the average for every adult in the entire United States. However, that’s not necessarily something we can measure.
Instead, you decide to survey a representative sample of the population, and you ask people how many hours of television they watch per week. The population parameter would be the average for every adult in the United States. The sample statistic would be the average of your sample.
To get the statistic from above, you could ask 100 randomly selected American adults how many hours of television they watch per week and average their answers. This sample statistic would likely be close, but not identical, to the population parameter. This is because a sample is rarely perfectly representative of the population as a whole.
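The television example can be sketched in Python. The population below is simulated, since the real parameter is unknown: the 50,000 adults, the average of about 20 hours, and the spread of 8 hours are all assumptions chosen only to make the example run.

```python
import random
import statistics

rng = random.Random(2024)  # fixed seed so the sketch is repeatable

# Hypothetical population: weekly TV hours for 50,000 adults (invented data).
# Negative draws are clipped to 0 because hours cannot be negative.
population_hours = [max(0.0, rng.gauss(20, 8)) for _ in range(50_000)]

# Parameter: the average over the entire population.
parameter = statistics.fmean(population_hours)

# Statistic: the average over a random sample of 100 adults.
sample_hours = rng.sample(population_hours, 100)
statistic = statistics.fmean(sample_hours)

print(f"Population parameter: {parameter:.2f} hours per week")
print(f"Sample statistic:     {statistic:.2f} hours per week")
print(f"Difference:           {abs(statistic - parameter):.2f} hours")
```

In practice only the sample average would be available. The simulation computes both numbers so that the gap between the statistic and the parameter is visible.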
Source: THIS TUTORIAL WAS AUTHORED BY DAN LAUB FOR SOPHIA LEARNING. PLEASE SEE OUR TERMS OF USE.