This tutorial will introduce multi-stage sampling.
Multi-stage sampling is a very common sampling procedure when the population is very large.
Suppose that you wanted to sample from the entire United States as a whole.
What would a Simple Random Sample (SRS) look like?
You'd have to somehow account for every person in the United States, perhaps by assigning each person a number and then pulling numbers out of a hat or using some other random selection procedure. Numbering everyone in the country would be far too difficult.
What would a Stratified Random Sample look like?
The strata in this case are still too big. You might take a few people from Maine, a few people from Minnesota, a few people from North Dakota, and so on for every state, and the job would still be too large. Plus, it really wouldn't be cost effective to travel to all these different places.
Can you perform a Cluster Sample of states?
If you identified states as clusters, you would randomly select some of the clusters and then sample everyone within each selected cluster. You'd be sampling entire states. For example, everyone in North Carolina would be in the sample if that state were selected as a cluster. That's not feasible.
So none of those really make any sense. The way out of the box here is a multi-stage design.
With multi-stage sampling, you continue zooming in from larger areas to smaller and smaller areas until you can find a small enough sample of the people you need.
Let's say you want to sample the United States as a whole. Geographically, states make the most sense as clusters. So let's say you randomly select five states: California, Tennessee, Minnesota, Massachusetts, and Oklahoma.
So what can you do now with those clusters? It's not realistic or feasible to sample everyone within those states, so you can only sample some people, and you'll choose them at random. Not every state needs to be represented, as would be the case with a stratified random sample. So you choose to take Minnesota, and from there you can randomly select counties within Minnesota.
So maybe you take Carver County, Marshall County, and a few other counties. If those are small enough units for you to reach everyone within each county, then you can stop. However, if you need a still smaller sample, you can choose just one county, Carver County, and sample towns within that county.
Within Carver County, you'll find towns such as Hollywood, Watertown, Waconia, Hamburg, Cologne, East Union, Chaska, and Chanhassen. Perhaps you randomly select three of those towns: Chanhassen, Waconia, and Chaska. If those are small enough units, then you can stop. However, if the sample is still too large, you can continue to narrow it down.
Within Chaska, you can sample some neighborhoods. And usually, by the time you get to neighborhoods within a town, it's easy enough to walk around the neighborhood and reach almost everybody within it.
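The zooming-in described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `multistage_sample`, the stage sizes, and the county and town labels in `frame` are all made up for the example (only the five state names come from the walkthrough), and a real survey would work from an actual list of counties and towns.

```python
import random

def multistage_sample(units, sizes, rng):
    """Select sizes[0] units at this stage, then repeat inside each selected unit."""
    names = sorted(units)  # dict keys at upper stages, list items at the last stage
    chosen = rng.sample(names, min(sizes[0], len(names)))
    if isinstance(units, dict) and len(sizes) > 1:
        # Zoom in: sample again, but only within the units just selected.
        return {name: multistage_sample(units[name], sizes[1:], rng)
                for name in chosen}
    return chosen

# Synthetic sampling frame (an assumption for illustration):
# 5 states, each with 4 placeholder counties, each with 6 placeholder towns.
states = ["California", "Tennessee", "Minnesota", "Massachusetts", "Oklahoma"]
frame = {
    s: {f"{s} county {c}": [f"{s} county {c} town {t}" for t in range(1, 7)]
        for c in range(1, 5)}
    for s in states
}

rng = random.Random(42)  # fixed seed so the sketch is repeatable
# Stage 1: 2 states. Stage 2: 2 counties per state. Stage 3: 3 towns per county.
picked = multistage_sample(frame, [2, 2, 3], rng)

for state, counties in picked.items():
    for county, towns in counties.items():
        print(state, "|", county, "|", towns)
```

Adding another stage (for example, neighborhoods within towns) would only require one more level of nesting in `frame` and one more entry in the list of stage sizes.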
The steps for multi-stage sampling are as follows:
1. Randomly select clusters from the population
2. Take an SRS from each selected cluster
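To see why taking an SRS inside each selected cluster matters, the following sketch compares an ordinary cluster sample with a two-stage sample. The function names `cluster_sample` and `two_stage_sample`, the ten clusters of 1,000 people, and the choice of 3 clusters and 20 people per cluster are all assumptions chosen for illustration, not figures from the tutorial.

```python
import random

def cluster_sample(clusters, n_clusters, rng):
    """Cluster sample: pick clusters at random and keep EVERYONE in them."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {name: list(clusters[name]) for name in chosen}

def two_stage_sample(clusters, n_clusters, n_per_cluster, rng):
    """Two-stage sample: pick clusters at random, then take an SRS inside each."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    return {
        name: rng.sample(clusters[name], min(n_per_cluster, len(clusters[name])))
        for name in chosen
    }

# Synthetic population (an assumption): 10 clusters of 1,000 people each.
clusters = {
    f"cluster_{i}": [f"person_{i}_{j}" for j in range(1000)]
    for i in range(10)
}

rng = random.Random(7)
full = cluster_sample(clusters, 3, rng)
staged = two_stage_sample(clusters, 3, 20, rng)

full_size = sum(len(people) for people in full.values())
staged_size = sum(len(people) for people in staged.values())
print(full_size, staged_size)  # 3 * 1000 = 3000 versus 3 * 20 = 60
```

Both designs start by selecting the same number of clusters, but the two-stage version ends up with a far smaller group of people to contact.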
So to recap, multi-stage sampling is used when the population is so big and the groups or strata or clusters so large that it makes more sense to zoom in and take small groups.
You begin with certain clusters, but then you sample within those clusters instead of taking the full cluster. So it combines cluster sampling, stratified designs, and simple random designs. This tutorial contrasted those designs and showed that stratified random sampling was still not feasible when attempting to sample the United States.
Source: This work is adapted from Sophia author Jonathan Osters. MN map: https://en.wikipedia.org/wiki/List_of_counties_in_... Carver county: https://en.wikipedia.org/wiki/List_of_counties_in_...
Multi-stage Sampling: A sampling design that combines elements of cluster sampling, stratified random sampling, and simple random sampling. It "zooms in" on smaller areas to sample so that sampling becomes more feasible.