Source: Top Hat, Creative Commons: http://commons.wikimedia.org/wiki/File:Chapeauclaque.png; Pool balls created by the author
This tutorial is going to be talking about Stratified Random Sampling. It's a random sampling procedure. But it's not a regular, old, simple random sample. And we'll talk about the differences as we go through.
So first off, let me posit a scenario to you. A high school has just adopted a new, healthy lunch provider. And they would like to solicit student opinion on the healthy lunch options. The school has 100 freshmen, 110 sophomores, 120 juniors, and 90 seniors.
So the first thing I'm going to ask you to do is explain how the school could select a simple random sample of 42 students. What I'd like you to do is pause the video and write down, off to the side, how the school might implement that.
All right. Hopefully what you came up with after you paused the video was that the school could assign each student a unique number, 1 to 420, then use a random number generator to select 42 numbers, ignoring repeats. The students corresponding to those numbers will be surveyed about the school's new, healthy options. Another way to do it would be to put the 420 student names in a hat and draw out 42.
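If you'd rather let a computer do the drawing, here's a minimal sketch in Python of the numbering approach. The function name `simple_random_sample` and the `seed` parameter are made up for this illustration. It leans on the standard library's `random.sample`, which draws without replacement, so the "ignoring repeats" step is handled for us.

```python
import random


def simple_random_sample(population_size, sample_size, seed=None):
    """Pick sample_size distinct student numbers from 1..population_size."""
    rng = random.Random(seed)  # seed is optional; it only makes the draw repeatable
    # random.sample draws without replacement, so no number can come up twice.
    return rng.sample(range(1, population_size + 1), sample_size)


# 420 students in the school, 42 to be surveyed.
chosen = simple_random_sample(420, 42, seed=1)
print(sorted(chosen))
```

Each number in `chosen` stands for one student, the same way each slip of paper in the hat would.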
So if you had either of those as your solution, that would be fine. Now, how might the school improve the study and guarantee an accurate cross-section of students across the grades? Because freshmen might feel differently about the healthy options than seniors do. So pause the video again and decide how the school might obtain a more accurate cross-section.
Hopefully what you came up with after hitting pause is something like this. Since 42 is 10% of the school's population, survey 10% of each grade. From the freshman, sophomore, junior, and senior classes, randomly select 10, 11, 12, and 9 students, using the same kind of simple random sampling method described before, like putting names in a hat or assigning everyone a unique number and selecting those numbers.
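Here's one way that could look in Python, as a rough sketch rather than the only way to do it. The grade sizes come from the scenario. The names `GRADE_SIZES` and `stratified_sample` are made up for this illustration, and numbering the students 1 through the class size within each grade is an assumption.

```python
import random

# Class sizes from the scenario: 420 students in total.
GRADE_SIZES = {"freshmen": 100, "sophomores": 110, "juniors": 120, "seniors": 90}


def stratified_sample(grade_sizes, sample_size, seed=None):
    """Take a simple random sample inside each grade, in proportion to its size."""
    rng = random.Random(seed)
    total = sum(grade_sizes.values())
    sample = {}
    for grade, size in grade_sizes.items():
        # Each grade's share of the sample. The division is exact here because
        # 10% of every class is a whole number; other class sizes would need
        # a rounding rule.
        k = size * sample_size // total
        # Assumption: students in a grade are numbered 1..size.
        sample[grade] = rng.sample(range(1, size + 1), k)
    return sample


survey = stratified_sample(GRADE_SIZES, 42, seed=1)
counts = {grade: len(ids) for grade, ids in survey.items()}
print(counts)  # counts per grade: 10, 11, 12, 9
```

Notice that the randomness is all inside each grade. How many students come from each grade is fixed ahead of time.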
What you've just described to me, what we see here on the screen, is a sampling method called a stratified random sample. It's a sampling method where the population is subdivided into groups called strata. The strata are homogeneous with respect to some characteristic that we think might affect the results of the sample.
Basically, we don't want the sample to have too many, or too few, individuals with that characteristic. So a simple random sample is carried out within each stratum. You can have as many strata as you please, but each one has to be roughly homogeneous.
So for instance, let's take a look at these billiard balls from a pool table. What I've done is I've subdivided them into low, middle, and high. This is pretty common if you have three people that want to play a pool game. A lot of the time people will subdivide them into lows, middles, and highs.
What you can do to take a stratified random sample of these 15 is to put all the low-valued balls in a hat, put all the middle-valued balls in a hat, put all the high-valued balls in a hat, and randomly select, say, two from each. And then you have a stratified random sample of six. You're guaranteed to have exactly two low, exactly two middle, and exactly two high.
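The three hats can be sketched in Python too. One assumption to flag: I'm taking low to mean balls 1 to 5, middle 6 to 10, and high 11 to 15, which is the usual three-player split. The names `HATS` and `draw_from_each` are made up for this illustration.

```python
import random

# Assumption: low = 1-5, middle = 6-10, high = 11-15.
HATS = {
    "low": list(range(1, 6)),
    "middle": list(range(6, 11)),
    "high": list(range(11, 16)),
}


def draw_from_each(hats, per_hat, seed=None):
    """Draw per_hat balls at random, without replacement, from every hat."""
    rng = random.Random(seed)
    picked = []
    for balls in hats.values():
        picked.extend(rng.sample(balls, per_hat))
    return picked


six_balls = draw_from_each(HATS, 2, seed=1)
print(six_balls)
```

Whichever balls come out, the list always holds two from each hat, which is the guarantee a plain simple random sample of six can't give you.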
So to recap, in a stratified random sample, the population is broken down into homogeneous groups called "strata." We do that because we think that, if we don't break it down into strata, some characteristic might end up over- or under-represented, and the sample might misrepresent the population.
So we're going to force them into groups and then take a simple random sample within each of the strata. The terms that we've used are "stratified random sample," and the groups are called "strata"-- singular, "stratum." Good luck. And I'll see you next time.
A random sampling method where individuals are separated into homogeneous groups, and then simple random samples are taken within each group.
The homogeneous groups in a stratified random sample. All individuals in each stratum have something in common, and we would like to see how that affects the outcome of the sample.