This lesson will explain how to ensure everyone in the population has an equal chance of participating in a sample, specifically focusing on simple random samples and two tools for selecting them: the random number generator and the random number table.
A Simple Random Sample (SRS) is a sampling method that ensures not only that everyone in the population has an equal chance of being in the sample, but also that every possible sample of a given size is equally likely to be the one selected.
IN CONTEXT
If you’ve ever taken part in a raffle, you’ve experienced a simple random sample. What generally happens at these events is that someone takes the raffle tickets and puts them into a bucket.
The tickets are mixed up in the bucket, and one ticket is pulled out. The owner of that ticket usually wins some kind of fantastic prize. Now, being in a simple random sample is pretty much the same thing. The only difference is that instead of winning the prize, you get to be part of the sample and that's your prize.
Take the billiard balls from a pool table, put them all into a hat, shake it up, and pour out five.
Shake #1:
Shake #2:
You may have noticed that the “1” ball was in both of these first two examples. However, that doesn't mean it's any more likely to be selected than any of the other balls. It has the same likelihood. And any sample of five, whether the first, the second, or the one in shake #3 below, was an equally likely sample of five.
Shake #3:
Now, notice, all five of these were striped billiard balls, not one solid ball in the bunch. Is that unusual? Sure, it's kind of unusual to happen.
Unusual-looking samples are just as likely to happen as any other. Just because they seem strange and that kind of result doesn't come up very often doesn't mean they can't happen. In fact, this particular set of five has the same likelihood as any other specific selection of five.
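To put a number on "kind of unusual", here is a small counting sketch. It assumes the hat holds a standard set of 15 numbered pool balls (1 through 7 solid, the 8 ball, and 9 through 15 striped), which the lesson does not state; the variable names are only illustrative.

```python
from math import comb

# Assumption: a standard set of 15 numbered pool balls is in the hat.
TOTAL_BALLS = 15
STRIPED_BALLS = 7   # balls 9 through 15
SAMPLE_SIZE = 5

# Every specific set of five balls is one of this many equally likely samples.
total_samples = comb(TOTAL_BALLS, SAMPLE_SIZE)

# How many of those samples consist only of striped balls.
all_striped_samples = comb(STRIPED_BALLS, SAMPLE_SIZE)

print(f"Possible samples of five: {total_samples}")
print(f"Chance of any one specific sample: 1 in {total_samples}")
print(f"Samples that are all striped: {all_striped_samples}")
print(f"Chance of getting some all-striped sample: {all_striped_samples / total_samples:.4f}")
```

Under that assumption, each specific group of five has the same 1-in-3,003 chance. An all-striped result feels unusual only because just 21 of the 3,003 possible samples look that way.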
So knowing how to take a Simple Random Sample, abbreviated SRS throughout, is important, because most inferences we make about a population assume that the data were collected this way. Names in a hat work fine. So do raffle tickets in a bucket or billiard balls in a hat.
However, what about situations where we don’t have the manpower to pull numbers or names from a hat? There are two other ways to take a simple random sample. One uses a random number generator and the other uses a random number table. First, we are going to discuss the random number generator.
Suppose that we want to take a sample of 100 individuals from a population of 2,000 people. Here are some of those individuals lined up, and you can imagine that individuals 10 through 1,995 are somewhere in the middle. We assign each individual a unique number, so no one has the same number as anybody else.
Next, use technology to pick the numbers. You can search "random number generator" on the internet and websites will come up, or you can use a calculator.
This particular model of calculator is a Texas Instruments calculator:
“RandInt” stands for “random integer” (an integer is a whole number). With 0 and 1 as its inputs, it picks either 0 or 1.
When you put in a third number, it says how many random integers you want. In this case, you entered five. Now, you don't want numbers between 0 and 1 in this case, and you don't want five of them. You want numbers between 1 and 2,000, and you want 100 of them.
Now, why ask for 150 numbers when you only want 100?
You can’t select one person twice, so repeats must be ignored. If you had asked for just 100 numbers instead of 150, it is very likely that there would have been at least one repeat in the bunch.
And finally, you select the individuals that correspond to the first 100 different numbers that were picked.
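To see how likely a repeat really is, the following sketch computes the chance that 100 independent picks from 2,000 numbers contain at least one repeat. It assumes every pick is equally likely to be any of the 2,000 numbers; the function name is hypothetical.

```python
def prob_at_least_one_repeat(draws: int, population: int) -> float:
    """Chance that `draws` independent picks from 1..population include a repeat."""
    prob_all_distinct = 1.0
    for already_picked in range(draws):
        # The next pick must miss every number that has already come up.
        prob_all_distinct *= (population - already_picked) / population
    return 1.0 - prob_all_distinct


p_repeat = prob_at_least_one_repeat(draws=100, population=2000)
print(f"Chance of at least one repeat in 100 picks: {p_repeat:.2f}")
```

Under these assumptions the chance comes out a little above 0.9, which is why asking for extra numbers is a sensible precaution.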
So, person number 8, and the person that corresponds to 1,119, and the person who corresponds to 1,996 are a few that are chosen.
Now, notice that the person corresponding to 8 was chosen again. You can see that it’s listed twice in the list. You're not going to select that person twice because they've already been selected once, so they are crossed out. This is the reason 150 numbers were created, so you have room to cross repeats out. Make sense?
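The same procedure can be sketched in Python rather than on a calculator. This is only an illustration under stated assumptions: individuals are numbered 1 through 2,000, the function name `srs_with_extra_draws` is made up for this example, and the fixed seed is there only so the run is repeatable.

```python
import random


def srs_with_extra_draws(population_size, sample_size, draws, rng):
    """Draw random integers, cross out repeats, keep the first `sample_size`."""
    if sample_size > population_size:
        raise ValueError("sample_size cannot exceed population_size")

    # Generate more numbers than needed, as in the lesson (150 for a sample of 100).
    picks = [rng.randint(1, population_size) for _ in range(draws)]

    chosen = []
    seen = set()
    for number in picks:
        if number in seen:
            continue  # repeat: this person is already in the sample
        seen.add(number)
        chosen.append(number)
        if len(chosen) == sample_size:
            break

    # Safety net: if repeats used up the extra draws, keep drawing one at a time.
    while len(chosen) < sample_size:
        number = rng.randint(1, population_size)
        if number not in seen:
            seen.add(number)
            chosen.append(number)
    return chosen


rng = random.Random(2024)  # fixed seed only to make this run repeatable
sample = srs_with_extra_draws(population_size=2000, sample_size=100, draws=150, rng=rng)
print(sample[:10])

# Python's standard library can also select without repeats in one call.
direct_sample = rng.sample(range(1, 2001), k=100)
```

The first approach mirrors the cross-out-the-repeats method described above. The `random.sample` call gives the same kind of result without any repeats to cross out.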
Using a random number table is basically the same idea, but it's a little more cumbersome. It’s generally used when no technology is available, and you will soon notice that it is a longer, more time-consuming process than using a random number generator.
Each individual is assigned a unique number, just as with the random number generator. However, each individual's number must have the same number of digits.
The same numbering as for the random number generator cannot be used, because the number 2,000 has four digits and the number 1 has only one. All of the numbers must have the same number of digits, so instead of 1, it's 0001. Instead of 2, it's 0002. This continues all the way up to 2000.
A table of random digits can be found in a textbook or online. Four digits will be read at a time, because each individual has a four-digit number.
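As a small illustration of the labeling step, the sketch below pads each individual's number with leading zeros so that every label has four digits. The variable names are assumptions made for this example.

```python
POPULATION_SIZE = 2000
LABEL_WIDTH = len(str(POPULATION_SIZE))  # 4, because 2000 has four digits

# Pad each number with leading zeros: 1 becomes "0001", 2 becomes "0002", and so on.
labels = [f"{number:0{LABEL_WIDTH}d}" for number in range(1, POPULATION_SIZE + 1)]

print(labels[:3], labels[-1])
```

Keeping the labels as text rather than as integers preserves the leading zeros, which is what allows them to be matched against four-digit blocks from a table.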
Perhaps the first four digits found were 1-9-2-2. There is someone on the list who is number 1922, so that individual will be selected for the sample. It’s circled in green below since a person corresponds to that number.
The next number found is 3-9-5-0. No one on the list corresponds to the number 3950, so it is ignored. The next number, 3-4-0-5, does not correspond to an individual either, so that is ignored as well.
You'll notice that all numbers circled in red are numbers that are unassigned in our list. Only 2,000 of the 10,000 possible four-digit blocks (0000 through 9999) are assigned, so about four out of every five blocks will be ignored. This makes for a very cumbersome process that will go on for a while until 100 individuals are obtained. Will this work? It will, but it might take a very long time.
One of the numbers circled in green is 0001. This is the very first person in the list, and it just happens that person 0001 will be in the sample. This individual will be selected along with everyone else whose four-digit number was selected.
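The table procedure can be sketched as follows. This is a rough illustration, not a real published table: the digits are generated by Python as a stand-in for a printed table of random digits, the function name `sample_from_digit_table` is hypothetical, and the short digit string in the second call simply strings together the blocks mentioned above.

```python
import random


def sample_from_digit_table(digits, valid_labels, sample_size, width=4):
    """Read `width` digits at a time; keep blocks that match an unused label."""
    chosen = []
    seen = set()
    blocks_read = 0
    for start in range(0, len(digits) - width + 1, width):
        block = digits[start:start + width]
        blocks_read += 1
        # Ignore blocks that match nobody, and blocks already selected.
        if block in valid_labels and block not in seen:
            seen.add(block)
            chosen.append(block)
            if len(chosen) == sample_size:
                break
    return chosen, blocks_read


# Labels 0001 through 2000, one per individual.
valid_labels = {f"{n:04d}" for n in range(1, 2001)}

# Stand-in for a printed table: 5,000 four-digit blocks of random digits.
rng = random.Random(7)  # fixed seed only to make this run repeatable
table_digits = "".join(rng.choice("0123456789") for _ in range(4 * 5000))

chosen, blocks_read = sample_from_digit_table(table_digits, valid_labels, sample_size=100)
print(f"Selected {len(chosen)} individuals after reading {blocks_read} blocks")

# The blocks from the walkthrough: 1922 is kept, 3950 and 3405 are ignored, 0001 is kept.
demo_chosen, demo_blocks = sample_from_digit_table("1922395034050001", valid_labels, sample_size=2)
print(demo_chosen, demo_blocks)
```

The count of blocks read gives a feel for how much of the table is consumed before 100 individuals are found, which is the cost of this method compared with a random number generator.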
A simple random sample, again, is the ideal sampling method if your goal is to obtain a representative sample. Sometimes, with really big populations, it's not feasible to assign everyone a number or put everything into a hat, so other sampling methods may be used. The random number generator is typically run on a calculator or website, and it’s a fast way to generate random integers without needing to give every individual a number with the same number of digits. The random number table is more time-consuming and is generally used when technology is not available.
Good luck!
Source: This work is adapted from Sophia author Jonathan Osters.
Simple Random Sample (SRS): A method of selection that guarantees that every sample of a certain size has an equal chance of being the selected sample
Random Number Generator: A method of collecting a sample that utilizes technology to select random numbers corresponding to individuals in the population
Random Number Table: A method of collecting a sample that uses a table of random digits to select numbers corresponding to individuals in the population. Each individual is assigned a number, and numbers are then read from the table.