This tutorial is going to teach you about nonresponse bias. Now, bias we've already talked about as being a bad thing, and we're going to talk about how nonresponse, or a lack of response, from people you've selected is going to affect the ability to draw conclusions from your sample.
A nice way to think of sampling is with what we call a "pot of soup" analogy. So, we want a representative sample, which means we don't need to drink the entire pot of soup in order to figure out what's in it. We just need the right taste.
So, a representative sample is like getting every ingredient of the soup in a single tasting. But certain things can go wrong with the taste test that affect what we think is in the soup because, in real life, we don't really know what the population looks like. We don't know what's in the soup. All we get is the taste, and if we don't get the right taste, we're going to leave something out and not know exactly what's in the soup.
So, let's go back to the sampling world here for a minute. Nonresponse means that someone selected for the sample either can't be contacted or is unwilling to participate. So, suppose someone gets a call, and they hear, "Hi, you've been selected to take a survey." They say no thanks and hang up. This is problematic.
Nonresponse in and of itself is not the end of the world. It happens. It's inevitable that some people will be uncooperative and won't want to take your survey, answer your questions, or be part of your experiment, and it's inevitable that you just won't be able to contact certain people.
The problem comes in when the opinions of the people left out-- the people that weren't able to be contacted or refused to participate-- differ substantially from the people that were in the sample, and that's problematic. That's called nonresponse bias because you're not getting an accurate cross-section of opinions.
So, let's go back to the analogy of the soup for a second. How does that affect the taste test? Well, we don't get an accurate flavor profile from our taste of the soup because some of the ingredients have been left out. Some of the opinions of the people that we wanted to get are left out.
So here's an example. A workplace wishes to survey 200 of its 1,000 employees about their workload and their stress level, so they put 200 surveys in the workers' mailboxes. Now, what might happen is that the people who have the biggest workloads might get left out of the sample because they don't get around to checking their mailboxes because they're already so busy. Or, even if they do get around to checking their mailbox, maybe they don't fill out the survey, or don't return it, because they're so busy.
What effect might that have? Well, of the surveys the workplace actually gets back, maybe the responses say that the workload level is not that high. The only problem is that the people with the lower workloads were the only ones who turned the survey in, because they had the time to take it, and the people with the higher workloads didn't. The company might conclude the workload level is lower than it really is.
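The reasoning in this example can be sketched with a quick simulation. Everything here is made up for illustration: the workload distribution, the assumption that exactly the 200 busiest employees never respond, and all variable names.

```python
# Hypothetical illustration of the workplace example. All numbers are made up.
import random

random.seed(1)

# Made-up weekly workload hours for 1,000 employees.
workloads = [random.gauss(45, 8) for _ in range(1000)]

# Suppose the 200 busiest employees are too busy to return a survey,
# so the 200 completed forms all come from the least-busy 800.
workloads_sorted = sorted(workloads)
respondents = random.sample(workloads_sorted[:800], 200)

true_mean = sum(workloads) / len(workloads)
sample_mean = sum(respondents) / len(respondents)
print(f"true average workload: {true_mean:.1f} hours/week")
print(f"respondents' average:  {sample_mean:.1f} hours/week")
# The respondents' average understates the true average, because the
# busiest employees are systematically missing from the returned surveys.
```

Under these assumptions, the average reported workload comes out lower than the company-wide truth, which is exactly the distortion the example describes.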
Take a look at these different ways of conducting a survey, or a poll, or a sample. Which of these methods, mail, telephone, or face-to-face, do you think has the highest nonresponse rate? The answer is the mail. People will either throw it away, or forget to fill it out, or maybe they'll fill it out and then forget to mail it back. This is kind of problematic because when the United States takes its census of everyone in the country, it does it by mail. And so, sometimes they have to do follow-ups.
The nonresponse rate is easy to calculate. Subtract the number of surveys you got back from the number you mailed out; that gives you the number of nonresponses. Divide that by the number you mailed out, and you have your nonresponse rate. Say you mailed out 100 and only got 80 back. That's 20 nonresponses out of 100, or a 20% nonresponse rate. In samples with high rates of nonresponse, follow-ups typically are needed.
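The arithmetic above can be written out in a few lines of Python. The function name and parameters are just illustrative labels; the numbers are the ones from the example.

```python
def nonresponse_rate(mailed_out, returned):
    """Fraction of the selected sample that did not respond."""
    nonresponses = mailed_out - returned   # e.g. 100 - 80 = 20 nonresponses
    return nonresponses / mailed_out       # e.g. 20 / 100 = 0.20

rate = nonresponse_rate(100, 80)
print(f"{rate:.0%} nonresponse rate")  # prints "20% nonresponse rate"
```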
So, supposing you started with a mailing, you might need to follow up by calling them at home. And if you can't reach them by calling them at home, you might need to follow up by coming directly to their house. And sometimes, even then, even when they are contacted, someone will refuse to participate. Follow-ups like this might be more necessary in some areas of the country than others because different areas of the country have different rates of nonresponse.
And so, to recap, nonresponse occurs when people who are selected for the sample can't participate, either because you can't find them, or because they're actively refusing. Nonresponse bias arises when those missing people differ substantially from the people who do respond. And the biggest problem is that high rates of nonresponse can give you an inaccurate representation of what's going on with your population. You won't be able to use your sample to draw an inference about your population.
So, we talked about nonresponse, which is just being unable or unwilling to participate in the survey, and nonresponse bias, which is the problem that arises as a result.
Good luck, and we'll see you next time.
This tutorial is going to talk to you about response bias. Now, response bias is when people's answers are influenced by something. So a nice way to think about sampling is with a pot of soup analogy. When you get a representative sample, that's like getting a little taste of everything that's in the soup. But things can go wrong, where you don't get the right taste of the soup.
Response bias can occur if the wording of the question is such that it's unclear to the respondent, or if a respondent is uncomfortable due to the sensitive or personal nature of the questions, or if the respondent feels like the questioner is implying that the question has a, quote unquote, "correct response." That last one is also called social desirability bias.
So for instance, here's an example. On April 20, 1993, the New York Times published an article on a survey conducted by the Roper Organization on behalf of the American Jewish Committee about the soon-to-be-opened Holocaust Museum in Washington, D.C. The newspaper reported that 22% of adults surveyed -- an astounding number -- expressed some doubt as to whether the Holocaust had actually occurred.
The actual question that was presented to people was, does it seem possible, or does it seem impossible to you, that the Nazi extermination of the Jews never happened? Now, on its surface, this seems to be a fairly straightforward question. But there was a big problem with it, and it caused response bias.
The problem is that the question contains a double negative. Does it seem impossible to you that it never happened? Saying it's impossible that it never happened is the same as saying that you're certain that it did happen. It just doesn't sound that way when it's read in the question.
The good thing is that one year later, the question was revised, and it became clearer. The new question stated, does it seem possible to you that the Nazi extermination of the Jews never happened, or do you feel certain that it happened? This revision clearly distinguishes between the two options.
Does it seem possible, or do you feel certain? And now, with the two options clearly defined, less than 2% of individuals were unsure as to whether it was real or not. And this gives a more accurate interpretation of what the American public feels. So unclear questions can lead to an inaccurate representation due to response bias.
The other scenario in which this can occur is when people answer a question a certain way because they're either ashamed, or they think that there's a right answer that someone's fishing for. So in a survey about drug use, many people will say they've never used drugs, even if they have, and even if there's no consequence. Even if the survey is anonymous, they'll still say they've never used when, in fact, they have.
There are certain topics that are particularly sensitive and might make a person want to lie. Criminal history-- they might say they don't have one, even if they do. Sexual behavior, because it's of a highly sensitive and personal nature. Racial prejudice, because there's an implied right answer. People don't want to say that they're racially prejudiced. And income-- people will report it as being higher than it actually is if they're of low income status, or even possibly-- more surprisingly-- people will report it as lower than it really is if they're of a very high income status. A lot of people don't want to be very showy about their wealth, and so they'll try and come up with a more, in their eyes, reasonable number.
Now, how does this affect what we think about the population? How does this affect the soup? Well, it's like taking a sample of the soup and only tasting the things that you want to taste. Maybe you don't like beans, and so you just sort of ignore the fact that they're there. Well, you don't get the overall flavor of what's supposed to happen. Same thing with response bias. It doesn't give you the right overall interpretation of what this is supposed to be like.
And so to recap, response bias occurs one of two ways -- either a respondent doesn't understand the question and so gives an answer they didn't intend, or the respondent wants to give a supposedly correct answer to the question. Both of these can produce an inaccurate representation of what actually is the truth about the population. Response bias is a tough thing to get rid of, especially when it comes from the wording of the questions and is unintentional. The term we used was response bias.
Good luck, and we'll see you next time.