Source: Image created by Joseph Gearin
In this tutorial, you're going to learn about the basics of hypothesis testing. In a hypothesis test, there are two hypotheses that we pit against each other. So for instance, suppose that the Liter O'Cola company has a new Diet Liter O'Cola, and they say it's indistinguishable from Classic Liter O'Cola. So they recruit 120 individuals to do a taste test.
And if their claim is true, some people will still be able to correctly identify the diet soda just by guessing. What percent of people will do that? Well, you'd think it would probably be around 50%, or 60 people. 50% would guess correctly and 50% would guess incorrectly, just based on guessing, if the diet cola really were indistinguishable from the Classic.
Now, suppose that you didn't get an exact 50/50 split. Suppose 61 people correctly identified the diet Cola. Would that be evidence against the company's claim? Well, it's more than half. But it's not that much more than half. So we would say no. 61 isn't that different from 60. So that's not really evidence that more than half of people can correctly identify the diet soda.
Now, contrast that with a more extreme example. Suppose that 102 of the 120 people were able to correctly identify the diet cola. Is that evidence against the company's claim? In this case, 102 is far more than half. So we would say that that would be evidence that at least some of the people could taste the difference. Even if some of those 102 were guessing, it's evidence that at least some of those 102 can taste the difference.
So now the question posed to us with the 102 is this: if the people were, in fact, guessing randomly, what would be the probability that we would get 102 or more correct answers just by chance? Isn't it possible that 102 out of 120 could correctly pick the diet cola just by chance? I mean, anything is possible. But if that probability is low, then the evidence doesn't really support the hypothesis of guessing. Instead, it suggests that some people can, in fact, taste the difference.
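To make that question concrete, here is a minimal sketch of the calculation. It assumes each of the 120 tasters guesses independently with a 50% chance of being right, so the number of correct answers follows a binomial distribution. The function name `upper_tail` is made up for this illustration and is not part of the tutorial.

```python
from math import comb

def upper_tail(n, k, p=0.5):
    """Probability of k or more successes in n independent guesses,
    each correct with probability p (binomial model, an assumption)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

n = 120  # number of taste testers in the example

# Chance of 102 or more correct answers if everyone is only guessing
p_102 = upper_tail(n, 102)

# Chance of 61 or more correct answers if everyone is only guessing
p_61 = upper_tail(n, 61)

print(f"P(at least 102 correct | guessing) = {p_102:.3e}")
print(f"P(at least  61 correct | guessing) = {p_61:.3f}")
```

Under these assumptions, the first probability is extremely small, while the second is a little under one half. That matches the intuition in the two examples: 61 correct answers is easy to get by guessing, and 102 is not.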
So let's take a look at the competing hypotheses that we have. First, we're going to state the two beliefs in the problem. Liter O'Cola's claim is that 50% of people will correctly select the diet cola by chance, because the two colas are indistinguishable. This is called the null hypothesis. The null hypothesis is a default claim that we're going to temporarily accept as true.
It's the default, "nothing unusual, no change from what we expect" assumption. In symbols, we're going to say p, the true proportion of people who can correctly identify the diet soda, is 1/2. Whereas our suspicion is that maybe over 50% of people will select the diet cola. Some of those by chance, and some of those because they can actually taste the difference.
This is called the alternative hypothesis. It's the "something is going on here" type of assumption. And we're going to say that the true proportion of people who can correctly identify the diet soda is more than 0.50. So let's see what's going on here. The notation is H sub 0 for the null hypothesis, and H sub a for the alternative hypothesis. Here they are in symbols and in words.
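Written out in symbols, with p standing for the true proportion of people who correctly identify the diet soda, the two hypotheses for this example are:

```latex
H_0 : p = 0.5 \qquad \text{(null hypothesis: people are only guessing)}
\\
H_a : p > 0.5 \qquad \text{(alternative hypothesis: some people can taste the difference)}
```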
Those are the notations. The null hypothesis is always an equality, and the alternative hypothesis can be expressed in several ways, depending on the problem. It uses either a less than symbol, a greater than symbol, or a not equal to symbol. So in this example, if significantly more than half of the cola drinkers in our sample of 120 can correctly select the diet soda, we will reject the null hypothesis, which is Liter O'Cola's claim, in favor of the alternative, saying there's convincing evidence that more than half of people will correctly identify the diet soda.
Now, significantly more than half is a loose term. How many is that? We decided that 102 was probably significant, while 61 probably wasn't that significant. We'll leave that definition for another time. On the other hand, if the number of participants who select the diet soda is not significantly more than half, then we will fail to reject the null hypothesis. This would be the example of the 61.
61 is not significantly more than half of the participants, and so we'll fail to reject the null hypothesis. Notice we don't say the word accept. Why not? Why do we fail to reject the null hypothesis and not accept it? Well, there's a very good reason for that. The reasoning is that when we do an experiment like this, we start out assuming the null hypothesis is true and we're trying to provide evidence against it.
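As a rough sketch of how the three forms of the alternative hypothesis change the question being asked, the hypothetical helper below computes the chance of a result at least as extreme as the one observed, assuming the null hypothesis is true and a binomial model applies. The function names and the "double the smaller tail" rule for the not-equal case are assumptions made for this illustration; other conventions for that case exist.

```python
from math import comb

def binom_pmf(n, i, p):
    """Probability of exactly i successes in n independent trials."""
    return comb(n, i) * p**i * (1 - p)**(n - i)

def tail_probability(n, k, p0, alternative):
    """Chance of a result at least as extreme as k successes out of n,
    assuming the null hypothesis p = p0 is true."""
    lower = sum(binom_pmf(n, i, p0) for i in range(0, k + 1))  # P(X <= k)
    upper = sum(binom_pmf(n, i, p0) for i in range(k, n + 1))  # P(X >= k)
    if alternative == "greater":    # H_a: p > p0
        return upper
    if alternative == "less":       # H_a: p < p0
        return lower
    if alternative == "not equal":  # H_a: p != p0 (assumed convention)
        return min(1.0, 2 * min(lower, upper))
    raise ValueError("alternative must be 'greater', 'less', or 'not equal'")

# The cola example uses the "greater" form: H_0: p = 0.5 versus H_a: p > 0.5
print(tail_probability(120, 102, 0.5, "greater"))
print(tail_probability(120, 61, 0.5, "greater"))
```

Only the "greater" form is used in the cola example, because the suspicion is specifically that more than half of people can pick out the diet soda.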
If we haven't brought legitimate evidence against it or strong enough evidence to reject it, then all we can do is not reject it. We haven't proven that the null hypothesis is true, we just haven't presented strong enough evidence to prove it false.
So to recap, hypothesis testing involves a few steps. We start by stating our assumption about the population, which is the null hypothesis, denoted H sub 0. Then we see if the evidence gathered contradicts that assumption, which would lead us to reject the null hypothesis in favor of the alternative hypothesis, H sub a. To decide, we can calculate a conditional probability, asking ourselves the question: what's the probability that we would obtain statistics at least as extreme as these from a sample if the null hypothesis were, in fact, true?
So we talked about the hypotheses in the hypothesis test, and those were the null and alternative hypotheses. And those are what we're going to pit against each other and calculate probabilities about to try and make a decision about the population. Good luck, and we'll see you next time.
Hypothesis: A claim about a population parameter.
Hypothesis Testing: The standard procedure in statistics for testing claims about population parameters.
Null Hypothesis: A claim about a particular value of a population parameter that serves as the starting assumption for a hypothesis test.
Alternative Hypothesis: A claim that a population parameter differs from the value claimed in the null hypothesis.