One-Tailed and Two-Tailed Tests

Source: Image created by Joseph Gearin, graphs created by Author

In this tutorial, you're going to learn about the difference between a one-tailed and a two-tailed hypothesis test. So let's take a look. Suppose the maker of our favorite pop, Liter O'Cola, has come out with new Diet Liter O'Cola, and they think its taste is indistinguishable from the original. So they recruit 120 individuals to do a taste test. If the claim is true, we would expect about 50%, or 60 people, to pick out the diet cola correctly, simply because everyone would be guessing.

But what if some people can taste the difference? What would we expect the proportion of people correctly selecting the diet cola to be? Well, I would say it's some number over 50%. More than half of the people will be able to correctly identify which cup is the diet cola.

So these are our null and alternative hypotheses. Our null says that p, the true proportion of people who can correctly identify the diet cola, is 1/2, half the people. Our alternative hypothesis says that more than half of people will be able to select the diet cola correctly, so p is greater than 1/2. This is called a one-tailed test. We're only interested in testing whether the true proportion of people who can identify the diet cola is over half. We don't care if it's under half. If it's under half, that actually works in Liter O'Cola's favor.
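To see what this right-tailed test looks like with numbers, here is a minimal Python sketch. The count of 70 correct guesses is made up for illustration (the tutorial gives no results), and the function name is hypothetical. It computes the exact binomial probability of seeing at least that many correct guesses if everyone were really just guessing.

```python
from math import comb


def right_tailed_binomial_p_value(successes: int, n: int, p0: float = 0.5) -> float:
    """Return P(X >= successes) for X ~ Binomial(n, p0): the right-tail p-value."""
    return sum(
        comb(n, k) * p0**k * (1 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


n_tasters = 120        # from the tutorial
observed_correct = 70  # hypothetical result, not from the tutorial

# Null: p = 1/2. Alternative: p > 1/2, so only the upper tail counts.
p_value = right_tailed_binomial_p_value(observed_correct, n_tasters)
print(f"right-tailed p-value: {p_value:.4f}")
```

A small p-value here would count as evidence that more than half of tasters can tell the colas apart. A count below 60 can never produce a small p-value in this test, which matches the point that we don't care about the low side.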

So now suppose we are presented with a different scenario. Suppose we suspect that Liter O'Cola is under-filling their bottles. Unsurprisingly, the bottles are supposed to contain 1 liter of cola. State the null and alternative hypotheses for this. Pause the video, and then come back. Also jot down whether you think this is a one-tailed test or a two-tailed test.

This is another example of a one-tailed test. The null hypothesis says that the average amount of cola in the bottle is 1 liter over all the bottles that Liter O'Cola makes. The alternative is that maybe we think that it's less than 1. They're under-filling the bottles. The average amount is less than 1 liter. Again, this is a one-tailed test. If the average amount, mu, was greater than 1 liter, we wouldn't really have a claim against Liter O'Cola because we're actually getting more pop than they say that they're giving us. We're only going to give them trouble if they're under-filling their bottles.
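A left-tailed test for the fill amount could be sketched like this. Every number except the claimed 1 liter is an assumption made for illustration: a sample of 36 bottles averaging 0.98 liters, and a known standard deviation of 0.05 liters so that a simple z-test applies (with an estimated standard deviation you would typically use a t-test instead). The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def left_tailed_z_p_value(sample_mean: float, claimed_mean: float,
                          sigma: float, n: int) -> float:
    """Return P(Z <= z): the area in the left tail below the observed z statistic."""
    z = (sample_mean - claimed_mean) / (sigma / sqrt(n))
    return NormalDist().cdf(z)


claimed_liters = 1.0   # from the tutorial
sample_mean = 0.98     # hypothetical sample average
sigma = 0.05           # hypothetical, assumed known
n_bottles = 36         # hypothetical sample size

# Null: mu = 1. Alternative: mu < 1, so only the lower tail counts.
p_value = left_tailed_z_p_value(sample_mean, claimed_liters, sigma, n_bottles)
print(f"left-tailed p-value: {p_value:.4f}")
```

If the sample average came out above 1 liter, this p-value would be larger than 0.5, so over-filled bottles never count as evidence against Liter O'Cola.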

Let's take a look at a third example. Liter O'Cola also claims 35 grams of sugar in its bottles of cola. Anything over that, and the pop will taste too sweet. Anything under that, and the pop won't taste quite sweet enough. Customers won't get the refreshing Liter O'Cola taste that they have come to expect. So we think that Liter O'Cola might have altered their formula recently because it tastes different. So what do you think the null and alternative hypotheses will be here with respect to sugar?

Here, the null hypothesis is that the mean grams of sugar will be the same as it was before, 35. What about the alternative hypothesis? Well, if they've changed their formula, we don't know if they added more sugar or put in less sugar. But they're only going to be in trouble if they put in a different amount of sugar than before. So the alternative is that the mean number of grams of sugar in the bottle is different from 35. And this is a two-tailed test. They're going to be in trouble if they put in significantly more than 35 or significantly less.
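For the sugar example, a two-tailed version might look like the sketch below. Again, the sample figures (40 bottles, a known standard deviation of 2 grams, sample averages of 35.8 and 34.2 grams) are invented for illustration, and the function name is hypothetical. The only change from a one-tailed z-test is that the tail area beyond the observed z is counted on both sides.

```python
from math import sqrt
from statistics import NormalDist


def two_tailed_z_p_value(sample_mean: float, claimed_mean: float,
                         sigma: float, n: int) -> float:
    """Return P(|Z| >= |z|): the combined area in both tails beyond the observed z."""
    z = (sample_mean - claimed_mean) / (sigma / sqrt(n))
    return 2 * NormalDist().cdf(-abs(z))


claimed_grams = 35.0  # from the tutorial
sigma = 2.0           # hypothetical, assumed known
n_bottles = 40        # hypothetical sample size

# Null: mu = 35. Alternative: mu != 35, so both tails count.
p_high = two_tailed_z_p_value(35.8, claimed_grams, sigma, n_bottles)  # too sweet
p_low = two_tailed_z_p_value(34.2, claimed_grams, sigma, n_bottles)   # not sweet enough
print(f"p-value when the sample is high: {p_high:.4f}")
print(f"p-value when the sample is low:  {p_low:.4f}")
```

The two p-values match because a sample that is 0.8 grams too high is exactly as extreme as one that is 0.8 grams too low.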

So one-tailed tests, if we can swing it, are preferred to two-tailed tests because they're more powerful. Statistical power is the likelihood of actually detecting a difference if one is present, and at the same significance level a one-tailed test has more of it, as long as the difference is in the direction we specified. Let's take one last look visually at what a one-tailed test and a two-tailed test look like. This is what a one-tailed test with a p-value of 5% would look like. This would be under the alternative hypothesis that you have something less than a particular number, like the mean being less than 1 liter in the under-filling example. So we end up with one tail area here, on the left, of about 5%.

Whereas this is what a p-value of 5% would look like on a two-tailed test, with the 5% split between the two tails, about 2.5% on each side. We are interested in the probability that we would get a result at least as extreme, on either side of the claimed value, as the one we ended up with from our sample. So it could be either extremely low or extremely high, something that is extremely different from what we would have expected. Whereas, in the one-tailed picture, we're only going to get them in trouble if it's much lower than what we would have expected.
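The two pictures, and the claim that one-tailed tests are more powerful, can be checked numerically. The sketch below assumes a standard normal test statistic and a 5% significance level. The shift of -2 (how far the true mean sits below the claimed one, measured in standard errors) is a made-up value, and the function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()
ALPHA = 0.05

# Left-tailed test: all 5% sits in the lower tail.
left_cutoff = STD_NORMAL.inv_cdf(ALPHA)                 # about -1.645
# Two-tailed test: 2.5% in each tail, so the cutoffs sit farther out.
two_sided_cutoff = STD_NORMAL.inv_cdf(1 - ALPHA / 2)    # about 1.96


def power_left_tailed(shift: float) -> float:
    """Chance of rejecting when the z statistic is centered at `shift` instead of 0."""
    return STD_NORMAL.cdf(left_cutoff - shift)


def power_two_tailed(shift: float) -> float:
    """Chance of landing beyond either cutoff when z is centered at `shift`."""
    lower = STD_NORMAL.cdf(-two_sided_cutoff - shift)
    upper = 1 - STD_NORMAL.cdf(two_sided_cutoff - shift)
    return lower + upper


hypothetical_shift = -2.0  # true mean is 2 standard errors below the claim
print(f"cutoffs: {left_cutoff:.3f} (left-tailed), +/-{two_sided_cutoff:.3f} (two-tailed)")
print(f"power, left-tailed: {power_left_tailed(hypothetical_shift):.3f}")
print(f"power, two-tailed:  {power_two_tailed(hypothetical_shift):.3f}")
```

The left-tailed test rejects more often here because its cutoff is closer to the center. If the true shift were in the opposite direction, the left-tailed test would almost never reject, which is the price of that extra power.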

And so to recap, one-tailed tests can only test whether there's evidence of a statistic being significantly higher, or significantly lower, than a particular parameter, like mu or p, in one chosen direction. Whereas two-tailed tests will tell whether or not the statistic, x bar or p hat, is significantly different on either the high or the low side from the claimed parameter. So we talked about one-tailed tests, which have two versions: a left-tailed test, where the alternative hypothesis says the true value is less than the claimed parameter, and a right-tailed test, where it says the true value is larger than the claimed parameter. And there can be a two-tailed, or two-sided, test, where we simply claim that the true value is different from, or not equal to, the claimed parameter. Good luck, and we'll see you next time.