Source: Image created by Joseph Gearin
This video is going to teach you about statistical significance, which is a huge term when it comes to hypothesis testing. When we run a significance test, we need to decide how large a departure from what we expected to happen counts as a significant departure.
So let's take a look. Suppose we have this company, Liter O'Cola, and we've developed a new diet cola that we believe is indistinguishable from the Classic. So we have 120 individuals do a taste test. And if the claim is true, what percent of people should select the correct cola just by chance, just by guessing?
Well, if Liter O'Cola's claim is correct, about 50% of people would just guess correctly. And 50% of people would guess incorrectly if presented with the two options. And so now the question is, at what point are we going to stop believing Liter O'Cola's claim?
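To make the "50% by guessing" idea concrete, here is a minimal simulation sketch (not from the video; the variable names are my own) of 120 tasters who each pick a cola at random:

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Simulate 120 tasters who each guess between the two colas at random.
n_tasters = 120
correct = sum(random.random() < 0.5 for _ in range(n_tasters))
print(correct)  # typically lands near the expected 60
```

Run it a few times with different seeds and the count of correct guesses bounces around 60, which is exactly the baseline the company's claim predicts.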
So suppose 61 people were able to pick the diet cola. Is this evidence against the claim? Well, 61 is not that different from 60, the number we'd expect from pure guessing (half of 120), so we're going to say no. This is not significantly different from what we would expect.
Conversely, take a look. Suppose 102 people were able to pick the diet cola correctly. Would that be evidence against the company's claim?
In this case, we would probably say so. 102 is way over 60, and 60 is what we would expect had they been randomly guessing. It's pretty unusual that 102 people out of 120 would get it right by randomly guessing. So this is evidence that some people can taste the difference.
And this is the whole idea of statistical significance. 61 out of 120 is not a significant result, meaning that's not evidence against the claim. It's not evidence against the null hypothesis. Conversely, take a look at the 102. That would be evidence against the null hypothesis, because it's so much higher than what we would have expected.
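We can put numbers on "not significant" versus "significant" with an exact binomial tail probability. This is a sketch I'm adding for illustration (the helper name `p_at_least` is my own, not from the video): it computes the chance of seeing at least k correct answers out of 120 if everyone is just guessing.

```python
from math import comb

def p_at_least(k, n=120, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance of k or more
    correct answers if every taster is purely guessing."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

# 61 or more correct happens by chance almost half the time: not surprising.
print(f"P(61+ correct by guessing):  {p_at_least(61):.3f}")

# 102 or more correct is astronomically unlikely under pure guessing.
print(f"P(102+ correct by guessing): {p_at_least(102):.2e}")
```

A result like 61 has a tail probability near 0.46, so chance explains it easily, while 102 has a tail probability far below any reasonable cutoff, which is what makes it statistically significant.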
So statistical significance means that we doubt that the results that we obtained are due to chance. Instead, we believe that it's part of some larger trend. Like in the cola example, we don't believe the null hypothesis that people can't distinguish. We believe that the trend is that people in fact can distinguish.
So if 61 people correctly identify it, we're not convinced that over half can identify the diet cola. The difference might be due only to chance. In fact, it probably is. On the other hand, a difference of 42 from what we expect is probably not due to chance. That would be called statistically significant.
Now, it's important to make the distinction between practical significance and statistical significance. They're not necessarily the same thing. With a large enough sample, even something as close to 50% as a 50.1% correct-guessing rate could be considered statistically significant, even though 50.1% is not that different from 50%.
So the statistical significance argument is based largely on sample size and how far off from this 50% claim we are. If the sample size is big, we don't need to be very far off. If the sample size is small, we need to be further off in order to claim significance. But with a big sample, we might get something that is statistically significant without being practically significant. We wouldn't shout this 50.1% mark from the rooftops.
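The sample-size effect can be sketched with a standard normal approximation to the binomial (my own illustration, not from the video; the function name and the sample sizes 1,000 and 4,000,000 are assumptions chosen to make the contrast visible):

```python
from math import sqrt, erfc

def one_sided_p(p_hat, n, p0=0.5):
    """Approximate one-sided p-value for observing success rate p_hat
    in n trials when the true rate is p0 (normal approximation)."""
    z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
    return 0.5 * erfc(z / sqrt(2))

# The same 50.1% hit rate, at two very different sample sizes:
print(one_sided_p(0.501, 1_000))       # ~0.47: no evidence at all
print(one_sided_p(0.501, 4_000_000))   # ~3e-5: statistically significant
```

The observed rate is identical in both calls; only the sample size changes. At four million trials the tiny 0.1% edge becomes statistically significant, yet it is still practically meaningless.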
So to recap, statistical significance is the extent to which a sample measurement is evidence of a real trend, like being able to taste the difference between regular cola and diet cola, as opposed to a difference small enough to write off as chance. It's not the same as practical significance, although sometimes the two coincide. And sometimes very small differences can be statistically significant without having a whole lot of real-life meaning.
So we talked about statistical significance and how we're going to measure it versus practical significance and how those two are not necessarily the same. Good luck. And we'll see you next time.