In this tutorial, you're going to learn about test statistics. This is the statistic that we calculate using the statistics that we already have when we're running a hypothesis test. Specifically, you will focus on:
So let's take a look. The test statistic measures how far a sample statistic is from the assumed parameter if the null hypothesis is true.
So when we have a hypothesized value for the parameter from the null hypothesis, we might get a statistic that's different than that number. And so that's how far it is from that parameter. It's measured in terms of how many standard deviations from the mean of the sampling distribution that statistic happens to be. So all that's saying is it's a z-score.
Test statistic is equal to the statistic minus the parameter. That's how far away in absolute distance. And then how many standard deviations is it? It's that difference divided by the standard deviation of the statistic.
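Written out as a formula, the sentence above looks roughly like this. The notation is just one way to set it down; the wording inside the fraction is taken straight from the description above.

```latex
z = \frac{\text{statistic} - \text{parameter}}{\text{standard deviation of the statistic}}
```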
For means, the statistic that you obtain is a sample mean (an "x" with a bar over it, called "x-bar").
The parameter from the null hypothesis is the hypothesized population mean mu.
The standard deviation of the statistic that you have is sigma over square root of n.
Therefore, the z-statistic that you can calculate is your test statistic. It's x bar minus mu, divided by sigma over the square root of n.
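Here is a minimal sketch of that calculation in Python. The function name z_for_mean and the sample numbers are made up for illustration; they don't come from a real data set.

```python
import math

def z_for_mean(x_bar, mu_0, sigma, n):
    """z-statistic for a mean: (x_bar - mu_0) / (sigma / sqrt(n))."""
    # Standard deviation of the sampling distribution of x bar
    standard_error = sigma / math.sqrt(n)
    return (x_bar - mu_0) / standard_error

# Hypothetical values: sample mean 52, hypothesized mean 50,
# population standard deviation 10, sample size 100
z = z_for_mean(x_bar=52.0, mu_0=50.0, sigma=10.0, n=100)
print(z)  # 2.0
```

With these numbers the standard deviation of x bar is 10 over 10, or 1, so a sample mean of 52 sits two standard deviations above the hypothesized mean.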
Meanwhile, for proportions, the statistic that we obtained from a sample is a sample proportion, p hat.
The parameter from our null hypothesis is a value, p. And the standard deviation of the p hat statistic is going to be the square root of p times q, which is 1 minus p, over n-- all of that inside the square root.
And so what we obtain is another z-statistic. The z-statistic is p hat minus p from the null hypothesis, divided by that standard deviation.
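The same idea for proportions can be sketched like this. Again, z_for_proportion and the numbers are hypothetical, chosen only so the arithmetic is easy to follow.

```python
import math

def z_for_proportion(p_hat, p_0, n):
    """z-statistic for a proportion: (p_hat - p_0) / sqrt(p_0 * q_0 / n)."""
    q_0 = 1.0 - p_0  # q is 1 minus p, using p from the null hypothesis
    standard_error = math.sqrt(p_0 * q_0 / n)
    return (p_hat - p_0) / standard_error

# Hypothetical values: sample proportion 0.55, hypothesized proportion 0.5,
# sample size 100
z = z_for_proportion(p_hat=0.55, p_0=0.5, n=100)
print(round(z, 3))  # 1.0
```

Notice that the standard deviation uses p from the null hypothesis, not p hat, because the whole calculation assumes the null hypothesis is true.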
What does this look like? Both these situations have conditions under which they're normally distributed. So you can use the normal distribution to analyze and make a decision about the null hypothesis.
The normal curve above operates under the assumption that the null hypothesis is in fact true.
If you're dealing with means, the mean is highlighted in the red box below:
The standard deviation of the sampling distribution is sigma over square root of n and is found highlighted in the red box below:
Perhaps your x bar is over to the right as indicated below. The test statistic will become a z-score. What you are going to find is what's called a p-value, the probability that you would get an x bar at least as high-- in this particular case, it's one sided-- as you got if the mean really is over here, mu.
We could do that, or if it was a two sided test, it would look like this:
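If you want to get a p-value from a z-score without a table, one way to sketch it is with Python's standard library. The function p_value and its tail argument are names invented for this illustration, and the left-tailed option is included only for completeness.

```python
from statistics import NormalDist

def p_value(z, tail="right"):
    """Area under the standard normal curve at least as extreme as z."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "right":
        return 1.0 - std_normal.cdf(z)
    if tail == "left":
        return std_normal.cdf(z)
    if tail == "two-sided":
        # Count both tails: values at least |z| away from 0 on either side
        return 2.0 * (1.0 - std_normal.cdf(abs(z)))
    raise ValueError("tail must be 'right', 'left', or 'two-sided'")

print(round(p_value(2.0, "right"), 4))      # 0.0228
print(round(p_value(2.0, "two-sided"), 4))  # 0.0455
```

So a z-score of 2 has a one-sided p-value of about 2.3%, and the two-sided p-value is double that.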
Another way to determine statistical significance not using a p-value would be with what's called a critical value. This corresponds to the number of standard deviations away from the mean that you're willing to attribute to chance.
You might say that anything within this green area here is a typical value for x bar.
You are willing to attribute any deviations from mu to chance if it's in this green region. This is the most typical 95 percent of values. If it's outside that region, it would be within the most unusual 5%.
And so you would be more willing to reject the null hypothesis in that case. A test statistic, meaning a z-statistic, that's far from 0 provides evidence against the null hypothesis.
So one way would be to say, all right, if it's further than two standard deviations, and so in the outermost 5%, I'm going to reject the null hypothesis. And if it's in the innermost 95 percent, I will fail to reject the null hypothesis.
And with two-tailed tests like the image above, the critical values are actually symmetric around the mean. That means that if you use positive 2 here, you would be using negative 2 here.
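Because the critical values are symmetric, a two-tailed rule only has to look at how far z is from 0 in either direction. Here's a small sketch of that; the function name is made up, and the default of 2 is the rough "two standard deviations" figure used above.

```python
def two_tailed_decision(z, critical_value=2.0):
    """Reject when z is beyond +critical_value or below -critical_value."""
    # abs() covers both the positive and the negative critical value
    if abs(z) > critical_value:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(two_tailed_decision(2.4))   # reject the null hypothesis
print(two_tailed_decision(-2.4))  # reject the null hypothesis
print(two_tailed_decision(1.1))   # fail to reject the null hypothesis
```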
There are some very common critical values that we use. The most common cutoff points are at 5%, 1%, and 10%. For a two-tailed test at 5%, the critical value is 1.96. I know we were saying two standard deviations on either side, but the exact value is 1.96 standard deviations away.
If you were doing a one-tailed test with 0.05 as your significance level, or a two-tailed test that rejects the null hypothesis if the statistic is among the most extreme 10% of values, you'd use a z-statistic critical value of 1.645.
If you were doing a one-tailed test and you wanted to reject the most extreme 10% of values on one side, you'd use 1.282 for your critical value, or if you wanted to use 1%, two-tailed, it would be 2.576, or one-tailed, the 1% value would be 2.326.
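You don't have to memorize these. As a sketch, the standard normal distribution in Python's standard library reproduces them; critical_value is a helper name invented for this example.

```python
from statistics import NormalDist

def critical_value(alpha, two_tailed):
    """Positive z cutoff that leaves alpha in the rejection region(s)."""
    # A two-tailed test splits alpha evenly between the two tails
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1.0 - tail_area)

for alpha in (0.10, 0.05, 0.01):
    one = critical_value(alpha, two_tailed=False)
    two = critical_value(alpha, two_tailed=True)
    print(f"alpha={alpha:.2f}  one-tailed={one:.3f}  two-tailed={two:.3f}")
# alpha=0.10  one-tailed=1.282  two-tailed=1.645
# alpha=0.05  one-tailed=1.645  two-tailed=1.960
# alpha=0.01  one-tailed=2.326  two-tailed=2.576
```

This also shows why 1.645 appears twice: a two-tailed test at 10% puts 5% in each tail, which is the same cutoff as a one-tailed test at 5%.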
When you run a hypothesis test with the critical value, you should state it as a decision rule.
So for instance, you say something like, "I will reject the null hypothesis if the test statistic z is greater than 2.33." Note that this is one-tailed, because you're saying that the rejection region is on the right side of the normal curve.
This would be a right-tailed test that says to reject the null hypothesis if the sample mean is among the highest 1% of all sample means that would occur by chance.
The decision rule, shown where the red and blue boxes overlap, is your line in the sand. For anything less than that, you will fail to reject the null hypothesis and attribute whatever difference exists from mu to chance.
For anything higher than 2.33 for a test statistic, you will reject the null hypothesis and not attribute the difference from mu to chance.
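That right-tailed decision rule can be written down as a couple of lines of code. This is only a sketch; the function name is hypothetical, and 2.33 is the cutoff from the rule stated above.

```python
def right_tailed_decision(z, critical_value=2.33):
    """Reject only when z falls to the right of the critical value."""
    if z > critical_value:
        return "reject the null hypothesis"
    return "fail to reject the null hypothesis"

print(right_tailed_decision(2.5))   # reject the null hypothesis
print(right_tailed_decision(1.8))   # fail to reject the null hypothesis
print(right_tailed_decision(-3.0))  # fail to reject the null hypothesis
```

The last call is the difference from a two-tailed rule: a z-score far below 0 is not in the rejection region, because the rejection region is only on the right side.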
We talked about test statistics, both of which were z's; p-values, which were the probabilities that you would get a statistic as extreme as what you've got by chance; and critical values, which are our lines in the sand whereby if we exceed that number with our test statistic, we'll reject the null hypothesis.
When we actually go through a hypothesis test, we convert our sample statistic obtained, which is either x bar or p hat, into a test statistic, both of which are z's. We can then use the sampling distribution, which is approximately normal, to determine if our sample statistic is unusual or not, unusually high or unusually low or just unusually different, given that the null hypothesis is in fact true.
We can decide on different critical values based on how sure we want to be that the difference actually exists-- so different levels of what we would consider unusual. Do we need it to be among the highest 1% or the highest 5%? We can make that decision. And if our test statistic exceeds the critical value, we'll reject the null hypothesis. And that's our decision rule.
Source: This work adapted from Sophia Author Jonathan Osters.
Critical value: A value that can be compared to the test statistic to decide the outcome of a hypothesis test
P-value: The probability that the test statistic is that value or more extreme in the direction of the alternative hypothesis
Test statistic: A measurement, in standardized units, of how far a sample statistic is from the assumed parameter if the null hypothesis is true