Source: Image created by Joseph Gearin
In this tutorial, you're going to learn about t-tests. So let's take a look. In a z-test for means, we found that the test statistic was this, z is equal to the sample mean minus the hypothesized population mean over the standard deviation of the population divided by the square root of sample size. However, the z-statistic was based on the fact that the population standard deviation was known. And if it's not known, we need a new statistic. So we're going to use our sample standard deviation, s.
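Written out, the two test statistics described above look like this. The symbols are notation assumed here, since the tutorial only describes the formulas in words: x-bar is the sample mean, mu-zero is the hypothesized population mean, sigma is the population standard deviation, s is the sample standard deviation, and n is the sample size.

```latex
z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}
\qquad\text{becomes}\qquad
t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}
```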
This s over the square root of n value, replacing the sigma over square root of n value, is called the standard error, just as it was in the z-statistic. There's only one problem with using the sample standard deviation in place of the population standard deviation: the value of s can vary a great deal from sample to sample. Sigma is fixed, so we can base our normal distribution on it.
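If you'd like to see the standard error and the t-statistic computed, here is a minimal sketch. The function name t_statistic and the weights in the list are hypothetical, made up for illustration; they are not the data from this tutorial.

```python
import math
import statistics

def t_statistic(sample, mu0):
    """Return (t, standard_error) for a one-sample test of H0: mean == mu0."""
    n = len(sample)
    xbar = statistics.mean(sample)
    s = statistics.stdev(sample)      # sample standard deviation (divides by n - 1)
    se = s / math.sqrt(n)             # standard error: s over the square root of n
    return (xbar - mu0) / se, se

# Hypothetical weights in grams, NOT the data from the tutorial's example.
weights = [48.1, 47.6, 48.4, 47.9, 48.8, 47.5, 48.2]
t, se = t_statistic(weights, mu0=47.9)
print(f"standard error = {se:.3f}, t = {t:.3f}")
```

The only change from a z-statistic is that s is estimated from the sample instead of sigma being supplied.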
The sample standard deviation is more variable than the population standard deviation and much more variable for small samples than for large samples. For large samples, s and sigma are very close. But with small samples particularly, the value of s can vary wildly.
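A quick simulation can make that variability concrete. This is only a sketch under assumptions not in the tutorial: the population is normal with sigma equal to 1, and the sample sizes 5 and 100, the 2000 trials, and the function name spread_of_s are all chosen just for illustration.

```python
import random
import statistics

def spread_of_s(n, trials=2000, sigma=1.0, seed=0):
    """Draw `trials` normal samples of size n; return (smallest s, largest s)."""
    rng = random.Random(seed)         # fixed seed so the run is repeatable
    values = [
        statistics.stdev([rng.gauss(0.0, sigma) for _ in range(n)])
        for _ in range(trials)
    ]
    return min(values), max(values)

for n in (5, 100):
    low, high = spread_of_s(n)
    print(f"n = {n:3d}: s ranged from {low:.2f} to {high:.2f} (sigma = 1)")
```

The range of s for the small samples should come out far wider than the range for the large samples.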
So we need a new distribution in order to account for this increased variability of the standard deviation. Because s is so variable, it creates a distribution of test statistics much like the normal distribution. The only difference is this is a more heavy-tailed distribution. If we used the normal distribution, it would underestimate the proportion of extreme values in the sampling distribution. This distribution is called Student's t-distribution, or sometimes just the t-distribution.
The t-distribution is actually a family of distributions. They all are a little bit shorter than the standard normal distribution and a little heavier on the tails. But as the sample size gets larger, the t-distribution does get close to the normal distribution. So it doesn't diminish as quickly in the tails when the sample size is small, but gets very close to the normal distribution when n is large. And we're going to calculate t-statistics much like we calculated z-statistics.
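To see "shorter in the middle, heavier in the tails" in numbers, here is a small sketch that compares curve heights at 0 and at 3. It uses the standard density formulas for the normal and t-distributions; the degrees of freedom shown (2, 13, and 100) are picked only for illustration.

```python
import math

def normal_pdf(x):
    """Height of the standard normal curve at x."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)

def t_pdf(x, df):
    """Height of the t-distribution curve with `df` degrees of freedom at x."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

print(f"{'curve':>10} {'height at 0':>12} {'height at 3':>12}")
for df in (2, 13, 100):
    label = "t, df=" + str(df)
    print(f"{label:>10} {t_pdf(0, df):12.4f} {t_pdf(3, df):12.4f}")
print(f"{'normal':>10} {normal_pdf(0):12.4f} {normal_pdf(3):12.4f}")
```

As the degrees of freedom grow, both columns should move toward the normal curve's values.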
Now you might recall that when running a hypothesis test, there are four parts. First, state the null and alternative hypotheses. Then, check the conditions necessary for inference. Third, calculate the test statistic and its p-value. And then fourth, you have to compare your test statistic to your chosen critical value or your p-value to the significance level, and then based on how those compare, make a decision about the null hypothesis and a conclusion in the context of the problem. You'll see that this fourth part is, in fact, three separate things rolled into one: the comparison, the decision, and the conclusion.
The only difference between a z-test for means and a t-test for means is the test statistic is going to be a t-statistic instead of a z-statistic. And because we're using the t-distribution instead of the z-distribution, we're going to obtain a different p-value. And we need a new table, not the standard normal table for that.
So here is the t-distribution table. Notice it's actually one-sided: it's the upper tail that these probabilities describe. So these are your potential p-values, based on these values of t down here.
The one new wrinkle that we're adding for a t-distribution is this value df. It's called the degrees of freedom. And for our purposes, it's just going to be the sample size minus 1. And so you find your t-statistic in whatever row your degrees of freedom is. And if it's between two values, that means your p-value is between these two p-values.
We'll go through an example. So the M&Ms in a bag are supposed to weigh collectively 47.9 grams. Suppose that I inspected 14 bags and got this distribution. Assuming the distribution of bag weights is approximately normal, is this evidence that the bags don't contain the amount of candy that they say that they do?
First, we're going to state the null and alternative hypotheses. The null is that the mean is 47.9 grams. The alternative is that the mean is not 47.9 grams. Our alpha level is going to be 0.05, which means that if the p-value is less than 0.05, we reject the null hypothesis.
Moving on, we're going to check the necessary conditions. First, randomness: we need to make sure that the sample was collected in a random way, so how were the data collected? Second, independence: the observations need to be independent, and we're going to verify that by showing that the population is at least 10 times the sample size. And third, normality: is the sampling distribution approximately normal? We can verify it with the central limit theorem, which says that it will be approximately normal for most distributions if the sample size is 30 or larger, or by knowing that the parent distribution is approximately normal.
Now verifying each of those, it says in the problem that the bags were randomly selected. Also, we're going to go ahead and assume, which makes sense, that there are at least 140 bags of M&Ms. That's 10 times as large as the 14 bags in our sample. And finally, it does say in the problem that the distribution of bag weights is approximately normal, and so normality will be verified for our sampling distribution.
Now we're going to calculate the test statistic and the p-value. By plugging in all the numbers that we have, we obtain a value of 1.06. That's a t-statistic of positive 1.06. Now where exactly is that? We need to calculate the probability that we get a t-statistic of 1.06 or larger. I've highlighted this row, this 13df row, because our sample size was 14 and our degrees of freedom is 14 minus 1, 13. So we're going to look in this row. And I realize the text is small. But we're going to look for 1.06.
Now, in all likelihood, it's not one of the values listed in the row here. But it's between two values. What we see is that it's between the 0.870 and the 1.079. The tail probabilities at the top of those two columns are 0.20 and 0.15, so the one-tailed p-value is somewhere between 0.15 and 0.20.
However, one additional wrinkle is that our particular problem was a two-sided problem. So we have to double our p-value. Doubling the one-tailed range of 0.15 to 0.20, our p-value will be some number between 0.30 and 0.40. Now the thing about this between 0.30 and 0.40 business is that we can, in fact, use technology to nail down the p-value more exactly. We don't have to use this table, although you can still use the table to answer the question about the null hypothesis.
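Here is one way the table lookup and the "use technology" step might look, as a sketch rather than the method used in the tutorial. The critical values are the 13 degrees of freedom row of a standard t-table, assumed to match the table shown on screen. The function names are hypothetical, and the exact tail area is approximated with Simpson's rule instead of a statistics library.

```python
import math

# Upper-tail probabilities and the df = 13 row of a standard t-table
# (assumed to match the table shown in the tutorial).
TAIL_PROBS = [0.25, 0.20, 0.15, 0.10, 0.05, 0.025, 0.01, 0.005]
DF13_ROW = [0.694, 0.870, 1.079, 1.350, 1.771, 2.160, 2.650, 3.012]

def bracket_one_tail_p(t, row=DF13_ROW, probs=TAIL_PROBS):
    """Return (low, high) bounds on the upper-tail p-value for t >= 0."""
    if t < row[0]:
        return probs[0], 0.5
    for i in range(len(row) - 1):
        if row[i] <= t < row[i + 1]:
            return probs[i + 1], probs[i]
    return 0.0, probs[-1]

def t_pdf(x, df):
    """Height of the t-distribution curve with `df` degrees of freedom at x."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + x * x / df))

def upper_tail(t, df, steps=1000):
    """P(T >= t) for t >= 0: one half minus the Simpson's-rule area from 0 to t."""
    h = t / steps                     # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

low, high = bracket_one_tail_p(1.06)
print(f"table: two-sided p-value is between {2 * low:.2f} and {2 * high:.2f}")
print(f"numerical: two-sided p-value is about {2 * upper_tail(1.06, 13):.3f}")
```

The numerical value should land inside the range read from the table, which is a useful check on both.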
So part four, compare our p-value to our significance level. We don't know exactly what our p-value is, but we know that it's within the range of 0.3 to 0.4. So we're going to do these three parts, and I'll show them in these colors here. Our p-value is between 0.3 and 0.4, and any number in this range is greater than 0.05, so we're going to fail to reject the null hypothesis. There's our decision based on how they compare. And finally, the conclusion is that there's not sufficient evidence to conclude that M&M bags are being filled to a mean other than 47.9 grams.
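The comparison and decision can be sketched in a few lines, even when the table only gives a range for the p-value. The function name decide and the wording of its return values are hypothetical, chosen for illustration.

```python
def decide(p_low, p_high, alpha=0.05):
    """Decide about the null hypothesis when the p-value lies in [p_low, p_high]."""
    if p_low >= alpha:
        # Every possible p-value is at least alpha.
        return "fail to reject the null hypothesis"
    if p_high < alpha:
        # Every possible p-value is below alpha.
        return "reject the null hypothesis"
    # The range straddles alpha, so the table alone cannot settle it.
    return "inconclusive from the table; compute the p-value more exactly"

print(decide(0.30, 0.40))
```

For the range in this example, 0.30 to 0.40, every value is above 0.05, so the sketch returns the same decision we reached by hand.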
So to recap, in cases where the population standard deviation is not known (this is almost always the case, by the way), we should use the t-distribution to account for the additional variability introduced by using the sample standard deviation in the test statistic. The steps in the hypothesis test are the same as they were with a z-test: first stating the null and alternative hypotheses, then stating and verifying the conditions of the test, calculating the test statistic and the p-value, and then finally, comparing the p-value to alpha and making a decision about the null hypothesis. So we talked about a t-test for population means. Good luck, and we'll see you next time.