Conditions for Z-Tests and T-Tests


This tutorial will show you how to address and check the conditions for both z-tests and t-tests. You’ll learn about:

- Z-tests and T-tests Conditions

**Z-tests and t-tests** share the same three conditions, with one difference. You may recall that a z-test requires the population standard deviation to be known, while a t-test is used when the population standard deviation is unknown.

**t-test conditions:** The data were collected in a random way, each observation is independent of the others, and the sampling distribution is normal or approximately normal.

**z-test conditions:** The data were collected in a random way, each observation is independent of the others, the sampling distribution is normal or approximately normal, and the population standard deviation is known.
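
To make the difference between the two tests concrete, here is a minimal Python sketch. It uses the standard textbook test statistics for a population mean, which this tutorial does not state itself, so treat the formulas and the function name `mean_test_statistic` as illustrative assumptions. If the population standard deviation `sigma` is supplied, the sketch computes a z statistic; otherwise it falls back to the sample standard deviation and computes a t statistic.

```python
import math
import statistics


def mean_test_statistic(sample, mu0, sigma=None):
    """Return ("z", statistic) if sigma is known, else ("t", statistic).

    sample: list of observations
    mu0:    hypothesized population mean
    sigma:  population standard deviation, or None if unknown
    """
    n = len(sample)
    xbar = statistics.mean(sample)
    if sigma is not None:
        # Population standard deviation known: z statistic.
        return "z", (xbar - mu0) / (sigma / math.sqrt(n))
    # Population standard deviation unknown: estimate it from the sample
    # (n - 1 in the denominator) and use a t statistic.
    s = statistics.stdev(sample)
    return "t", (xbar - mu0) / (s / math.sqrt(n))


# Hypothetical data, for illustration only.
data = [1, 2, 3, 4, 5]
print(mean_test_statistic(data, mu0=2))           # t statistic
print(mean_test_statistic(data, mu0=2, sigma=1))  # z statistic
```

Either statistic is only meaningful once the conditions described in this tutorial have been checked.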

When performing a hypothesis test for a population mean, there are three conditions to check.

- First, how were the data collected? Were they collected in some random way? A simple random sample is the gold standard.
- Second, is each observation independent of the others? You will verify this with a simple numerical check.
- And third, is the sampling distribution approximately normal? Again, you can verify this in a number of ways.

1. First, were the data collected in some random way? The purpose is to make sure there isn't any bias in the sample. Ideally, you want a simple random sample from the population, or data that you can reasonably treat as a simple random sample. Cluster samples are typically okay, as are stratified random samples. The randomness is what matters most.

2. Second, the independence condition. You want to make sure that each observation doesn't affect any other observation. There are two ways to satisfy it:

- One, which isn't very common, is sampling with replacement. This means that when you take a person or an item out of the population, you put them back and can sample them again. That's not typically how sampling is done. Normally, when you sample somebody, you don't put them back, and you can't sample them again. For instance, if you're taking a political poll, you wouldn't want someone's opinion counted twice.
- The other is sampling without replacement from a large population. Here you have to check that the sample is less than 10% of the population. If you multiply your sample size by 10, the population has to be at least that big in order to say that the observations are approximately independent of each other.

3. Finally, is the sampling distribution approximately normal? The distribution of sample means (the sampling distribution) will be nearly normal in two cases:

- One is if the sample size is 30 or above. The central limit theorem says that the sampling distribution of sample means will be approximately normal when the sample size is large. For most distributions, a sample size of 30 or larger is large enough.
- The other is if the parent distribution (the distribution of values from which we got our data) is normal. Then the sampling distribution of sample means will also be normal, regardless of the sample size. There are two ways to verify that:
  - If you're lucky, it might be stated within the context of the problem. In real life, though, it would be hard to know that for sure.
  - If it isn't stated, then you actually have to look at your data. Graph the data in a histogram or a dot plot and look for approximate symmetry, a mound shape, and a lack of outliers.
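
The independence and normality checks above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (`ten_percent_condition`, `normality_route`, `text_dot_plot`) and the sample values are hypothetical, and the dot plot is a rough text stand-in for a real histogram that you would still have to judge by eye.

```python
from collections import Counter


def ten_percent_condition(sample_size, population_size):
    """Sampling without replacement: the population should be
    at least 10 times the sample size."""
    return population_size >= 10 * sample_size


def normality_route(sample_size, parent_stated_normal=False):
    """Say which justification for approximate normality applies."""
    if sample_size >= 30:
        return "central limit theorem (n >= 30)"
    if parent_stated_normal:
        return "parent distribution stated to be normal"
    return "graph the data and inspect its shape"


def text_dot_plot(values, bin_width):
    """Return one text line per bin, with one '*' per observation."""
    counts = Counter((v // bin_width) * bin_width for v in values)
    lines = []
    start = min(counts)
    while start <= max(counts):
        lines.append(f"{start:>6} | " + "*" * counts.get(start, 0))
        start += bin_width
    return lines


print(ten_percent_condition(12, 120))  # True: 120 >= 10 * 12
print(normality_route(12))             # small sample, nothing stated

# Hypothetical values, for illustration only.
for line in text_dot_plot([1, 2, 2, 3, 3, 3, 4, 4, 5], bin_width=1):
    print(line)
```

In the plot, look for the same features described above: approximate symmetry, a single mound, and no outliers.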

Many customers pay attention to the nutritional contents on packaged foods, so it's important that the information be accurate. Here is a list of the calorie contents of some frozen dinners:

The reported calorie content was 240. Before testing whether the data support or refute the idea that the calorie content is in fact 240, check whether the conditions for inference are met.

- Randomness: the context says that it was a random sample. So the randomness condition is met.
- Independence: is the population at least 10 times the sample size? The company must have made at least 120 frozen dinners in this line (10 times the sample size), so independence seems reasonable to assume.
- Sampling distribution: is the sample size at least 30? In this problem, it isn't. Is the parent distribution normal? The problem didn't say. So you have to graph the data and examine its shape.

The graph of these data is approximately symmetric and mound-shaped. So it is reasonable to assume that the data could have come from a normal distribution.

So the three conditions have been checked and verified.

Renee wants to know the average weight of women at her health club. She stands at the door and asks the first 20 women who enter if they'll step on the scale. Here are the weights of the women who said yes:

Are the conditions for inference met here?

The data were not collected randomly. Renee stood at the door and asked the first 20 women who entered if they would step on the scale. Maybe not all 20 actually did, so the data may come only from the women who said yes. Ultimately, the sample was a convenience sample, not a random sample, and it probably suffers from voluntary response bias. Women who are heavier might be more self-conscious about stepping on the scale to give Renee their weight. So the sample may be biased in a way that underestimates the average weight of women at the health club.

You can't do a test of significance. You can't do a confidence interval. There's no rescuing poorly collected data. So you don't need to check the remaining conditions, because inference will not be appropriate for the data, even if the other two conditions were met.

The conditions for running a hypothesis test (a z-test or a t-test) are as follows:

The randomness condition. How were the data collected? They should be collected in some random way. If they weren't, you can't proceed.

Independence. Is the population large in comparison to your sample, at least 10 times your sample size?

And normality. Remember, there are three ways to verify this. You can cite the central limit theorem if there are at least 30 observations in your sample. Otherwise, there are two ways to verify that the parent distribution is normal. Either the problem will say so, if you get lucky, or you have to actually graph the data and look for a mound-shaped, approximately symmetric, single-peaked distribution with no outliers.

And so those are the **z-test conditions and t-test conditions.** Remember, there was one additional condition for the z-test that required that we know the population's standard deviation.
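
As a recap, the whole checklist can be written as one small decision function. This is a sketch, not a definitive procedure: the name `check_conditions` and its parameters are hypothetical, `looks_normal` stands for your own judgment after either reading the problem or graphing the data, and treating every failed condition as a hard stop is a simplification made here. The tutorial only says explicitly that you cannot proceed when randomness fails.

```python
def check_conditions(random_sample, sample_size, population_size=None,
                     looks_normal=False, sigma_known=False):
    """Walk through the three conditions in order and name the test to use."""
    # 1. Randomness: no rescuing poorly collected data.
    if not random_sample:
        return "stop: data were not collected randomly"
    # 2. Independence: population at least 10 times the sample size.
    if population_size is None or population_size < 10 * sample_size:
        return "stop: cannot confirm independence"
    # 3. Normality: n >= 30, or a parent distribution judged to be normal.
    if sample_size < 30 and not looks_normal:
        return "stop: normality is not supported"
    # The extra z-test condition: population standard deviation known.
    return "z-test" if sigma_known else "t-test"


# Frozen dinners: random sample, at least 120 dinners made, graph looked
# mound-shaped. The sample size of 12 is inferred from the 120 figure, and
# sigma is assumed to be unknown.
print(check_conditions(True, 12, 120, looks_normal=True))  # t-test

# Renee's health club: a convenience sample, so the check stops at step 1.
print(check_conditions(False, 20))
```

The order of the checks matters: as the health club example shows, a failed randomness condition makes the remaining checks unnecessary.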

Thank you and good luck!

Source: THIS WORK IS ADAPTED FROM SOPHIA AUTHOR JONATHAN OSTERS