Source: Scale; Public Domain http://pixabay.com/en/icon-drawing-cartoon-scale-free-37772/ Heart; Creative Commons: http://www.sterling-wellness.com/american-heart-month/ Male Symbol; Public Domain: http://www.clker.com/clipart-6434.html Female Symbol; Public Domain: http://www.clker.com/clipart-9417.html
In this tutorial, you're going to learn about the null and alternative hypotheses of a hypothesis test when there are two samples involved instead of just one. So in a situation where there are in fact two samples, there are two possible scenarios that might actually be happening with your data. One is that they're independent samples, where the values from one sample don't really match or pair or affect any of the values from the other sample.
The other scenario is that the samples are in fact paired together, where the first data point from the first sample actually does have a meaningful pairing with the first data point from the second sample. We call that paired data. Each pair essentially constitutes one data point instead of two different data points.
So if the data are in fact from independent samples, then the null hypothesis states that the population mean for population one and the population mean for population two are the same. The alternative hypothesis is that the means of the two populations are not the same: either that the mean of population one is smaller than that of population two, or bigger than the mean of population two, or simply not equal to the mean of population two. But notice there are two different populations. And so there are two different population means in each of these cases.
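Written in symbols, the independent-samples hypotheses described above look like the following. The notation is an assumption for illustration (the transcript does not show the on-screen symbols): \mu_1 and \mu_2 stand for the means of population one and population two, and only one of the three alternatives is chosen for a given test.

```latex
\begin{aligned}
H_0 &: \mu_1 = \mu_2 \\
H_a &: \mu_1 < \mu_2 \quad\text{or}\quad \mu_1 > \mu_2 \quad\text{or}\quad \mu_1 \neq \mu_2
\end{aligned}
```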
So an example of when you would use something like this is if you were doing a controlled experiment, where you would want to compare the mean weight loss between the two populations of the treatment group and the control group. Now what if the data are paired together? Well, the null hypothesis in that case would simply state that the true mean of the differences between those pairs is zero. So the mean of the difference between the paired data is zero. Whereas the alternative hypothesis would say that the true mean difference is actually greater than, less than, or not equal to zero.
So for instance, we might look at the true mean of the difference between husbands' heights and wives' heights. You could use this procedure to determine if husbands were significantly taller than their wives. Each husband-wife pair would constitute one data point. So if you had a whole list of husbands and that was paired with a whole list of wives, you would be really interested in simply the difference between the two heights, and not the actual list of the husbands and the list of wives.
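In symbols, the paired-data hypotheses can be sketched as follows. The notation is an assumption for illustration: \mu_d stands for the true mean of the within-pair differences (for example, husband's height minus wife's height), and only one of the three alternatives is chosen for a given test.

```latex
\begin{aligned}
H_0 &: \mu_d = 0 \\
H_a &: \mu_d > 0 \quad\text{or}\quad \mu_d < 0 \quad\text{or}\quad \mu_d \neq 0
\end{aligned}
```

For the question of whether husbands are taller than their wives, the alternative would be \mu_d > 0 when the difference is taken as husband minus wife.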
Now, here's the thing. With paired data, the remaining work for the matched pairs scenario (that is, the conditions, the mechanics of calculating the t-statistic and everything, and the decision and conclusion) is the same as it is for a one-sample scenario. So if it's matched pairs, like the husband and wife heights example, you can treat it like a one-sample scenario.
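As a minimal sketch of that idea, the Python below reduces paired data to a single list of differences and then applies an ordinary one-sample t-statistic to it. The function names and the height values are hypothetical, made up purely for illustration; in practice the statistic and p-value would come from statistical software.

```python
import math
import statistics


def one_sample_t(values, mu0=0.0):
    """t-statistic for the null hypothesis that the population mean is mu0."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (divides by n - 1)
    return (mean - mu0) / (sd / math.sqrt(n))


def paired_t(first, second):
    """Treat each pair as one data point: its difference (first - second)."""
    if len(first) != len(second):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(first, second)]
    # From here on this is exactly the one-sample procedure with mu0 = 0.
    return one_sample_t(diffs, 0.0), len(diffs) - 1


# Hypothetical heights in inches; position i in each list is one couple.
husbands = [70, 72, 68, 74, 71]
wives = [65, 66, 67, 68, 64]

t_stat, df = paired_t(husbands, wives)
print(f"t = {t_stat:.3f} with {df} degrees of freedom")
```

Note that the two original lists never enter the calculation separately; only the list of differences does, which is why the paired case behaves like a one-sample test.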
If the samples are independent and it is a two-sample scenario like our weight loss example, then we have to resort to more complicated formulas for the mechanics here. There are more conditions that need to be verified. The mechanics, the formulas, are more complicated. And the decision and conclusion do end up following from the fact that there are two samples.
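To give a feel for why the mechanics are more involved, here is a rough sketch of one common two-sample approach, Welch's unequal-variance t-statistic. The tutorial does not say which two-sample formula it has in mind, so the choice of Welch's method is an assumption, and the function name and weight-loss values are hypothetical.

```python
import math
import statistics


def welch_t(sample1, sample2):
    """Welch's t-statistic and approximate degrees of freedom (assumed method)."""
    n1, n2 = len(sample1), len(sample2)
    v1 = statistics.variance(sample1)  # sample variance (divides by n - 1)
    v2 = statistics.variance(sample2)
    # Each sample contributes its own term to the standard error.
    se_sq = v1 / n1 + v2 / n2
    t = (statistics.mean(sample1) - statistics.mean(sample2)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation; usually not a whole number.
    df = se_sq ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    return t, df


# Hypothetical weight loss in pounds; the groups are unrelated and unequal in size.
treatment = [8, 10, 12, 9, 11]
control = [4, 6, 5, 7, 3, 5]

t_stat, df = welch_t(treatment, control)
print(f"t = {t_stat:.3f} with about {df:.1f} degrees of freedom")
```

Unlike the paired case, the two lists do not need to be the same length, and both sample variances and both sample sizes appear in the formula.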
And so it's a little bit more complicated. We're not going to go all through it. But a lot of the work is different than it is in a one-sample scenario. So it's important to know what the differences are between a matched pairs scenario, which is essentially a glorified one-sample scenario, versus a two-sample scenario, which is more complicated.
Now, the statistic and p-value here are almost always calculated using technology. And so to recap, when examining an experiment or an observational study that has two groups, we should decide if they're independent groups, where the values from the one group aren't affected by the values from the other group, or if they're paired, where the first data point in the first list and the first data point in the second list actually go together.
If the data are paired, then the procedure is essentially the same as a one-sample scenario. If not, it's a little bit more complicated. So we talked about the null hypothesis with two samples, and the alternative hypothesis with two samples, and how they differ if they're independent samples versus if they're paired. Good luck and we'll see you next time.
Overview
(0:00-1:04) When there are two groups, they may be independent or paired
(1:05-2:13) Null and Alternative Hypotheses for independent samples
(2:14-3:25) Null and Alternative Hypotheses for paired samples
(3:26-4:00) Scenarios that have paired data follow one-sample procedures
(4:01-5:07) The mechanics of a hypothesis test are more complicated if the two samples are independent
(5:08-6:00) Recap
Alternative Hypothesis
A claim that there is a difference in a population parameter between two samples. The alternative hypothesis is always an inequality statement.
Null Hypothesis
A claim about a population difference that serves as the starting assumption for a hypothesis test. The null hypothesis usually tests that the difference in a certain parameter between two samples, such as a mean or standard deviation, is equal to zero.