Analysis of Variance/ANOVA

Author: Ryan Backman

Video Chapters

( 00:00 - 00:40 ) Definition of Analysis of Variance (or ANOVA)

( 00:41 - 01:33 ) Definition of Test Statistic F

( 01:34 - 03:04 ) Discussion of the ANOVA Hypotheses

( 03:05 - 03:44 ) Discussion of ANOVA Conditions

( 03:45 - 04:35 ) Discussion of ANOVA Calculations

( 04:36 - 04:44 ) Discussion of ANOVA Conclusions

( 04:45 - 06:46 ) Demonstration of ANOVA Technology Use

Video Transcription

Hi. This tutorial covers the analysis of variance, also known as ANOVA. All right. So let's start with the definition. So the analysis of variance, again, also known as ANOVA, is a type of hypothesis test used for testing the equality of three or more population means by analyzing sample variances.

All right. So basically, we're going to be looking at multiple populations and doing a test on the means of those populations, seeing if they're all equal or if they seem different. And we're doing this by analyzing the variances of the samples drawn from those populations.

All right. So to do ANOVA, you need a test statistic called F. So F is a test statistic used in ANOVA calculated by dividing the variance between samples by the variance within samples. OK? So we have a specific calculation that we'll do here, but this is kind of just an abbreviated version.

So we're going to take the variance between samples. So if we have three different populations, we'll have three samples. We want to know what's the variance between those samples. We're going to divide that by the variance within the samples. So then, we'll be looking at the variance within each of those three samples, three or more samples.
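As a rough illustration of that ratio, here is a minimal Python sketch of the usual one-way ANOVA calculation. The function name `f_statistic` and the three short score lists are hypothetical, made up for this example; they are not the data used later in the tutorial.

```python
from statistics import mean

def f_statistic(samples):
    """Return F = (variance between samples) / (variance within samples)."""
    k = len(samples)                         # number of samples (groups)
    n_total = sum(len(s) for s in samples)   # total number of observations
    grand_mean = mean(x for s in samples for x in s)

    # Variance between samples: how far each sample mean sits from the
    # grand mean, weighted by sample size, over (k - 1) degrees of freedom.
    ss_between = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    ms_between = ss_between / (k - 1)

    # Variance within samples: how far each observation sits from its own
    # sample mean, pooled over all samples, over (n_total - k) degrees of freedom.
    ss_within = sum((x - mean(s)) ** 2 for s in samples for x in s)
    ms_within = ss_within / (n_total - k)

    return ms_between / ms_within

# Hypothetical scores for three small samples (illustration only).
section_a = [70, 80, 90]
section_b = [75, 85, 95]
section_c = [80, 90, 100]
print(f_statistic([section_a, section_b, section_c]))  # 0.75
```

Here the sample means (80, 85, 90) differ less than the scores within each sample do, so F comes out below 1.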

All right. Technology is almost always used to calculate F and to do other ANOVA calculations. OK? At the end of the tutorial, I'll show you a little bit of that technology, but we're not really going to go through an entire ANOVA test here.

All right. Like our other hypothesis tests, we also have the four-step procedure that we're going to use for ANOVA. And the first step is to formulate the null and alternative hypotheses and choose a significance level.

All right. So just like all of our other hypothesis tests, we are going to write hypotheses in pairs, so our null hypothesis and our alternative hypothesis. Remember, we are testing the equality of three or more population means, so that equality is our null hypothesis. OK? So what we might write is mu sub 1 equals mu sub 2 equals mu sub 3 equals dot dot dot equals mu sub n. So this would be for n population means. So if we were only testing three population means, we would stop right there, so mu 1 equals mu 2 equals mu 3. OK?

And the alternative hypothesis can simply be that the null is false. OK? So that would just mean that at least two of these population means are not equal to each other. OK? So when you're running your hypothesis test, you are trying to reject the null hypothesis, and rejecting it gives you evidence for the alternative hypothesis.
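Written in symbols, the pair of hypotheses described here might look like the following. This keeps the spoken notation, with n standing for the number of populations.

```latex
H_0:\ \mu_1 = \mu_2 = \mu_3 = \dots = \mu_n
\qquad
H_a:\ \text{not all of } \mu_1, \mu_2, \dots, \mu_n \text{ are equal}
```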

And then, generally, at step one, you're going to choose a significance level. Alpha equals 0.05 is the most common choice. Step two, you need to check your conditions. OK? So you're going to have multiple random samples here that you're using for ANOVA.

So for the conditions that you want, the first condition is that the observations are independent and come from random samples. So if you have three random samples, all of the observations must be independent in all three of those random samples. OK? The second is that the standard deviations of the multiple populations you're dealing with are about the same. And then your populations should be normally distributed. So, again, if you're testing three populations, you want three normally distributed populations.
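The tutorial does not say how close "about the same" needs to be. One common rule of thumb, assumed here and not taken from the video, is that the largest sample standard deviation should be no more than about twice the smallest. A minimal sketch of that check, with a hypothetical function name and made-up data:

```python
from statistics import stdev

def spreads_look_similar(samples, max_ratio=2.0):
    """Rule-of-thumb check: largest sample SD is at most max_ratio times the smallest.

    The threshold of 2.0 is an assumption (a common textbook guideline),
    not something stated in the tutorial.
    """
    sds = [stdev(s) for s in samples]
    if min(sds) == 0:
        # Avoid dividing by zero: spreads match only if every SD is zero.
        return max(sds) == 0
    return max(sds) / min(sds) <= max_ratio

# Hypothetical samples (illustration only).
similar = [[70, 80, 90], [75, 85, 95], [80, 90, 100]]   # SDs: 10, 10, 10
different = [[50, 75, 100], [80, 81, 82], [70, 80, 90]]  # SDs: 25, 1, 10
print(spreads_look_similar(similar))    # True
print(spreads_look_similar(different))  # False
```

This only looks at the sample standard deviations, so it is a quick screen rather than a formal test of the condition.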

All right. Step three is where you calculate your test statistic. And then we'd find a p-value. So in this case, we're going to be calculating F. And then finding a p-value for F.

All right. So note, a large F tells us that the sample means differ more than the data within the individual samples, which would be unlikely if the null hypothesis were true. And then a small test statistic or a small value of F tells us that the sample means differ less than the data within the individual samples, which is consistent with the null hypothesis being true.

OK? So if you had a small value of F, chances are you're going to fail to reject the null hypothesis. If you have a large value of F, chances are you're going to reject the null hypothesis. OK? And that's basically what you're doing in step four: decide whether or not to reject the null hypothesis, and then draw a conclusion.

All right? And then just kind of as a short example, consider the exam scores of three sections of the same math course I teach. OK? So I gave the same test to three of my different classes. And what I have here is all of the scores for those three classes in my calculator in these different lists. So I have L1, L2, L3. It was a standard 100 point test. OK?

I have all of these in the three lists. So there is a calculator function that will do the ANOVA test for you. So I'm going to go to that. So ANOVA and then I'm going to tell the calculator to do the ANOVA test with my three samples-- L1, L2, and L3. OK? I'm going to hit Enter there. OK?

And notice the first two things that come up are my F statistic and my p-value. So in this case, my F statistic was 1.3269 and my p-value is about 0.271. And basically, now that I have a p-value, what I want to do is compare that to a value of alpha. So many times we'll use alpha equals 0.05. OK?

So we can see in this case that our p-value is greater than our value of alpha. So when we do that, we fail to reject the null hypothesis. So in this case, we cannot reject the notion of all three of those population means being the same. OK? So we don't have evidence to show that we have differences in the mean exam scores of the three sections.
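The comparison just described can be sketched in a few lines of Python. The function name `anova_decision` is hypothetical; the p-value of 0.271 and alpha of 0.05 are the values from the example.

```python
def anova_decision(p_value, alpha=0.05):
    """Compare the p-value to alpha and state the step-four decision."""
    if p_value > alpha:
        # Not enough evidence that the population means differ.
        return "fail to reject the null hypothesis"
    return "reject the null hypothesis"

# Values from the exam-score example: p-value about 0.271, alpha = 0.05.
print(anova_decision(0.271))  # fail to reject the null hypothesis
```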

So really, in this case, we'd have an inconclusive ANOVA test, with no evidence to show that the three sections scored differently on the test. All right. So this has been your tutorial on ANOVA, also known as the analysis of variance. Thanks for watching.

Terms to Know

Analysis of Variance (ANOVA): A hypothesis test that allows us to compare three or more population means.
F statistic: The test statistic in an ANOVA test. It is the ratio of the variability between the samples to the variability within each sample. If the null hypothesis is true, the F statistic will probably be small.