Source: Graphs created by the author
This tutorial is going to teach you about things called z-scores. And these are values that allow you to make apples to apples comparisons between a couple of different distributions. So let's take a look. Oftentimes, we want to compare two things, but it's not really fair to compare them directly. So let's take a look at this example with Sophia.
On our first exam of the year, the class mean was 88 points, and standard deviation was five. Sophia scored a 92, so she did better than the class average. On the second test, she scored an 80, which is a lot worse than she did the first time.
But take a look: on this exam, the class mean was 74. So it must have been a much harder test, because her score went down, but the class average went way, way down. And the standard deviation, this time, was four points.
Did she do better on the first or the second test? Well, it's obvious that she scored higher on the first test, but relative to her classmates, did she do better on the first test or the second test? It's not fair to look at just those raw scores, 92 versus 80, and say, well, then she must have done better on the first test. We want to see how she did relative to her classmates. And it's z-scores that are going to allow us to make this apples-to-apples comparison.
So z-scores are sometimes called standardized scores, and they're called standardized because they measure how many standard deviations away from the mean your observation happens to be. So in the previous example, the standard deviation for the first exam was five, and Sophia scored four points higher than average. She scored a 92. The average was an 88.
So she scored higher than the average by four points. Her z-score is less than one, because she scored four points higher and the standard deviation was five points. So she's less than one standard deviation above the mean.
In fact, more specifically, her z-score is positive 4/5. Positive four means above the mean, and divided by five. It's that much of a standard deviation. It's positive 0.8 standard deviations from the mean. So how did we do that by formula?
We took the raw score, her 92, subtracted the mean of 88 and divided by the standard deviation of 5 to obtain positive 0.8. Symbolically, we can talk about the raw score, the mean, and the standard deviation. We've talked about symbols for the mean and standard deviation before. The raw score we can just call x. And so we can write symbolically z, z for z-score, is equal to x, the raw score, minus mu, the mean, divided by standard deviation, sigma.
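Written out, that formula is z = (x - mu) / sigma. Here is a minimal sketch of it in Python. The function name `z_score` and its parameter names are illustrative choices of mine, not something from the tutorial, and the check on sigma is an added safeguard.

```python
def z_score(x, mu, sigma):
    """Return how many standard deviations the raw score x lies from the mean mu."""
    if sigma <= 0:
        # A standard deviation of zero or less would make the division meaningless.
        raise ValueError("standard deviation must be positive")
    return (x - mu) / sigma


# Sophia's first exam: raw score 92, class mean 88, standard deviation 5.
first_test_z = z_score(92, 88, 5)
print(first_test_z)  # 0.8
```

The result, 0.8, matches the 4/5 of a standard deviation worked out by hand.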
So let's take a look. Let's answer the question now. We need to compare the standardized scores. We need to compare the z-score from our first test to our z-score from the second.
We already determined that our first z-score, the one from the first test, was positive 0.8. How did she do on the second test? Pause the video, and then come back to it.
What you should have come up with is that her second z-score is 80, which is her score, minus 74, which was the class mean on the second test, divided by 4, the standard deviation of the class on the second test. And that gives you positive 1.5, or 1.5 standard deviations above the mean of 74. Because positive 1.5 is larger than positive 0.8, her score on the second test was, in fact, better. Not in terms of actual points, but relative to the rest of her class.
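If you'd like to check the comparison yourself, here is a short Python sketch using the numbers from the two exams. The dictionary layout and the names `exams`, `z_by_exam`, and `better_exam` are just one way I've chosen to organize it for illustration.

```python
def z_score(x, mu, sigma):
    """Standardize a raw score: (raw score - mean) / standard deviation."""
    return (x - mu) / sigma


# Sophia's score, the class mean, and the class standard deviation on each exam.
exams = {
    "first": {"score": 92, "mean": 88, "sd": 5},
    "second": {"score": 80, "mean": 74, "sd": 4},
}

z_by_exam = {
    name: z_score(e["score"], e["mean"], e["sd"]) for name, e in exams.items()
}

# The exam with the larger z-score is the one she did better on, relative to the class.
better_exam = max(z_by_exam, key=z_by_exam.get)

print(z_by_exam)    # {'first': 0.8, 'second': 1.5}
print(better_exam)  # second
```

Even though 92 is the higher raw score, the second exam comes out ahead once both scores are put on the same scale.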
The other thing that's worth noting is that a z-score can, in fact, be negative. Let's look back to how it's calculated to see how, in fact, it can be negative. Raw score minus mean divided by standard deviation. It can be negative if you're subtracting a bigger number from a smaller number, which means if the raw score is a smaller number than the mean, then you'll end up with a value that's negative.
So if the raw score is below the mean, the z-score will be negative. If the raw score is above the mean, like it was in both tests for Sophia, the z-score is positive. And finally, if the raw score and the mean are the same, the z-score is 0. This actually brings up an interesting thing to look at.
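Those three cases can be sketched in Python like this. The function `describe_position` is a hypothetical helper, and the scores 83 and 88 are made-up values on the first exam, added only to show the negative and zero cases. Sophia's 92 is the only score here that comes from the example.

```python
def describe_position(x, mu, sigma):
    """Return the z-score and whether the raw score is above, below, or at the mean."""
    z = (x - mu) / sigma
    if z > 0:
        position = "above the mean"
    elif z < 0:
        position = "below the mean"
    else:
        position = "at the mean"
    return z, position


# First exam: class mean 88, standard deviation 5.
print(describe_position(92, 88, 5))  # (0.8, 'above the mean')   Sophia's actual score
print(describe_position(83, 88, 5))  # (-1.0, 'below the mean')  hypothetical score
print(describe_position(88, 88, 5))  # (0.0, 'at the mean')      hypothetical score
```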
Suppose I looked at men's heights. Men's heights follow a normal distribution, with a mean of 68 and a standard deviation of 3. What I have marked here are the standard deviations from the mean.
What I'd like you to do, or at least think about, is converting this 59, and 62, and 65, et cetera into z-scores. What would that normal distribution look like then? Pause the video and think it over.
What you should have come up with is this: because these marks are whole numbers of standard deviations away from the mean, and that's what z-scores measure, the z-scores are whole numbers too. The 65 is one standard deviation below the mean, so it gets a z-score of exactly negative 1. The 62 is two standard deviations below the mean, and so it gets a z-score of negative 2.
The result being, we can take the normal distribution centered at 68, with a standard deviation of 3, and make it look like this. These are the standardized men's heights. It goes from having a mean of 68 and a standard deviation of 3 to having a mean of 0 and a standard deviation of 1.
That's kind of interesting. This normal distribution of z-scores is called the standard normal distribution. Standard, because it's the normal distribution of standardized scores, that is, of z-scores.
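Here is a small Python sketch of that conversion for the marks on the height graph. The 59, 62, and 65 are named in the tutorial. I'm assuming the remaining marks are 68, 71, 74, and 77, continuing in steps of one standard deviation. The variable names are my own.

```python
mean_height = 68  # mean of men's heights from the example
sd_height = 3     # standard deviation from the example

# Marks on the horizontal axis; 71, 74, and 77 are assumed by symmetry.
marks = [59, 62, 65, 68, 71, 74, 77]

# Standardize each mark: subtract the mean, then divide by the standard deviation.
z_marks = [(h - mean_height) / sd_height for h in marks]

print(z_marks)  # [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
```

The mean, 68, lands on 0, and each step of 3 in height becomes a step of 1 in z-score, which is why the standardized curve has a mean of 0 and a standard deviation of 1.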
And so to recap, standardized scores allow us to make apples to apples comparisons of scores from one distribution to scores from another distribution. And they measure how many standard deviations above or below the mean you are. A point that's further above the mean will have a higher z-score than a point that's closer to the mean. A point above the mean will have a positive z-score. A point below the mean will have a negative z-score.
Good luck, and we'll see you next time.