Mean, Median, and Mode

Author: Katherine Williams

Mean

Video Chapters

( 00:00 - 00:31 ) Definition of Mean

( 00:32 - 01:38 ) Definition of Summation Notation

( 01:39 - 02:33 ) Example: Calculating Mean

( 02:34 - 04:06 ) Example: Effect of Outliers on the Mean

( 04:07 - 05:53 ) Example: Calculating a Weighted Mean

Video Transcription

This tutorial talks about the mean. The mean is one of many measures of center. The mean is calculated by adding all the values and then dividing by how many values there are. Typically, when people use the word average, they're referring to the mean. However, any of the measures of center could be meant by the word average, so mean is a more precise term. Anytime you hear the word average, you should get clarification, if possible, as to whether it is, in fact, the mean that is meant.

Before we look at an example, one important thing to note is the term summation notation. It's a way of compactly writing "add up all of these values." It's also referred to as sigma notation, because we use this symbol, a kind of funny-looking E, which is the Greek letter sigma. When you're doing sigma notation, there's indexing at the bottom. It says i equals, and then it'll insert something here. And that's telling you where to start doing your counting and your inclusion in the sum.

And then it'll tell you up at the top what you go to. And n would be the last term, or the n-th term. And then beside it, you put what a typical value is going to look like. So here, we're going to use x sub i. So what this is telling us to do is start with the x1, the first term, and add up all of those terms until you get to x sub n, the last term.

So, if we wanted to, we could use the sigma notation in order to tell us to do this calculation here, in order to add up all the heights to do our mean. And when we add up all of these values here, we get 385. So our sum for this problem asking about the heights in inches of third graders is 385.

Now, once we've found the sum, in order to find the mean, we need to divide by how many terms we have. So in this case, 1, 2, 3, 4, 5, 6, 7, 8, 9, and 10. We have 10 terms, so we're going to divide by 10. And we're going to find out that our mean, in this case, is 38.5.
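The add-then-divide steps can be sketched in a few lines of Python. The list of heights is an assumption: the transcript does not read the numbers out here, so the values are taken from the ordered list read aloud in the median transcription on this page, and they do add up to the 385 quoted above.

```python
# Heights in inches of third graders.
# Assumption: values reconstructed from the ordered list in the median
# transcription; their sum matches the 385 stated in the transcript.
heights = [34, 35, 37, 38, 39, 39, 40, 40, 41, 42]

total = sum(heights)   # x_1 + x_2 + ... + x_n
count = len(heights)   # n, the number of terms
mean = total / count   # sum divided by how many values there are

print(total, count, mean)  # prints: 385 10 38.5
```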

Now, with sigma notation, you're going to see that a lot more frequently in other levels of math. So this first example is one we already did. We already found out that the mean here was 38.5. Let's examine what happens when we add two higher values in and we kind of change up our data set a little bit.

So here, we have the 50 and the 47 that are a lot higher than most of these other ones here. And 47 is not a lot higher than 42, but it's definitely larger. So we have a very similar example, except now we're including some values that may or may not be outliers but are at least larger than what we typically had before.

So let's start by adding up all the values. You add up all the values, 50 plus 47 plus 40 and so on. We end up with 472. Now, this time, we have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12 terms. So to find the mean, we're going to divide by 12. And when we do 472 divided by 12, we end up with 39.3 repeating.

Now, our mean here ends up being higher. We get a higher value for our mean. It's about 39.3. Because we have these values here, the 50 and the 47, that's starting to pull the mean up. And when we have really extreme outliers, that's going to pull the mean up even higher. So when you have outliers, the mean is not the best measure of center to use, because those high values are pulling the mean up towards them and distorting the data a bit.

One other thing that's important to look at for mean is the weighted mean. The weighted mean is used when the data is weighted, that is, when the values have different importance. In order to calculate a weighted mean, first you multiply each value by its weight, and then you add those weighted values. If the weights don't already add up to 1 (or 100%), you then divide that sum by the sum of the weights.

For example, if we have a student named Sam who's taking a class where participation is worth 10%, homework is worth 25%, quizzes are worth 50%, and tests are worth 15%, here the data is weighted. Each of the values, so a participation grade or a homework grade, has a different importance, which is indicated by this percentage here, the weight.

So in order to calculate the weighted mean, we're going to need to do a couple of different steps. First, we're going to multiply each value by its weight. So in this case here, the student, Sam, earned a 100 in participation, a 50 in homework, a 70 in quizzes, and a 93 in tests. So we're going to multiply the scores that she received by the weights.

So 0.1 times 100, 0.25 times 50, 0.5 times 70, and 0.15 times 93. And then once we have those values, we are going to calculate what that is-- so 10, 12.5, 35, 13.95-- and add them all up. And the value that we get is going to be Sam's score in the class as well as our weighted mean. So when you add 10 plus 12.5 plus 35 plus 13.95, you end up with 71.45. So this here is Sam's final score, as well as the weighted mean. This tutorial has talked about the mean.
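As a rough check of Sam's grade, here is a minimal Python sketch of the weighted mean. The dictionary names `weights_percent` and `scores` are illustrative choices, not something from the video. The weights are kept as whole percentages and the total is divided by the sum of the weights, which follows the "Weighted Mean/Average" definition under Terms to Know and gives the same result as multiplying by 0.1, 0.25, 0.5, and 0.15.

```python
# Category weights in percent and Sam's scores, as stated in the transcript.
weights_percent = {"participation": 10, "homework": 25, "quizzes": 50, "tests": 15}
scores = {"participation": 100, "homework": 50, "quizzes": 70, "tests": 93}

# Step 1: multiply each score by its weight and add the results.
weighted_sum = sum(weights_percent[name] * scores[name] for name in scores)

# Step 2: divide by the sum of the weights (100 here, since they are percentages).
weighted_mean = weighted_sum / sum(weights_percent.values())

print(weighted_mean)
```

Because the weights add up to 100%, the division in step 2 simply turns the percentages back into the decimals used in the video.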

Median

Video Chapters

( 00:00 - 00:00 ) Definition of Median

( 00:15 - 04:00 ) Examples: Finding the Median by the Cross-Off Method

( 04:01 - 04:30 ) Example: How Outliers Affect the Mean

( 04:31 - 05:38 ) Examples: Finding the Median by the Number of Elements Divided by Two Method

( 05:39 - 06:51 ) Example: Finding the Median Class

Video Transcription

This tutorial talks about the median. The median is the value in the middle of the data set: half the data is above the median, and half the data is below the median. So, again, it's the middle part. There's 50% above, 50% below. Let's look at some examples.

In this first example, it's talking about the height in inches of third graders. Now, here, data is unordered. In order to find out which number is in the middle, which number is the median, we have to start out by ordering the data. So let's take a second and do that now.

So the lowest value is 34, and then there's 35. And then 37, 38. There are two 39s, two 40s, a 41, and a 42. Now let's just take a second and look at our data set. And one thing that I always like to make sure is that my reordered set has the same number of values as the original set, because that's an easy mistake to make. And because I've lined up the numbers nicely, you can see. You can also count and see, OK, there are 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 values in this set and 10 values in this set. Either way, it's just a good double-check.

Now, there are a couple of methods of finding the median. The one that I prefer is called the cross-off method. So in order to find the middle, if you cross off one from the bottom and one from the top until you get one value left, then you know you've found the middle. And here, I've crossed off four in the bottom and four in the top. So I've crossed off the same number on both sides, but I get left with two numbers, not one. And if I kept crossing off, I'd be left with nothing, so I don't want to do that.

Instead, I need to know exactly what number comes in the middle here. And in this case, we have a 39 and a 39. And the number in the middle is still going to be 39. If you couldn't tell, you could add the two values together, 39 plus 39, or whatever those two values are, and then divide by 2. That would also help you to find the median for when you get left with two values. So in this problem here, our median is 39.

Now, if you get left with just one number, that number is obviously your median, and you don't have to do that last step. In the next example, it's talking about the heights in inches of third graders. And let's see what we get here using our cross-off method. I've already reorganized the data to go from lowest to highest, so we're all set there. So I'm just going to be crossing off.

And if you're in the middle of your cross-offs and you kind of lose your place, you can always check and count on each side. There should be the same number of values crossed off on each side. So here, where I have four crossed off on this side, I need to have four crossed off on this side, which I do. Five on this side, five on this side. And again, I got left with two values. So I need to find out what is going to be in the middle of 39 and 40, because that is going to be our median.

And in order to do that, one, you might know right away that the middle of 39 and 40 is 39.5. Otherwise, you could add. You would say 39 plus 40 divided by 2 and end up with that 39.5. So that is our median of this set here.
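The cross-off method can be sketched in Python as follows. The function name `median_by_cross_off` is made up for this illustration, and the heights are the ordered values read out earlier in this transcript. Only the first data set is used, because the second set is shown on screen and never read out in full.

```python
def median_by_cross_off(values):
    """Cross off the lowest and highest value until one or two remain."""
    remaining = sorted(values)  # the data must be ordered first
    if not remaining:
        raise ValueError("need at least one value")
    while len(remaining) > 2:
        remaining = remaining[1:-1]  # cross off one from each end
    # One value left: that is the median.
    # Two values left: add them and divide by 2.
    return sum(remaining) / len(remaining)


heights = [34, 35, 37, 38, 39, 39, 40, 40, 41, 42]
print(median_by_cross_off(heights))   # the two values left are 39 and 39
print(median_by_cross_off([39, 40]))  # the two middle values of the second example
```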

Now, one thing to notice is this set had a pretty compact set of values, 34 to 42. There's not really a big gap. There are no outliers, probably. In this set down here, we have a little bit of a gap. So I don't know for sure yet whether the 47 and 50 are outliers, but they could be. And even if we did have outliers in here, because we're just crossing off the values from the high to the low, we end up still with a pretty accurate representation of the center of our data. So for data with outliers, the median is typically the one to use.
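To see why the median holds up better than the mean, here is a small Python comparison. The value 90 is a hypothetical outlier added purely for illustration; it does not appear in the video. The base list is the ordered first data set from this transcript.

```python
from statistics import mean, median

heights = [34, 35, 37, 38, 39, 39, 40, 40, 41, 42]

# Hypothetical: one extreme value that is NOT in the original data.
with_outlier = heights + [90]

print("mean:  ", mean(heights), "->", round(mean(with_outlier), 2))
print("median:", median(heights), "->", median(with_outlier))
```

In this sketch the mean is pulled upward by the extra value, while the median stays at 39.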

Now, the other way you could find the median is by counting how many values you have. So here, we had 1, 2, 3, 4, 5, 6, 7, 8, 9, 10. And then when you divide by 2, you find out that you get 5. So we know that when we find the middle, we need to have five numbers on each side. So you can count up to the fifth number-- 1, 2, 3, 4, 5-- and then the sixth number here. Your median has to fall in between those two, the fifth and the sixth number. And again, that's 39.

Down here, we have 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12. So when we divide that by 2, we get 6. We should have six numbers on each side. So I can count up and find the 1, 2, 3, 4, 5, sixth and seventh term, and that's my middle. And in order to find the middle, we have to go, again, and do the 39 plus 40 divided by 2 to get 39.5.
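The counting approach translates fairly directly into Python. This is a sketch with an illustrative function name, `median_by_position`; the standard library's `statistics.median` is used only as a cross-check.

```python
from statistics import median


def median_by_position(values):
    """Count the values, divide by 2, and read off the middle position(s)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one value")
    half = n // 2
    if n % 2 == 1:
        return ordered[half]  # odd count: a single middle value
    # Even count: average the half-th and (half + 1)-th values (1-based),
    # for example the 5th and 6th of 10 values.
    return (ordered[half - 1] + ordered[half]) / 2


heights = [34, 35, 37, 38, 39, 39, 40, 40, 41, 42]
print(median_by_position(heights), median(heights))
```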

Something else that we need to look at in terms of medians is the median class. When our data is already in bins in a frequency table, in order to find the middle number, we need to determine what interval class it is in. So here, I need to know how many total values I have. So you can use your cumulative frequency to find that. So 3 and then plus 3 gets me 6, plus 8 gets me 14, plus 10 gets me 24, plus 5 gets me 29, plus 2 gets me 31.

Now, if I do that trick from before, 31 divided by 2 gets me 15.5. So to find the middle number, I know that I'm going to have 15 on this side and 15 on that side, and it's this term in the middle here that's going to be my median. And it's going to indicate where my median class is, too. So here, we're looking for the 16th term. And over here, we can tell the 16th term is after the 14th, so it's somewhere in here. So our median class is this 75 to 80 feet. So this has been our tutorial on medians and median classes.
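The cumulative-frequency reasoning can be sketched in Python. The frequencies 3, 3, 8, 10, 5, 2 come from the running totals read out above; the bin labels are not all stated in the transcript, so this sketch returns only the position of the bin. The function name `median_class_index` is hypothetical.

```python
from itertools import accumulate


def median_class_index(frequencies):
    """Return the 0-based index of the bin containing the middle value."""
    total = sum(frequencies)
    middle_position = (total + 1) / 2  # 16th value when there are 31
    for index, running_total in enumerate(accumulate(frequencies)):
        if running_total >= middle_position:
            return index
    raise ValueError("frequencies must contain at least one observation")


frequencies = [3, 3, 8, 10, 5, 2]
print(list(accumulate(frequencies)))    # cumulative frequencies
print(median_class_index(frequencies))  # 0-based index of the median class
```

Index 3 is the fourth bin, which the transcript identifies as 75 to 80 feet. For an even total whose two middle values land in different bins, this sketch simply picks the upper bin; the tutorial does not cover that case.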

Mode

Video Chapters

( 00:00 - 00:00 ) Definition of Mode

( 00:50 - 03:01 ) Examples: Finding the Mode of Quantitative Data

( 03:02 - 03:18 ) Example: Finding the Mode of Qualitative Data

Video Transcription

This tutorial talks about the mode. The mode is the value that occurs the most often. So the mode is the value that occurs the most. It has the highest frequency.

Now, some data sets have more than one mode. If this happens, you report both numbers, and then that is how you can indicate that the data set has more than one mode. If, however, the data set has no number that's repeated, then you would say that the set has no mode.

One thing to note is that if the set has a large high point and other local high points, those can all be referred to as modes. So for example, if you had a distribution that looked something like this, you could say that there are two modes because you could consider this local high as a secondary mode. Let's look through some examples.

Here in this first example, we're looking at the heights in inches of third graders. So we're looking to see if there are any values that are repeated. So as I look across, I haven't come across anything that's repeated yet until this 39. There's one here, and there's one here.

So there are two 39s in this data set. So the mode of this data set is going to be 39. So this is an example of when a data set has one mode. There's one number that appears most often. There's one number that's repeated.

In the second example, we're going to do the same thing. We're going to scan for numbers that are repeated. So here, this 40, this is the second 40. There is one back here as well. And then here, this 39-- there is one that was repeated here.

So in this case, we have two numbers that show up more than once-- that are repeated. And they are repeated in the same amount. So there's two 40s and two 39s. So you would indicate that this data set has a mode of 39 and 40.

Now, if we had another 40 tacked on here, and we had three 40s and two 39s, we would just have one mode. It would just be the 40 because that one appears three times, whereas 39 only appears twice. So if you have numbers that repeat in the exact same amount, then you have more than one mode. But in this case, we do have more than one mode. We have 39 and 40.

And let's look at this last example. And that should be fourth graders, here. And here, with the heights in inches of fourth graders, we're looking again to see if we have any repeated values. And we don't, so in this case, we would say that there is no mode.

Now, one other thing to think about-- and this has all been quantitative data. It's all been talking about numbers. We can describe the mode for qualitative data as well. So if we had something like this, the results of a survey where students were picking their favorite color.

It could have been red. It could have been green. It could have been blue. We can still report the mode of this data set. The mode is the one that was picked the most often, which is the blue. So the mode here is blue. This has been our tutorial on the mode.
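The rules from this tutorial (one mode, more than one mode, no mode, and modes of qualitative data) can be sketched with Python's `collections.Counter`. The function name `modes` is illustrative. The data sets shown on screen are not read out in full, so the heights below are an assumption borrowed from the median transcription, and the color survey and the no-repeat list are hypothetical.

```python
from collections import Counter


def modes(values):
    """Return every value with the highest frequency, or [] if nothing repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no value is repeated, so we report no mode
    return [value for value, count in counts.items() if count == highest]


# Assumed heights: two 39s and two 40s, so there are two modes.
heights = [34, 35, 37, 38, 39, 39, 40, 40, 41, 42]
print(modes(heights))

# Tacking on another 40 leaves a single mode.
print(modes(heights + [40]))

# Hypothetical list with no repeated values: no mode.
print(modes([50, 51, 53, 54]))

# Hypothetical favorite-color survey: the mode works for qualitative data too.
print(modes(["red", "blue", "green", "blue", "blue", "red"]))
```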

Terms to Know

Mean: The "average" value of a data set. It is obtained by dividing the sum of the values by the number of values in the set.
Median: The value that is in the "middle" of a data set when the set is arranged from least to greatest.
Median Class: The bin that contains the median value. This is the most precise measurement we can obtain when we are looking at data that have already been categorized.
Mode: The most frequently appearing number in a set of quantitative data or most frequently occurring category in a set of qualitative data.
Summation Notation: A notation that uses the Greek letter sigma to state that values should be added together.
Weighted Mean/Average: A way of calculating a mean when not all the values count for the same amount. Each value is multiplied by its weight, the products are added together, and the result is divided by the sum of the weights.

Formulas to Know

Mean: $\text{mean} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n} \sum_{i=1}^{n} x_i$