In this lesson, students will learn different methods of categorizing data in an experiment.
The two types of variables, explanatory and response, can represent a quantity that is being changed or measured. Focus on steps 5 through 8 of the experimental method for a moment.
One of the first steps in analyzing your findings is to determine what type of data you have and how to classify it. Classification allows researchers to identify different types of information.
A horse race is a good example of a situation that involves a lot of data. Each horse and rider is represented by a number, the finish order of the horses is numbered, horses may be identified as male or female, and so on. What can you tell from this information?
You can classify the data by type. For instance, the order they finish in is an ordinal measurement. In other words, it comes in order. You also know that gender has no numerical value. That makes it a nominal measurement.
Even a simple scenario such as our horse race provides a variety of measurements. Let’s compare the winnings of different horses.
This bar graph represents four hypothetical horses. For the sake of simplicity, they are called Horse A, Horse B, Horse C, and Horse D. You can see the different types of data here. That’s important, because it allows you to make a visual comparison of the horses’ winnings, and it’s relatively easy to tell which horse won more money than the others.
Let’s look at a histogram, which is a different kind of representation.
In this instance, you’re looking at the age ranges of the jockeys who rode the horses. At the low end you have 18. The histogram tells us there are 11 jockeys between the ages of 18 and 22. In the next range, there are 19 between the ages of 22 and 26. This goes all the way up to five jockeys over the age of 34.
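To make the idea of age ranges concrete, here is a minimal sketch of how a histogram groups values into bins. The ages below are invented for illustration and are not the jockey data from the lesson. The function name `bin_counts` and the convention that each range includes its lower edge but not its upper edge are assumptions.

```python
from collections import Counter

def bin_counts(values, start, width):
    """Count how many values fall in each range [low, low + width)."""
    # Integer division maps each value to the index of its bin.
    counts = Counter((v - start) // width for v in values)
    return {
        (start + k * width, start + (k + 1) * width): counts[k]
        for k in sorted(counts)
    }

# Hypothetical jockey ages, made up for this example only.
ages = [18, 19, 21, 22, 23, 25, 25, 27, 30, 35]

for (low, high), n in bin_counts(ages, start=18, width=4).items():
    print(f"{low} to {high}: {n} jockeys")
```

Each bar in a histogram corresponds to one of these ranges, and its height is the count for that range.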
The nice part about graphs is that they allow you to visually represent the data. When representing data, it often matters how that data is classified.
If a variable has a nominal scale, all it does is provide categories. Categories allow researchers to separate data based on characteristics that it might have. Such a variable is called a nominal variable.
Nominal scales do not show direction or magnitude. Direction means that data can be ordered, and magnitude means that one thing can be considered larger or smaller than another. When something does not have magnitude, then this comparison can’t be made. When something does not have direction, then the order it’s presented in does not matter.
Consider eye color. If you were at the department of motor vehicles getting a new driver’s license, you would have four choices for describing your eye color: blue, brown, green, and hazel.
These choices don’t have magnitude, because they can’t be compared to one another. Blue eyes aren’t necessarily bigger or better than brown eyes. Nor do the choices have a direction. The order they’re presented in doesn’t matter, though they are likely to be in alphabetical order. Nominal variables can also have categories that are numbers.
As an example, think of a telephone number. It’s going to have an area code. This is an example of a nominal variable that consists of a number. All numbers within that area code would start with the same three digits. The area code varies from one region to the next.
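Because nominal categories have neither direction nor magnitude, the natural way to summarize them is to count how often each category appears. The short sketch below does this for eye color. The list of responses is hypothetical and exists only to illustrate the idea.

```python
from collections import Counter

# Hypothetical eye-color responses, invented for illustration.
eye_colors = ["brown", "blue", "brown", "hazel", "green", "brown", "blue"]

# Counting is meaningful for nominal data; ordering or averaging is not.
counts = Counter(eye_colors)
most_common_color, frequency = counts.most_common(1)[0]

print(counts)
print(most_common_color, frequency)
```

Notice that nothing in the code compares one color to another. The only question asked is how many times each category occurs.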
A variable has an ordinal scale if it provides categories that can be put in a meaningful order. What this means is that for any two categories, you can decide which one comes before the other, or which is better or worse. A variable like this is called an ordinal variable. Ordinal scales show direction, because direction refers to the position of categories or numbers. They do not show magnitude, because the order alone does not tell you how far apart two categories are.
Take a look at the different classes that might exist in a high school or college. We have freshman, sophomore, junior, and senior. Those are very common categories used to separate students based on how long they have attended the school.
The class categories don’t have magnitude, but they do have direction. Freshmen come first, sophomores come second, juniors third, and seniors fourth.
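The ordering of class standings can be sketched in a few lines. The names `CLASS_ORDER`, `RANK`, and `comes_before` are assumptions introduced for this example. The rank numbers only record position, so subtracting or averaging them would not be meaningful.

```python
# The four class standings from the lesson, listed in their natural order.
CLASS_ORDER = ["freshman", "sophomore", "junior", "senior"]

# Map each category to its position (1 = first). These are ranks, not amounts.
RANK = {name: position for position, name in enumerate(CLASS_ORDER, start=1)}

def comes_before(a, b):
    """Return True if class standing a comes earlier than class standing b."""
    return RANK[a] < RANK[b]

# A hypothetical group of students, sorted by class standing.
students = ["junior", "freshman", "senior", "sophomore"]
ordered = sorted(students, key=RANK.get)

print(comes_before("freshman", "junior"))
print(ordered)
```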
For some scales of measurement, we need to consider the difference between values on the scale. A variable has an interval scale if it provides numbers, so that the difference between two values can be measured. The difference between any two values can always be determined the same way.
Consider SAT scores. The difference between a 1500 score and a 1700 score has the same meaning as the difference between a 1250 score and a 1450 score. This type of variable is called an interval variable.
With an interval variable, the number 0 does not mean that something does not exist. If somebody took the SAT and got a 0, that doesn’t mean that person didn’t take the exam; it means he or she failed to answer any questions correctly.
The fourth scale to talk about today is the ratio scale. A variable has a ratio scale if it meets the requirements of an interval scale and, in addition, a value of 0 means that something does not exist. This is called a ratio variable, and it occurs in most types of physical measurements, such as length, width, and weight.
Variables that provide a count, such as the number of apps on your phone, are also ratio variables. Weight is a great example of a ratio scale that you can see used in everyday life.
For instance, ordering a double scoop of ice cream would indicate that the cone contains twice as much by weight as a single scoop. This goes beyond what an interval scale can express, because it describes one amount as a multiple of another rather than saying only that the second cone has 10 ounces more ice cream.
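The contrast between interval and ratio scales can be illustrated with a little arithmetic. The SAT scores below come from the lesson. The scoop weight is a hypothetical value chosen only to make the calculation concrete.

```python
# Interval scale: SAT scores from the lesson. Differences are meaningful.
gap_high = 1700 - 1500
gap_low = 1450 - 1250
print(gap_high, gap_low, gap_high == gap_low)

# Ratio scale: weight in ounces. Zero ounces means no ice cream at all,
# so it also makes sense to say one amount is a multiple of another.
single_scoop_oz = 4.0  # hypothetical weight, assumed for illustration
double_scoop_oz = 2 * single_scoop_oz

difference = double_scoop_oz - single_scoop_oz  # "this many ounces more"
ratio = double_scoop_oz / single_scoop_oz       # "this many times as much"
print(difference, ratio)
```

Both scales support the subtraction. Only the ratio scale supports the division, because only there does 0 mean that nothing is present.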
The four types of data we’ve discussed can be separated into two major areas: categorical data and quantitative data. Categorical data relate to nominal and ordinal variables, while quantitative data relate to interval and ratio variables.
Think about a group of people who are divided into two categories: college graduates and non-college graduates. This is a categorical variable as well as a nominal variable, as there is no direction or magnitude associated with it.
Or consider the numbers on marathon runners. These numbers are considered categorical and could be ordinal if they were issued in the order in which the runners registered for the race. Keep in mind that just because they are numbers does not necessarily mean that they are quantitative variables.
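The four scales and the two broader data types can be summarized as a short decision procedure. This is only a sketch of the reasoning in this lesson. The function names and the three yes-or-no questions are assumptions introduced for the example.

```python
def measurement_scale(ordered, measurable_differences, true_zero):
    """Pick a scale from three yes/no questions about a variable."""
    if not ordered:
        return "nominal"   # categories only
    if not measurable_differences:
        return "ordinal"   # categories with a meaningful order
    if not true_zero:
        return "interval"  # differences are meaningful, 0 is not "nothing"
    return "ratio"         # interval, and 0 means the quantity is absent

def data_type(scale):
    """Nominal and ordinal are categorical; interval and ratio are quantitative."""
    return "categorical" if scale in ("nominal", "ordinal") else "quantitative"

# Examples drawn from the lesson, with answers to the three questions.
examples = {
    "eye color": (False, False, False),
    "class standing": (True, False, False),
    "SAT score": (True, True, False),
    "weight": (True, True, True),
}

for name, answers in examples.items():
    scale = measurement_scale(*answers)
    print(f"{name}: {scale}, {data_type(scale)}")
```

Marathon bib numbers issued in registration order would answer the questions the same way as class standing, so they come out as ordinal and categorical even though they are written as numbers.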
Source: This work is adapted from Sophia author Dan Laub.
Interval: A scale that provides numbers, where the difference between two numbers measures the magnitude of the gap between them.
Nominal: A scale that only provides categories.
Ordinal: A scale that provides categories that can be put in a meaningful order.
Ratio: An interval scale in which, additionally, the number zero means the absence of a given quantity.