Source: Tables created by the author
In this tutorial, you're going to learn about two-way tables, also called contingency tables. These are a way of showing the relationship between two categorical variables.
So suppose you had 335 students in different parts of the country. And they were asked the question, if you had to pick one thing about school that's most important to you, would you say it's getting good grades, being popular, or being good at sports?
Now suppose that the distribution looked like this. This means that 57 rural students said that grades were the most important thing, and six urban students said that being popular was the most important thing to them. So we can see the relationship between school location and goal.
Now, one of the things that we can look at when we have a two-way table, and this is one of its most important features, is the marginal distributions. We call them that because they're written, obviously, in the margins. They are the row totals and column totals for the particular categories that we have.
What this shows is that there were 149 rural students in this study, whereas there were only 35 urban students. It shows us that 168 students said that grades were the most important thing, regardless of where they live. And 98 students said that being popular was the most important thing at school.
We can also add up all these cell values and obtain the grand total. This means there were 335 students in the study, which we knew at the beginning, based on the way the problem was posed. Now the nice thing about doing this is that it allows us to answer some pretty interesting probability problems.
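As a quick check on those totals, here is a small sketch in Python. It uses only the numbers quoted in this tutorial; the variable names are made up for illustration, and it assumes the three goals (grades, popular, sports) are the only columns in the table.

```python
# Totals quoted in the tutorial (the table itself is not reproduced here).
GRAND_TOTAL = 335
goal_totals = {"grades": 168, "popular": 98, "sports": 69}  # column totals
location_totals = {"rural": 149, "urban": 35}               # row totals mentioned

# The column totals should add up to the grand total.
assert sum(goal_totals.values()) == GRAND_TOTAL

# Students in any location the tutorial does not mention by name.
other_locations = GRAND_TOTAL - sum(location_totals.values())

# Rural row: 57 said grades and 42 said sports, so the rest said popular
# (assumes the three goals are the only columns).
rural_popular = location_totals["rural"] - 57 - 42

print(other_locations)  # 151
print(rural_popular)    # 50
```

So the rural and urban rows account for 184 of the 335 students, and the marginal totals are consistent with the cell values mentioned so far.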
So for instance, what's the probability that a student says that grades are the most important thing? That would be 168 students out of 335 students, because 168 students, regardless of where they live, said that grades were the most important thing, out of the 335 in the study.
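That calculation can be sketched in a couple of lines of Python. This is just an illustration using the two totals from the tutorial; the names `grades_total` and `p_grades` are assumptions, not anything from the original table.

```python
from fractions import Fraction

grades_total = 168  # column total: students who said grades
grand_total = 335   # all students in the study

# Marginal probability: a column total divided by the grand total.
p_grades = Fraction(grades_total, grand_total)

print(p_grades)                   # 168/335
print(round(float(p_grades), 3))  # 0.501
```

In other words, about half of all the students surveyed chose grades.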
What's the probability that an urban student says being popular is the most important thing? We can restrict our view to just the 35 urban students, and say that it's six out of those 35. And then finally, what's the probability that someone who said sports was a rural student? We limit our view to just the 69 students who said sports, and see that 42 out of those 69 were in rural schools.
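These last two questions are conditional probabilities: we divide a single cell by a row total or a column total instead of the grand total. Here is a minimal sketch in Python with the numbers from the tutorial; the variable names are made up for illustration.

```python
from fractions import Fraction

# Cell counts and marginal totals quoted in the tutorial.
urban_popular = 6
urban_total = 35    # row total for urban students
rural_sports = 42
sports_total = 69   # column total for students who said sports

# P(popular | urban): restrict to the urban row.
p_popular_given_urban = Fraction(urban_popular, urban_total)

# P(rural | sports): restrict to the sports column.
p_rural_given_sports = Fraction(rural_sports, sports_total)

print(p_popular_given_urban, round(float(p_popular_given_urban), 3))  # 6/35 0.171
print(p_rural_given_sports, round(float(p_rural_given_sports), 3))    # 14/23 0.609
```

Notice that the denominator changes depending on which group we limit our view to, which is why the marginal totals are so useful.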
And so to recap, two-way tables help us to understand the relationship between two categorical variables. We can use two-way tables to answer some pretty interesting probability questions. And oftentimes, we use those marginal distributions, the row totals or column totals, or even the grand total.
And so we talked about two-way tables, which are also sometimes called contingency tables. Good luck, and we'll see you next time.
Two-way table (contingency table): A way of presenting data so that we can see the relationship between two categorical variables.