This tutorial covers establishing causality. You will learn about:
- Levels of Confidence
You might recall that causality is a kind of relationship between variables:
a cause-and-effect relationship between two variables.
Sometimes you're trying to determine whether two variables are correlated because of cause and effect. The best way to do that is with a controlled experiment, but sometimes you cannot do a controlled experiment; perhaps you have to do an observational study due to ethical or practical concerns. How can you prove cause and effect under those circumstances?
It's still possible, though very difficult, to prove cause and effect with a study that isn't an experiment. The study will need to meet these five criteria:
- First, you need consistency.
- Does the association remain even when other variables are allowed to vary?
- So does this work across different races, across different genders?
- Do high amounts of the alleged cause lead to high or low amounts of the alleged effect?
- Second, you need something similar to a control. You aren't using an actual control group, but you're looking for what an experiment would have shown. It's as if a group of volunteers had been split into a treatment group and a control group: you aren't assigning people to groups, but you're looking for the same kind of comparison.
- Is the effect absent when the cause is absent?
- Is the effect present when the cause is present?
- Is the effect present when the cause is present, and the effect absent when the cause is absent?
- Third, you're looking for correlation.
- Does an increase in the cause correspond to an increase or, hypothetically, a decrease in the effect?
- Suppose we're trying to determine whether or not aspirin cures headaches. Does an increase in the amount of aspirin correspond to a decrease in the amount of pain?
- Fourth, you need consideration of alternatives.
- Might there be something else, perhaps some lurking variable, that you're missing? Maybe there is a common thread among the people who are doing this thing that isn't shared by the people who aren't.
- Might there be other plausible causes?
- Fifth, you need a connection.
- What physically might create this effect?
- What is the physical mechanism behind the effect, and how could the cause plausibly lead to it?
These are pretty strict requirements. They are necessary in order to determine, without an experiment, whether two correlated variables have a cause-and-effect relationship.
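To make the third criterion concrete, here is a minimal sketch of how you might measure whether more of the alleged cause goes along with less of the alleged effect. It computes a Pearson correlation coefficient by hand. The function name `pearson_r` and the aspirin and pain numbers are assumptions invented for illustration; they do not come from any study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a sample is constant")
    return cov / sqrt(var_x * var_y)


# HYPOTHETICAL data, made up purely for illustration (not from a real study).
aspirin_mg = [0, 100, 200, 300, 400]   # amount of the alleged cause
pain_score = [8, 7, 5, 4, 2]           # amount of the alleged effect

r = pearson_r(aspirin_mg, pain_score)
print(f"r = {r:.2f}")  # a value near -1 means more aspirin goes with less pain
```

A strongly negative or positive value only satisfies the correlation criterion. The other four criteria still have to be checked separately.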
Consider the claim, "eating a lot of carbohydrates makes you gain weight." Go through those criteria one by one:
- Consistency: is this consistent across different races, different genders, etc.?
- More or less, this claim is consistent.
- Control: Is the effect present when the cause is present? Do people who eat lots of carbohydrates gain weight?
- You can see a lot of people that eat lots of carbohydrates and don't gain a lot of weight.
- Correlation: does an increase in the amount of carbohydrates increase the amount of weight gained?
- All other things being constant, yes, more or less.
- Consideration of alternatives: is there anything else besides eating lots of carbohydrates that might make people gain weight?
- It's possible that people that eat lots of carbohydrates don't exercise as much as people that eat fewer carbohydrates. Maybe that's what's making those people gain weight.
- We've considered alternatives and found them to be plausible, so we can't say that this is the only cause.
- Physical mechanism: is eating lots of carbohydrates physically related to weight gain?
- They are.
This claim almost passed, but it did not meet all of the criteria, so you can't say that this claim is 100% true.
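The walk-through above can be sketched as a simple checklist. This is only an illustration: the class `CausalityChecklist` and its field names are hypothetical, and treating the control answer as a "no" is this sketch's reading of the example (many people eat lots of carbohydrates without gaining much weight).

```python
from dataclasses import dataclass, fields


@dataclass
class CausalityChecklist:
    """One yes/no answer per criterion (hypothetical structure for illustration)."""
    consistency: bool
    control: bool
    correlation: bool
    alternatives_ruled_out: bool
    mechanism: bool

    def unmet(self):
        """Return the names of the criteria that were answered 'no'."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]

    def passes_all(self):
        """True only when every one of the five criteria is met."""
        return not self.unmet()


# Answers taken from the carbohydrate example above.
carbs = CausalityChecklist(
    consistency=True,              # more or less consistent across groups
    control=False,                 # assumption: many eat carbs without gaining weight
    correlation=True,              # more carbs, more weight, other things constant
    alternatives_ruled_out=False,  # less exercise is a plausible alternative
    mechanism=True,                # carbohydrates are physically related to weight gain
)

print(carbs.passes_all())  # False
print(carbs.unmet())       # ['control', 'alternatives_ruled_out']
```

Listing the unmet criteria, rather than returning a single pass or fail, shows where the claim falls short.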
Try this one yourself. Consider the claim "smoking causes lung cancer."
- Consistency: Do you see higher lung cancer rates among smokers across different genders and races? Yes, even across different countries. This is true worldwide.
- How about control? People can get lung cancer even if they don't smoke. But you see it in much higher rates with people that do smoke, and much lower rates in people that don't smoke.
- Correlation. Do groups of people who smoke more have higher incidences of lung cancer than people who smoke less? Yes, they do.
- Considering the alternatives. What else might be causing lung cancer? It's possible that there's a genetic link that both causes people to smoke and predisposes them to lung cancer. Although that is somewhat plausible, it isn't highly plausible. Considering the alternatives, you can say that smoking is a more likely cause than genetics.
- The physical connection. Is there a scientifically understood physical connection between smoking and lung cancer? Yes. There have been experiments using the tar in cigarettes on animals, and those animals have developed cancerous tumors. So we understand the physical connection.
Therefore, this claim passes all of the tests, so you can reasonably claim that smoking does cause lung cancer. As our answers to the questions showed, smoking is not going to cause lung cancer in 100% of people; not everyone who smokes is going to get lung cancer. But we can say that smoking is a major contributor to lung cancer.
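The control question in this example comes down to comparing rates between two groups. Here is a minimal sketch of that comparison. The function names and all of the counts are hypothetical, chosen as round numbers for illustration; they are not real lung cancer statistics.

```python
def incidence_rate(cases, group_size):
    """Fraction of a group in which the effect is present."""
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    return cases / group_size


def rate_ratio(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """How many times higher the rate is when the alleged cause is present."""
    unexposed_rate = incidence_rate(unexposed_cases, unexposed_total)
    if unexposed_rate == 0:
        raise ValueError("ratio is undefined when the unexposed rate is zero")
    return incidence_rate(exposed_cases, exposed_total) / unexposed_rate


# HYPOTHETICAL counts, invented for illustration only.
smokers_with_cancer, smokers_total = 150, 1000
nonsmokers_with_cancer, nonsmokers_total = 10, 1000

ratio = rate_ratio(smokers_with_cancer, smokers_total,
                   nonsmokers_with_cancer, nonsmokers_total)
print(f"rate is about {ratio:.0f} times higher in the exposed group")
```

The effect does not have to be completely absent in the unexposed group. As the example notes, what matters is that the rate is much higher when the cause is present and much lower when it is absent.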
2. Levels of Confidence
You can have different levels of confidence in the causation.
- You can have a possible cause, which means you can imagine a scenario where A causes B. One thing causes the other.
- You can have probable cause, which means you're pretty sure that A causes B.
- You can have cause beyond a reasonable doubt, which means you cannot think of a scenario where the response, B, could have been caused by anything other than A.
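If you want to record these levels alongside a claim, one option is an ordered enumeration, sketched below. The names `CausalConfidence` and `is_at_least` are hypothetical, and the numeric values only encode the ordering from weakest to strongest; they are not a measurement.

```python
from enum import IntEnum


class CausalConfidence(IntEnum):
    """The three levels of confidence, ordered from weakest to strongest."""
    POSSIBLE = 1                 # you can imagine a scenario where A causes B
    PROBABLE = 2                 # you're pretty sure that A causes B
    BEYOND_REASONABLE_DOUBT = 3  # no scenario where anything but A caused B


def is_at_least(level, threshold):
    """Return True when `level` is as strong as or stronger than `threshold`."""
    return level >= threshold


print(is_at_least(CausalConfidence.PROBABLE, CausalConfidence.POSSIBLE))
print(is_at_least(CausalConfidence.PROBABLE, CausalConfidence.BEYOND_REASONABLE_DOUBT))
```

Using `IntEnum` rather than plain strings lets you compare levels directly, which is handy when deciding whether a claim is strong enough to act on.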
The only way to definitively prove causation is with a controlled, randomized experiment. But by using a set of very stringent criteria, you can reasonably conclude that there's a causal link between two variables based on whether or not they meet those five criteria. Sometimes the alleged causes don't hold up under the scrutiny, but we can be reasonably confident in the ones that do. For this reason, we can describe levels of confidence in our causation.
Thank you and good luck!