The Great Correlation/Causation Conundrum
Every class in psychology probably repeats the mantra, “correlation does not equal causation.” But what does it mean?
It means that when you know there is a correlation between two variables – which is worthwhile information in itself – you cannot know WHY the correlation exists. (Stop here for a moment and repeat that as a thought: you have discovered there IS a correlation, but you don’t know WHY the two things are correlated.)
Let’s back up a step: What is a correlation? A correlation is a statistical measurement. It tells us that two variables fluctuate in a predictable pattern relative to each other. That’s nice to know, whenever it’s true. When we know there is a correlation, then we can use it to predict the value of one variable from the other. Correlations are useful this way. They let us make better predictions.
For example, if there is a correlation between SAT scores and success in college, we can predict who is most likely to be successful in college when we know people’s SAT scores. But wait, we can also predict what a person’s SAT score was if we know how successful that person now is in college. Confusing? I only looked at it in the opposite direction. It goes both ways as a predictor.
In abstract terms, we measure two variables, one we call “A” and one we call “B.” Let’s say the calculations show a moderately strong correlation between A and B. When A and B correlate, we can use one to predict the other. If we come across a case where we know A but not yet B, we can predict what B will be because we know A and know the two are correlated. Or we can start with B: if we come across a case where we know B but not yet A, we can predict what A will be for the same reason. The stronger the correlation, the more accurate our predictions.
Made to look simple, it’s this:
If we know A and B are correlated, then…
if all we know is the score of A → we can predict what the score will be on B; or
if all we know is the score of B → we can predict what the score will be on A.
Why? Because the two are correlated: They fluctuate in a predictable pattern relative to each other. Some real examples (serious, silly, whatever):
If SAT scores (A) are correlated to success in college (B), then…
if we know a person’s SAT score → we can predict how successful they will be in college (this is what admissions offices do); or
if we know how successful a person is in college → we can predict what their SAT scores were (past tense because they took the SAT before they came to college).
If the volume (decibel level) of a bar is correlated to beer consumption/sales, then…
if we know how much beer is being sold at the bar → we can predict how loud the place is going to be; or
if we know how loud the place is → we can predict how much beer is being sold/consumed.
If the amount of body area covered by tattoos is correlated to income, then…
if we know how tattooed a person is → we can predict their income level; or
if we know a person’s income → we can predict how much of their body is tattooed.
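The two-way prediction in these examples can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the SAT and GPA numbers are invented, GPA stands in for “success in college,” and the helper names pearson_r and fit_line are mine, not taken from any library.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def fit_line(xs, ys):
    """Least-squares line for predicting ys from xs, as (slope, intercept)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var_x
    return slope, mean_y - slope * mean_x


# Invented data (an assumption for illustration, not real admissions records).
rng = random.Random(0)
sat = [rng.gauss(1050, 200) for _ in range(500)]
gpa = [2.0 + 0.001 * s + rng.gauss(0, 0.4) for s in sat]

# The correlation is one number; it does not care which variable is "first".
r = pearson_r(sat, gpa)

# Direction 1: we know A (SAT) and predict B (GPA).
slope_ab, intercept_ab = fit_line(sat, gpa)
predicted_gpa = slope_ab * 1300 + intercept_ab

# Direction 2: we know B (GPA) and predict A (SAT).
slope_ba, intercept_ba = fit_line(gpa, sat)
predicted_sat = slope_ba * 3.5 + intercept_ba

print(f"r = {r:.2f}")
print(f"SAT 1300 -> predicted GPA {predicted_gpa:.2f}")
print(f"GPA 3.5  -> predicted SAT {predicted_sat:.0f}")
```

Neither prediction line says anything about why SAT and GPA move together; both are just summaries of the same pattern, read in opposite directions.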
Will our predictions always be accurate? No, because the strength of correlations varies from weak to moderate to strong. If we have a strong correlation, then there are few exceptions and our predictions have a good chance of being correct. If we have a weak correlation, then there are many more exceptions to the rule, and our predictions are going to be better than a wild guess, but with a good chance of being wrong anyway. We can think of the strength of the correlation as a reflection of its error rate: the stronger the correlation, the less chance of error when we predict one from the other.
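One way to see the link between strength and error is to simulate it. The sketch below assumes invented data expressed as standard scores (mean 0, standard deviation 1), and the cutoffs 0.2, 0.5 and 0.9 for weak, moderate and strong are my own choice for illustration; simulate_pairs and mean_abs_error are hypothetical helper names.

```python
import random
from math import sqrt


def simulate_pairs(r, n, rng):
    """Generate n (a, b) pairs of standard scores whose true correlation is r."""
    pairs = []
    for _ in range(n):
        a = rng.gauss(0, 1)
        b = r * a + sqrt(1 - r * r) * rng.gauss(0, 1)
        pairs.append((a, b))
    return pairs


def mean_abs_error(pairs, r):
    """Average miss when B is predicted as r * A.

    For standard scores, r * A is the best straight-line prediction of B.
    """
    return sum(abs(b - r * a) for a, b in pairs) / len(pairs)


rng = random.Random(1)
errors = {}
for label, r in [("weak", 0.2), ("moderate", 0.5), ("strong", 0.9)]:
    pairs = simulate_pairs(r, 5000, rng)
    errors[label] = mean_abs_error(pairs, r)
    print(f"{label:8s} r = {r:.1f}  average prediction error = {errors[label]:.2f}")
```

The average miss shrinks as the correlation gets stronger, but it never reaches zero unless the correlation is perfect.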
And now back to CAUSATION.
Causation means that one thing is a reason why something else happens. When we talk about causation in psychology, we don’t always mean that the cause is automatic and direct, but we do mean that, for the most part, the cause is leading to some change (the effect). For example, the weather causes people to wear more or less clothing. In the complexity of real life, the weather influences a person’s decisions about what to wear, and it’s one factor among many that leads to how much clothing is on that person during, say, the drive to work. We know that anything we look at is complex and made up of multiple causes, but if we find that changing the weather leads to changes in clothing on people, then we say weather is a cause (not THE cause, but a cause) of how much clothing a person wears. It’s part of explaining why something happens.
Which also means that weather will be correlated to the amount of clothing on a person: there will be a discernible pattern.
So if A and B are correlated, then there must be some reason they are correlated. This is where the “does not equal causation” part comes in. If A and B are correlated, there are THREE POSSIBLE explanations for that correlation:
variations in A might cause changes in B;
variations in B might cause changes in A;
variations in something else (call it C) might cause A and B to change in pattern together.
That’s it. Three possible explanations. Which one is the right one for the correlation you are looking at? The answer is: you can’t know yet. All three are possible. (With one exception: if A and B are chronologically dependent, only two are possible. I’ll get to this below.) So it’s not that there is NO causation in the mix; the problem is that you just can’t know which causation model is right. You have a correlation. All three causation models are possible. You can’t simply pick one and think it’s the right one. And so you have a correlation, but it does not equal picking ONE of those causation models as the correct one. That’s what we mean by “correlation does not equal causation.”
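A small simulation can make this concrete. The sketch below builds three made-up worlds, one for each explanation, and tunes each so that the true correlation between A and B is 0.5. Everything here is an assumption for illustration: the data are random standard scores, and the coefficients were picked only so the three worlds match.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(2)
n = 5000


def noise():
    """One random standard score, unrelated to anything else."""
    return rng.gauss(0, 1)


# World 1: A causes B (B is built from A plus unrelated noise).
a1 = [noise() for _ in range(n)]
b1 = [0.5 * a + sqrt(0.75) * noise() for a in a1]

# World 2: B causes A (A is built from B plus unrelated noise).
b2 = [noise() for _ in range(n)]
a2 = [0.5 * b + sqrt(0.75) * noise() for b in b2]

# World 3: a third variable C causes both; A and B never touch each other.
c = [noise() for _ in range(n)]
a3 = [sqrt(0.5) * ci + sqrt(0.5) * noise() for ci in c]
b3 = [sqrt(0.5) * ci + sqrt(0.5) * noise() for ci in c]

r_a_causes_b = pearson_r(a1, b1)
r_b_causes_a = pearson_r(a2, b2)
r_third_variable = pearson_r(a3, b3)

print(f"A causes B:        r = {r_a_causes_b:.2f}")
print(f"B causes A:        r = {r_b_causes_a:.2f}")
print(f"C causes A and B:  r = {r_third_variable:.2f}")
```

All three worlds produce about the same correlation. Someone handed only the A and B columns would have no way to tell which world the data came from.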
By the way, the “something else” option is what people call the “third variable” problem (statisticians might refer to it as a “lurking variable” or a “confounding variable,” which is technically correct, but those terms complicate things for beginners, so it’s best to stick to the third variable label for now).
Bringing our concrete examples into play:
If the amount of body area covered by tattoos (A) is correlated to income (B), then the possible explanations are…
1. getting or removing tattoos (A) causes changes in a person’s income (B);
2. getting a salary raise or salary reduction (B) causes a person to be less tattooed or more tattooed (A), respectively (it’s what we call a negative correlation, meaning the more one’s salary, the less the body is tattooed);
3. something else, like maybe different types of peer influence or self-selection into career paths or something else (C) is the cause of BOTH how much a person tends to be tattooed (A) AND how low or high the income is (B).
We can theorize and guess about which of these explanations is most likely, but if our only evidence is the correlation, we CAN’T KNOW which is right. Of course in this example, the second possibility is silly: getting a raise at work won’t make tattoos disappear from your body, even in the long run. But the first is possible: a tattooed person might face discrimination in many career paths, and getting more tattoos may reduce job opportunities; however, most likely in this case, we’re left to explain the pattern with the “third variable,” but we don’t really know what it is (more on this below).
If the volume (decibel level) of a bar is correlated to beer consumption/sales, then the possible explanations are…
1. turning up the volume on the music will cause people to buy more beer;
2. when people buy more beer, it causes the bar to get louder;
3. something else, like the size of the crowd in the bar, causes BOTH the amount of beer sold to increase AND the decibel level to increase.
Which is it? You can’t yet know. We can conduct an experiment and test one of these at a time, but until we do, we can’t yet know which explanation is the right one.
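To see what the third explanation would look like if it happened to be the true one, here is a hedged sketch. It assumes a made-up world in which crowd size drives both beer sales and loudness, and in which beer and loudness never influence each other. All numbers are invented.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(4)

# Assumption: crowd size (C) is the only thing linking the two measurements.
nights = []
for _ in range(5000):
    crowd = rng.choice([20, 60, 100, 140, 180])
    beers_sold = 2 * crowd + rng.gauss(0, 20)
    decibels = 60 + 0.1 * crowd + rng.gauss(0, 3)
    nights.append((crowd, beers_sold, decibels))

# Across all nights, beer sales and loudness rise and fall together.
r_all = pearson_r([n[1] for n in nights], [n[2] for n in nights])

# Look only at nights with the same crowd size: the pattern disappears.
same_crowd = [n for n in nights if n[0] == 100]
r_same_crowd = pearson_r([n[1] for n in same_crowd], [n[2] for n in same_crowd])

print(f"all nights:              r = {r_all:.2f}")
print(f"nights with 100 people:  r = {r_same_crowd:.2f}")
```

This shows only what the data would look like under explanation 3. A world built on explanation 1 or 2 could produce a similar overall correlation, which is why the correlation by itself cannot tell us which explanation is right.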
If SAT scores are correlated to success in college, then the possible explanations are…
1. changing a person’s SAT score to a higher score will make them more successful in college;
2. CHRONOLOGY: can’t go backwards in time, so college success can’t retroactively cause changes in their high school SAT scores, so in this example we can eliminate B → A;
3. something else, like different levels of parental involvement or IQ, is the cause of BOTH a person’s SAT score AND their likely success in college.
Each time, I mentioned a couple of candidates for the “third variable,” but it has to be ONE thing that causes the pattern. That is, you always need to explain the pattern between A and B by using one “third variable” as the cause. In the case of tattoos, the more tattooed a person is, the lower their income. It’s a weak correlation, but a real one. In general, people who are more tattooed have lower income than people who are less (or not at all) tattooed. Why? Maybe the peers who influence someone to get a lot of tattoos are also influencing the person in a way that reduces opportunities for good jobs (e.g., they may influence their friends to devalue school), or maybe the kind of people who choose to get more tattoos are not the kind of people who pursue the careers that provide the best incomes. Notice how either one of these explanations (peer influence OR personality) would explain BOTH sides of the correlation: tattoos and income.
Keep in mind that the “third variable” is not a synonym for “lots of things.” Life is complex. We know that. But we have a correlation to account for and there must be some common thread linking A and B together. The “third variable” has to be a common thread. (If there’s no common thread, then the correlation is totally spurious and essentially meaningless, a product of chance.)
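The “product of chance” case is easy to demonstrate. The sketch below generates pairs of variables that are unrelated by construction and records how large a correlation turns up anyway. The sample sizes (10 and 1000) and the number of pairs (200) are arbitrary choices for illustration.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


rng = random.Random(5)


def unrelated_pair(n):
    """Two lists of n random scores with no common thread between them."""
    a = [rng.gauss(0, 1) for _ in range(n)]
    b = [rng.gauss(0, 1) for _ in range(n)]
    return a, b


# 200 unrelated pairs measured on 10 cases each, then on 1000 cases each.
small_sample_rs = [abs(pearson_r(*unrelated_pair(10))) for _ in range(200)]
large_sample_rs = [abs(pearson_r(*unrelated_pair(1000))) for _ in range(200)]

print(f"largest chance correlation, 10 cases each:   {max(small_sample_rs):.2f}")
print(f"largest chance correlation, 1000 cases each: {max(large_sample_rs):.2f}")
```

With only a handful of cases, some unrelated pairs look noticeably correlated purely by luck; with many cases, chance correlations stay close to zero.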
When can we say CAUSATION?
If a correlational study doesn’t let us choose which causal model is the correct one, there must be some other way, and that other way is called an EXPERIMENT. Unlike a correlational study, which cannot determine which variable comes first, an experiment changes the circumstances to see if the change results in a difference at the end. In scientific terminology, an experiment MANIPULATES one VARIABLE to see if it has an effect on another variable. In other words, is that first variable why the second one changes? (Remember, a correlation only measures the two variables – it doesn’t play around with them to see what would happen if….)
A well-known experiment in psychology tested to see if changing the number of people in a room would influence a person to seek help when smoke starts coming in from under a doorway. In other words, does the number of people who just sit there CAUSE a person to be less likely to seek help for a possible emergency? What we do to test this is bring a participant (and we do this to 50 or 100 participants, one at a time) into a room, ask them to fill out some paperwork, then leave them alone; a few minutes later we turn on the smoke machine and smoke comes in from under a side door in the room. We see how long it takes before the person gets up to get help. The experiment is that some participants are seated in the room alone, and some are seated in the room in which five other people are filling out paperwork at other seats, and when the smoke starts, these five other people do nothing special (they are paid actors, not actual participants; only the one person is a true participant in the study).
Okay, so what do we have there and why is it an experiment that will tell us something about causation?
We set up something that looks like this:
A + B + C + D + E + F + g1 → X
vs.
A + B + C + D + E + F + g2 → X
And we want to see if the score (X) for the first group is different from the score (X) for the second group.
What are the letters? Well, they are things like:
A – the room (it’s the same room)
B – the number of seats (same number)
C – the forms to fill out (same for both groups)
D – timing of the smoke (same)
E – amount of smoke (same)
F – whatever else except for “g” (same, same, same)
g – the condition we modify (g1 = no others present; g2 = 5 others present)
X – the measure of how long before the participant goes for help
Normally we just show IV → DV (for independent variable → dependent variable), but I think seeing lots of variables makes the point a little better. There’s a lot going on there and we hold as much constant as possible, with only the independent variable (in this case “g”) being different. That way, if the results of group 1 differ from the results of group 2, we know it must have been due to variable “g” because everything else was the same. (If any one of those other variables, A through F, is not the same for both groups, it confounds the results.)
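The logic of this design can be sketched as a simulation. Everything numeric below is invented for illustration and is not data from the study: each hypothetical participant gets a personal “baseline hesitation,” participants are assigned to g1 or g2 at random, and OTHERS_EFFECT is a made-up delay added only when others are present.

```python
import random

rng = random.Random(3)

# Hypothetical participants: each has a personal baseline hesitation (seconds).
participants = [rng.uniform(5, 60) for _ in range(200)]

# Random assignment: shuffle, then split into the two conditions.
rng.shuffle(participants)
alone = participants[:100]          # g1: no others present
with_others = participants[100:]    # g2: passive others present

OTHERS_EFFECT = 120  # made-up delay (seconds) caused by the manipulated variable g


def time_to_seek_help(baseline, others_present):
    """X: everything is identical between groups except the manipulated g."""
    return baseline + (OTHERS_EFFECT if others_present else 0)


def mean(values):
    return sum(values) / len(values)


x_g1 = [time_to_seek_help(b, False) for b in alone]
x_g2 = [time_to_seek_help(b, True) for b in with_others]

# Random assignment makes the groups nearly alike before the manipulation.
baseline_gap = mean(with_others) - mean(alone)
# The remaining difference in X is what we attribute to g.
outcome_gap = mean(x_g2) - mean(x_g1)

print(f"difference between groups before manipulation: {baseline_gap:.1f} s")
print(f"difference between groups in X:                {outcome_gap:.1f} s")
```

Because people land in g1 or g2 by chance, personal differences roughly cancel out across the groups, so the gap in X can be credited to the one thing that was manipulated.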
(We find that a person in the condition shown here as g1, alone in the room, gets up within about 10 seconds after the smoke starts and seeks help; a person in the condition shown as g2, with 5 passive people planted in other seats, usually needs to be rescued after 15 or 20 minutes because the person in that situation doesn’t seek help.)
So what causes what? In this example, the presence of others causes a person to avoid seeking help for a weird occurrence. Or maybe we should say the absence of others makes a person more likely to seek help for a weird occurrence. So it wasn’t the smoke itself, it wasn’t the timing of the smoke, it wasn’t the personality of the person (we know that only if we have a large enough sample, but I won’t explain this here), it was the presence or absence of other people.
Of course the true causal sequence – why it happens – is more complex, but in time we figure it out. There is some mechanism leading from the condition of being with others in that room to the act of getting out of one’s chair, and it involves seeing the smoke, and thinking about what is going on, and seeing the reactions of others, and thinking about what the other people are doing, and wondering why they’re not moving, and thinking about what the options are, and thinking about what might happen if the option to say something is chosen, and having emotions, and so on. And of course any one thought involves a causal chain of events involving language habits and memories and millions of neurons. But on a simplistic level we allow ourselves to say “the presence or absence of others causes changes in a person’s reaction to a weird situation.”
Okay, so now, Frequently Asked Questions:
What?
Yes, well, it’s a lot to read and maybe a lot to think about, but if you take your time with it, it should make sense. It’s logical.
Can’t a correlation be positive or negative?
Yes, these words only tell us what the pattern looks like: do the numbers of A and B rise together and fall together? Or does B fall when A rises and vice versa? It’s not so much that we made up the names for each pattern, but that the statistical calculation produces a positive number for the one and a negative number for the other.
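Here is a small sketch of where the sign comes from, using made-up numbers. The lists below are invented only to show one pattern where B rises with A and one where B falls as A rises.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Invented scores for illustration.
a = [1, 2, 3, 4, 5, 6, 7, 8]
b_rises_with_a = [3, 4, 4, 6, 7, 7, 9, 10]
b_falls_as_a_rises = [10, 9, 9, 7, 6, 6, 4, 3]

r_positive = pearson_r(a, b_rises_with_a)
r_negative = pearson_r(a, b_falls_as_a_rises)

print(f"B rises with A:     r = {r_positive:.2f}")
print(f"B falls as A rises: r = {r_negative:.2f}")
```

The calculation multiplies each case’s distance from the average of A by its distance from the average of B. When the two tend to be on the same side of their averages the products are positive; when they tend to be on opposite sides the products are negative, and so is the result.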
So a correlation can’t be causation because there might be a third variable?
No, that’s not the point. It can’t tell us what the direction of causation is because we didn’t intentionally change anything (by manipulation) to see how the change plays out; all we did was measure two things. If we find a correlation (a pattern between the two things we measured), then there is SOMETHING that explains it (unless it’s totally by chance). That something might be that changing the one variable leads to changes in the other, or vice versa (so there we have two possibilities) OR a third possibility: some variable we haven’t captured yet is the culprit explaining the pattern. It’s a mistake to rush to blame a third variable.
Are the two variables in a correlation the independent variable and the dependent variable?
No, they are really just variables. Independent would mean that you know it is independent of the other (and dependent would mean you suspect it depends on the other), but we don’t know which way it might work when we have a correlation. You could call one the “predictor variable,” or “explanatory variable,” but that’s when you are applying the correlation to some situation; for example, if SAT scores are correlated with success in college, we can use SAT scores to predict success in college, or we can turn that around and use success in college to predict what a student’s SAT score was; in either case, the first one mentioned is the predictor variable.
Does a correlation prove anything?
Let’s not use the word “prove.” And that applies whether we’re looking at experiments or correlational studies. Proofs imply (or in math technically mean) that something is always true. In psychology nothing is always true (there are always exceptions). So let’s say, “Does a correlation verify anything?” Yes, it verifies the existence of the correlation. Without the study we would be guessing whether the predictive pattern is there; when we conduct the correlational study we discover that it is real (or not). So a correlation “proves” that a pattern exists between two variables.
Aren’t you stereotyping people with tattoos with your example above?
No, but this is interesting. An actual stereotype – one used informally in our lives – relies on an assumption of correlations. For example, if we say “poor people are lazy” (a stereotype), we are assuming that a correlation exists between income (or wealth) and productivity (activity level). If we have not done research to verify that pattern, then it’s what we call an illusory correlation: an impression of a correlation that isn’t true (it’s an illusion). Now as for tattoos and income, when we do conduct the research and reach conclusions that we don’t like, then we have conclusions that we don’t like, but they are not untrue. What could be controversial is how we use that information.