Correlation vs causation: a working leader's guide

Your support tickets dropped 20% the same quarter you launched the new help centre. Tempting story: the help centre fixed it. Also possible: you shipped fewer features that quarter, so there was less to complain about. Same data, two explanations, and only one of them justifies doubling the content team.

The quick version

Correlation means two things move together. Causation means one makes the other happen. They are not the same, and most data only ever shows you the first.
When a correlation misleads, the usual culprit is a hidden third factor (a confounder) driving both, or plain coincidence in a sea of comparisons.
The cleanest way to prove cause is to change one thing on purpose and see what moves: an experiment or A/B test.
When you can't experiment, you can still reason carefully, but say "associated with," budget like it might be coincidence, and name what could be confounding you.

The idea in depth

A correlation is just a measured fact: when one number goes up, another tends to go up (or down) with it. Statisticians have been able to put a number on that tendency since the 1890s, when Karl Pearson formalised the correlation coefficient. Pearson himself saw the trap almost immediately. In a footnote to the 1900 edition of The Grammar of Science he warned that "all causation… is correlation, but the converse is not necessarily true": finding a correlation does not let you read off a cause. The cliché is more than a century old, and people have been ignoring it for just as long.

Why is the leap so tempting? Because our minds are built to infer cause from co-occurrence. It is usually a useful shortcut, until it isn't. The discipline is to remember that a correlation between A and B is consistent with at least four different worlds: A causes B; B causes A (you may have the arrow backwards); a third thing C causes both; or it is simply coincidence. Telling those apart is the whole job.

flowchart TD
    Obs(["You observe: A and B move together"]) --> A1("A causes B")
    Obs --> A2("B causes A, arrow reversed")
    Obs --> A3("A hidden C causes both, confounding")
    Obs --> A4("Pure coincidence")
    A1 --> Move("So the move: change one thing on purpose and watch")
    A2 --> Move
    A3 --> Move
    A4 --> Move

One correlation, four possible worlds. The data alone can't tell you which. Leaders Loop

The confounder is usually the real story

The most common reason A and B move together is that something else moves them both. The classic illustration: ice-cream sales correlate with drowning deaths. Ice cream does not drown anyone; hot weather drives both swimming and ice-cream buying. That third variable, temperature, is the confounder. Once you account for it, the link between ice cream and drowning evaporates.
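To see "account for it" in numbers, here is a minimal sketch in Python. Everything in it is invented for illustration: the temperature range, the slopes, and the noise levels are assumptions, not real sales or drowning data. It builds a world where temperature drives both series and neither touches the other, then compares the raw correlation with the correlation left over once temperature's straight-line effect is subtracted from each series.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def residuals(ys, xs):
    """What is left of ys after removing a straight-line fit on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return [y - mean_y - slope * (x - mean_x) for x, y in zip(xs, ys)]


rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Made-up world: temperature drives both series; neither affects the other.
temperature = [rng.uniform(10, 35) for _ in range(5000)]
ice_cream = [2.0 * t + rng.gauss(0, 5) for t in temperature]
drownings = [0.5 * t + rng.gauss(0, 3) for t in temperature]

raw_r = pearson(ice_cream, drownings)
adjusted_r = pearson(residuals(ice_cream, temperature),
                     residuals(drownings, temperature))

print(f"raw correlation:            {raw_r:.2f}")
print(f"after removing temperature: {adjusted_r:.2f}")
```

The raw figure should come out strongly positive and the adjusted one close to zero. The catch is that this only works because temperature was measured; a confounder you never recorded cannot be subtracted out this way.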

In business the confounders are subtler but the shape is identical. Customers who use your premium feature churn less, so push everyone to the feature? Maybe. Or maybe your most committed customers were always going to stay, and commitment is what drives both feature use and retention. The feature may be doing nothing. So the move is: before you act, write down the plausible third factors. If you can name a confounder, you can design around it; if you can't be bothered to look for one, you will find it later in a failed roll-out.

"All causation… is correlation, but the converse is not necessarily true." (Karl Pearson, The Grammar of Science, 1900)

Coincidence scales with how hard you look

The second trap is volume. If you compare enough variables, some will line up beautifully by pure chance. Tyler Vigen made this unforgettable with his Spurious Correlations project: by running hundreds of millions of comparisons across public datasets, he surfaced near-perfect correlations between things like US cheese consumption and the number of people who died tangled in their bedsheets. The charts are a joke; the lesson is not. A dashboard with fifty metrics is a coincidence-generating machine. The more relationships you trawl, the more spurious ones you will haul up, and the most striking chart in the deck is often the luckiest, not the truest.
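The fifty-metric dashboard is easy to simulate. The sketch below is illustrative only: the metric count, the twelve monthly readings, and the seed are all assumptions. It fills fifty "metrics" with pure random noise, so by construction no two of them are related, then hunts through every pair for the strongest correlation.

```python
import random
from itertools import combinations


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(7)  # fixed seed so the sketch is repeatable

N_METRICS = 50  # assumed dashboard size
N_MONTHS = 12   # assumed: one reading per metric per month

# Every metric is independent noise: there is nothing real to find.
metrics = [[rng.gauss(0, 1) for _ in range(N_MONTHS)]
           for _ in range(N_METRICS)]

# Fifty metrics give 50 * 49 / 2 = 1225 distinct pairs to compare.
pairs = list(combinations(range(N_METRICS), 2))
strongest = max(abs(pearson(metrics[i], metrics[j])) for i, j in pairs)

print(f"pairs compared:         {len(pairs)}")
print(f"strongest |correlation| {strongest:.2f}")
```

With only twelve data points per metric and 1,225 pairs to choose from, the winning pair will typically look impressively tight even though it means nothing. Short time series and many comparisons are exactly the conditions most dashboards live in.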

The one move that actually proves cause

There is a reliable way out, and it is the backbone of modern science: the randomised controlled experiment. Split your population at random into a group that gets the change and a group that doesn't. Because assignment is random, the two groups are alike on average in everything else, known confounders and unknown ones alike, so any difference in outcome beyond chance can be pinned on the change itself. This is exactly what an A/B test is. Randomisation is what turns "these things happened together" into "this thing caused that."
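A small simulation shows why random assignment matters. It is a sketch under loud assumptions: a hidden "commitment" trait, a feature with zero true effect on retention, and invented probabilities throughout. In the self-selected run, committed customers are more likely to adopt the feature; in the randomised run, a coin flip decides.

```python
import random


def retention_gap(users):
    """Retention rate of feature users minus that of non-users."""
    with_feature = [kept for has_feature, kept in users if has_feature]
    without = [kept for has_feature, kept in users if not has_feature]
    return sum(with_feature) / len(with_feature) - sum(without) / len(without)


def simulate(randomised, n=20000, seed=1):
    """Return the retention gap for a made-up customer base."""
    rng = random.Random(seed)
    users = []
    for _ in range(n):
        commitment = rng.random()  # hidden trait, never shown on a dashboard
        if randomised:
            has_feature = rng.random() < 0.5         # coin flip decides
        else:
            has_feature = rng.random() < commitment  # the keen opt in
        # Assumption: retention depends on commitment only.
        # The feature itself has no effect at all.
        retained = rng.random() < 0.3 + 0.6 * commitment
        users.append((has_feature, retained))
    return retention_gap(users)


observational_gap = simulate(randomised=False)
experimental_gap = simulate(randomised=True)

print(f"self-selected gap: {observational_gap:+.3f}")
print(f"randomised gap:    {experimental_gap:+.3f}")
```

The self-selected comparison should show feature users retaining far better, purely because of who chose the feature. The randomised comparison should sit near zero, which is the truth in this invented world.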

Judea Pearl, the computer scientist whose work reshaped how we reason about cause, frames this as climbing a "ladder of causation" in The Book of Why (2018). The bottom rung is seeing: spotting correlations in data. The middle rung is doing: intervening and watching what changes, which is precisely the experiment. You cannot answer a "doing" question with "seeing" data, no matter how much of it you have. So the move: when a decision is big enough and reversible enough, stop arguing about the correlation and run the test.

flowchart LR
    Seeing(["Seeing: spot a correlation in the data"]) --> Doing(["Doing: change one thing, randomise, measure"])
    Doing --> Cause(["Now you can claim cause"])
    Seeing -.->|"tempting shortcut, often wrong"| Cause

Pearl's first two rungs. You can't jump straight from seeing to cause; the dotted path is where bad decisions live.

When you honestly can't run the experiment

Sometimes randomising is impossible, unethical, or too slow: you can't randomly assign half your customers to a recession. Here, honesty matters more than certainty. The most influential answer to "when can observation alone support a causal claim?" came from the epidemiologist Austin Bradford Hill, whose 1965 address The Environment and Disease: Association or Causation? laid out nine viewpoints to weigh before treating an association as causal: strength of the association, consistency across settings, the cause clearly preceding the effect (temporality), a dose-response gradient, and more. They are not a checklist that mints proof; even Hill called them viewpoints, not rules. But naming a real limitation is part of the work: an observational case can be strong, and you should still spend like it might be wrong. The discipline is to use the careful word ("associated with," not "drives") and to keep looking for the confounder you missed.

A worked example

A regional sales director notices that reps who attend the optional Friday coaching call close 30% more than reps who skip it (figures illustrative). The obvious read: make the call mandatory for everyone. Before signing that email, she runs the four-worlds test from above.

Does the call cause the closes? Possibly. Or is the arrow reversed: do reps who are already closing well have the slack and confidence to show up on a Friday? Or is there a confounder: are the attendees simply the more conscientious reps, who would out-close the others coaching or not? That last one is lethal, because forcing the call on everyone wouldn't transplant the conscientiousness that was doing the real work.

So she does the cheap version of the right thing. For one quarter she randomly invites half the non-attenders to a mandatory call and leaves the rest as they were: a small A/B test on people who were previously self-selecting out. If the newly mandated group's close rate climbs toward the regular attendees', the call has real lift and she scales it. If it doesn't budge, she has saved everyone a standing Friday meeting and learned the call was a marker of good reps, not a maker of them. Either way she traded a confident guess for a cheap, reversible answer, which is the entire point of knowing the difference.
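The random split itself takes a few lines. This is a sketch, not the director's actual process: the rep names, the team size of twenty, and the seed are hypothetical. The point it illustrates is that a shuffle, not a manager's judgement, decides who gets the mandatory call.

```python
import random


def split_at_random(people, seed):
    """Shuffle a copy of the list and cut it in half: (invited, control)."""
    shuffled = list(people)
    random.Random(seed).shuffle(shuffled)  # seeded so the split can be audited
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical roster of reps who currently skip the Friday call.
non_attenders = [f"rep_{i:02d}" for i in range(1, 21)]

invited, control = split_at_random(non_attenders, seed=2024)

print("mandatory call:", sorted(invited))
print("left as they were:", sorted(control))
```

Letting anyone hand-pick the invited group would quietly reintroduce the self-selection the test is meant to remove, which is why the assignment is left to the shuffle.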

Frequently asked questions

Doesn't a really strong correlation prove causation?

No. Strength is one signal Bradford Hill weighed, but a near-perfect correlation can still be pure coincidence or a confounder. Vigen pairs US cheese consumption with bedsheet-tangling deaths at around 0.95, and the two have nothing to do with each other. A high number tells you the pattern is tight, not that one thing is moving the other.

What's the fastest way to spot a likely confounder?

Ask: "What kind of person, team, or condition ends up in both groups?" If you can describe a type (the committed customer, the conscientious rep, the hot day) that naturally produces both behaviours, you've probably found your confounder. Then either measure it or randomise to neutralise it.

If I can't run an experiment, is the data useless?

Not useless, just weaker evidence. Use it to form a hypothesis and prioritise, not to declare cause. Say "associated with," name the confounders you can't rule out, and look for the supporting signs Hill described, like the cause reliably preceding the effect.

Isn't this just common sense?

It's common sense that's expensive to skip. The trouble isn't knowing the rule; it's that a tidy chart in a slide deck is persuasive precisely when you're tired and want to decide. The habit, not the knowledge, is what's rare.

How is this different from regression?

Regression measures and adjusts for relationships you've thought to include, which can help control for known confounders. But it still rests on observational data, so it can't rule out the confounders you didn't measure. See Regression for what it can and can't do.

Related in the Toolkit

Regression (linear, non-linear, logistic): the main tool for adjusting for known confounders, and where its limits start.
Statistical significance (p-values, t-scores, chi-square): how to judge whether a correlation is even real before asking if it's causal.
Descriptive statistics (mean, median, mode, variance, SD): the summaries you compute before you start spotting relationships.
Distributions, percentiles & quartiles: why the shape of your data changes which correlations you should trust.
Data types (discrete/continuous, categorical/ordinal): which correlation measure even applies depends on what kind of data you have.
Reversible vs irreversible decisions: how hard you should fight to run an experiment before acting.
First principles vs heuristics vs analogical reasoning: the four-worlds test is first-principles thinking applied to a chart.
Jobs-to-be-Done & needs research: qualitative cause, or why customers do what the numbers only describe.

Where to go next

The Book of Why (Pearl & Mackenzie, 2018): the modern, readable case for why "seeing" data can't answer "doing" questions.
The Environment and Disease: Association or Causation? (Bradford Hill, 1965): the original nine viewpoints for weighing causation without an experiment; short and still cited daily.
Tyler Vigen's Spurious Correlations: five minutes here permanently inoculates you against a tight-looking chart.
"The danger of mixing up causality and correlation" (Ionica Smeets, TEDxDelft): a clear, funny 12-minute talk that makes the confounder idea stick.