Diversity, equity & inclusion: what the evidence actually supports

Few phrases have travelled further from their plain meaning than "diversity, equity and inclusion." It has become a slogan, a department, a culture-war flag: anything but a clear set of ideas. Strip the politics away and you are left with something a working leader can actually use: three distinct things you can measure, three different sets of moves, and a body of research that is more honest and more useful than either the boosters or the sceptics tend to admit.

The quick version

Diversity is who is in the room: the mix of backgrounds, identities and ways of thinking on your team. It is a fact about composition, and it is the easiest of the three to count.
Equity is whether the rules give people a fair shot. It means looking at outcomes (who gets hired, paid, promoted) and removing the friction that quietly disadvantages some groups. It is about the system, not the individual.
Inclusion is whether people, once in the room, can actually contribute and belong. Diversity without inclusion is a guest list nobody dances at.
The honest evidence: the headline claim that "diverse companies are more profitable" is weaker than commonly stated, but the case for fair, inclusive systems, and against most diversity training, is strong. Change the design, not just the slogan.

The idea in depth: three words, three jobs

The most common mistake is treating the three as one initiative. They are not. Vernā Myers, the diversity strategist who became Netflix's first head of inclusion strategy, put the difference between two of them better than any textbook: "Diversity is being invited to the party; inclusion is being asked to dance." Getting people through the door (diversity) and making the room one they want to stay in (inclusion) are separate achievements, and most organisations are far better at the first than the second.


Which is why the useful version isn't "a DEI programme" at all. It's three specific questions, each with its own metric. Diversity: who is on the team, by level? Equity: at which stage of hiring, pay or promotion do groups diverge, and is that gap explained by anything other than the group? Inclusion: do people feel safe to speak, disagree and bring problems forward? That last one connects directly to belonging and engagement, and to the psychological-safety research that shows teams perform better when members can take interpersonal risks without fear.
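The equity question ("at which stage do groups diverge?") lends itself to a small calculation. What follows is a minimal sketch, not a prescribed method: the stage names, group labels and counts are all invented for illustration, and the code only locates where pass rates differ most. It does not tell you whether the gap is explained by something other than group membership; that still needs a like-for-like comparison.

```python
# Illustrative only: stage names, group labels and counts are assumptions,
# not data from any real company.
STAGES = ["applied", "screened", "interviewed", "offered"]

funnel = {
    "group_a": [200, 100, 40, 10],
    "group_b": [200, 60, 24, 6],
}


def pass_rates(counts):
    """Share of people at each stage who reach the next stage."""
    return [after / before for before, after in zip(counts, counts[1:])]


def widest_gap(funnel, a, b):
    """Return (transition label, gap) for the transition where the two
    groups' pass rates differ most."""
    gaps = [abs(x - y) for x, y in zip(pass_rates(funnel[a]), pass_rates(funnel[b]))]
    i = max(range(len(gaps)), key=gaps.__getitem__)
    return f"{STAGES[i]} -> {STAGES[i + 1]}", gaps[i]


transition, gap = widest_gap(funnel, "group_a", "group_b")
print(transition, round(gap, 2))  # applied -> screened 0.2
```

In this made-up funnel the two groups pass interviews and offers at identical rates; the whole gap sits at CV screening, which is where a redesign would start.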

flowchart LR
  A(["Diversity
who is in the room?"]) --> B(["Equity
do the rules give
a fair shot?"])
  B --> C(["Inclusion
can they contribute
& belong?"])
  C -.->|"without this,
the rest leaks away"| A

Three distinct jobs, in sequence. Inclusion is what stops the first two from draining back out: people you hired but never made room for will leave.

The business case is real, but not where most people put it

Here is where intellectual honesty matters most, because the most-quoted statistic in the field does not hold up the way it is used. McKinsey's "Diversity Matters" series (the original in 2015, "Delivering Through Diversity" in 2018, "Diversity Wins" in 2020 and "Diversity Matters Even More" in 2023) reported that companies in the top quartile for executive-team gender diversity were markedly more likely to outperform financially, a figure McKinsey said grew from a 15% greater likelihood in 2015 to 39% by 2023. Those numbers were repeated in countless board decks as proof that diversity causes profit.

They do not show that. In a 2024 paper in Econ Journal Watch, accounting researchers Jeremiah Green and John Hand attempted to replicate the studies using McKinsey's own preferred performance measure (EBIT margin) on S&P 500 firms, and could not reproduce a statistically significant link. Worse for the causal story: McKinsey's diversity data was largely collected after the financial period it was correlated with, so to the extent there is a relationship, the arrow may point the other way (profitable firms can afford to diversify their executive ranks). McKinsey itself noted it was not asserting causation, but the public framing routinely did. The practical takeaway: drop the McKinsey number from your business case. It won't survive a sceptical CFO, and reaching for it makes the rest of your argument look like advocacy.

An honest limitation, in both directions. Failing to replicate a correlation is not proof that diversity has no value; it means that this particular firm-level profitability claim is unproven. The better-supported case sits at the team and decision level: diverse groups tend to scrutinise facts more carefully and resist premature consensus, which is precisely the failure mode that good decision-making guards against. Treat diversity as a way to make better decisions and avoid groupthink, a defensible, mechanism-level claim, rather than as a lever you pull to lift the share price.

Why most diversity training fails (and what replaces it)

If there is one finding every leader should internalise, it is this: the default intervention, mandatory unconscious-bias training, is among the least effective tools available, and can backfire. Sociologists Frank Dobbin and Alexandra Kalev, analysing more than three decades of data from 829 US firms, found that the standard toolkit of mandatory training, hiring tests and grievance systems reduced the share of women and minorities in management. They set out why in "Why Diversity Programs Fail" (Harvard Business Review, 2016): people respond to being commanded to think differently with resistance, and a half-day course does nothing to change the systems that produce unequal outcomes.

The alternative is the most useful idea in the whole field, and it comes from behavioural economist Iris Bohnet of Harvard. In What Works: Gender Equality by Design (Harvard University Press, 2016), Bohnet argues that since de-biasing individual minds is slow, expensive and unreliable, you should de-bias the environment instead: redesign the process so bias has fewer opportunities to operate. What that looks like is concrete and doable. Structure your interviews (same questions, same order, scored independently before discussion), strip names and demographic signals from CVs at the screening stage, compare candidates side-by-side rather than one at a time, and audit promotion and pay data for unexplained gaps. None of that requires changing how anyone feels; it changes what the system does.
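Two of those moves, blind screening and independent structured scoring, are simple enough to sketch in a few lines. This is an illustration under stated assumptions rather than Bohnet's own procedure: the field names in IDENTIFYING_FIELDS, the criteria and the 1-5 scale are all hypothetical, and a real applicant system would need its own list of fields to withhold.

```python
from __future__ import annotations

from statistics import mean

# Hypothetical field names; adapt to whatever your applicant system exports.
IDENTIFYING_FIELDS = {"name", "photo_url", "date_of_birth", "gender", "address"}


def blind(candidate: dict) -> dict:
    """Return a copy of the record with demographic signals removed."""
    return {k: v for k, v in candidate.items() if k not in IDENTIFYING_FIELDS}


def structured_score(scorecards: list[dict[str, int]]) -> dict[str, float]:
    """Average each criterion across scorecards that interviewers filled in
    independently, before any group discussion."""
    criteria = scorecards[0].keys()
    return {c: mean(card[c] for card in scorecards) for c in criteria}


# Screening sees only the blinded record.
record = {"candidate_id": 7, "name": "A. Example", "gender": "f", "skills": ["go", "sql"]}
print(blind(record))

# Two interviewers, same criteria, same (assumed) 1-5 scale.
print(structured_score([{"design": 4, "coding": 3}, {"design": 2, "coding": 5}]))
```

The design point is that both functions constrain the process, not the people: the screener never sees the withheld fields, and no interviewer's score is shaped by hearing a colleague's opinion first.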

The canonical illustration is Claudia Goldin and Cecilia Rouse's study of symphony orchestras, "Orchestrating Impartiality" (American Economic Review, 2000). When orchestras put a screen between the audition panel and the musician, the share of women advancing and being hired rose substantially. (In fairness, that result has been re-examined and some economists argue the headline effect is less clean than usually told, so hold it as a vivid example of the mechanism, not as a precise effect size.) The point survives the quibble: a small change to the process outperformed years of asking panels to be less biased.

flowchart TD
  A(["Goal: fewer biased decisions"]) --> B{"Fix the person
or fix the process?"}
  B -->|"Fix the person
(mandatory training)"| C(["Resistance, little change,
can reduce representation"])
  B -->|"Fix the process
(structure, blind screening)"| D(["Bias has fewer
chances to operate"])
  D --> E(["Measure the outcome:
did the gap actually close?"])

Bohnet's core move: de-bias the environment, not the individual, then check the numbers, not the sentiment.

A worked example

Take a 200-person software company, call it Meridian. (Illustrative figures throughout; this is a teaching example, not a real company.) Leadership notices that women make up around 45% of new graduate hires but only about 18% of senior engineers, and the board wants "a DEI initiative." The reflex is to book everyone onto an unconscious-bias workshop. The evidence above says that is the move most likely to spend the budget and move nothing.

Instead, Meridian pulls the three jobs apart. Diversity tells them the entry pipeline is fine: the problem is not who walks in the door. Equity is where the gap lives. They audit the promotion data and find that women are put forward for senior roles at roughly half the rate of men with comparable performance ratings, and that promotions hinge on a self-nominated, narrative-heavy case that rewards confident self-promotion. So they redesign the process: managers nominate everyone who meets the bar (not just those who put their hand up), the case is scored against fixed criteria before any group discussion, and the panel reviews candidates side-by-side. Inclusion is the third check: an engagement pulse shows women in senior teams report lower psychological safety, so they invest in the conditions for speaking up rather than another poster campaign.

A year on, the promotion-nomination gap has roughly halved, not because anyone was lectured about bias, but because the process stopped quietly filtering people out. That is the whole argument in miniature: the change that worked was structural, measurable, and aimed at the outcome, not the slogan.
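The audit at the heart of the Meridian example can be sketched as follows. Like Meridian itself, every record here is invented, and the tuple layout (group, performance rating, nominated or not) is an assumption about what an HR export might contain. The one deliberate choice is comparing groups only within the same rating, so the gap is measured between people with comparable performance.

```python
from collections import defaultdict

# Invented records: (group, performance_rating, was_nominated).
records = [
    ("women", 4, True), ("women", 4, False), ("women", 4, False), ("women", 4, False),
    ("men", 4, True), ("men", 4, True), ("men", 4, False), ("men", 4, False),
    ("women", 5, True), ("women", 5, False),
    ("men", 5, True), ("men", 5, True),
]


def nomination_rates(records):
    """Nomination rate per (rating, group), so groups are compared like with like."""
    counts = defaultdict(lambda: [0, 0])  # [nominated, total]
    for group, rating, nominated in records:
        cell = counts[(rating, group)]
        cell[0] += int(nominated)
        cell[1] += 1
    return {key: nominated / total for key, (nominated, total) in counts.items()}


def rate_ratio(rates, rating, group, reference):
    """How often `group` is nominated relative to `reference` at the same rating."""
    return rates[(rating, group)] / rates[(rating, reference)]


rates = nomination_rates(records)
for rating in (4, 5):
    print(rating, rate_ratio(rates, rating, "women", "men"))
```

A ratio near 1.0 at every rating means the nomination step is not filtering by group; a ratio near 0.5, as in this toy data, is the "roughly half the rate" pattern. Re-running the same calculation a year after the redesign is how you check whether the gap actually closed.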

Frequently asked questions

Is there really a business case for diversity?

There is a defensible one, but not the one usually cited. The firm-level "diverse companies make more money" correlation, most associated with McKinsey's reports, could not be replicated by independent researchers, and even at face value the causation may run the other way. The stronger case is at the decision level: diverse groups tend to challenge assumptions and resist groupthink, which improves the quality of hard decisions. Make that argument, not the share-price one.

What's the difference between equity and equality?

Equality means treating everyone identically; equity means giving people what they need to have a fair shot, which sometimes means treating situations differently. The practical test is outcomes: if an identical process consistently produces unequal results between groups with comparable qualifications, the process, not the people, is where to look. Equity work is largely the work of finding and fixing that friction.

Does unconscious-bias training work?

Mostly not, on its own. Training can raise awareness in the short term, but the large-scale evidence (Dobbin and Kalev's analysis of 829 firms) shows mandatory programmes can reduce representation rather than improve it, partly through resistance. Voluntary training fares better, but no half-day course changes the systems that produce unequal outcomes. Spend the effort on process design instead.

Isn't this just hiring on identity instead of merit?

The evidence-based version is the opposite. Blind screening, structured interviews and side-by-side comparison exist to make hiring more merit-based, by removing the demographic signals and inconsistent judgements that let bias in. The aim is to judge the work, not the name on the CV, which is a stricter merit standard than the unstructured "gut feel" it replaces.

How do we measure inclusion, which feels unmeasurable?

Proxy it through behaviour and survey items you can track: do people from all groups speak in meetings, raise problems, stay (retention by group), and report that they can disagree without penalty? Psychological-safety survey items give a repeatable read. Inclusion is harder to count than headcount diversity, but it is not unmeasurable, and what you do not measure here, you will not manage.
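One way to turn a psychological-safety item into a repeatable read is to average it by group and track the gap over time. The sketch below assumes a single survey statement answered on a 1-5 agreement scale; the responses, the group labels and the minimum group size of three (a common anonymity precaution, chosen arbitrarily here) are all illustrative.

```python
from statistics import mean

# Hypothetical pulse-survey rows: (group, agreement on a 1-5 scale with
# "I can disagree with my team without being penalised").
responses = [("A", 4), ("A", 5), ("A", 3), ("B", 2), ("B", 3), ("B", 4), ("C", 5)]


def mean_by_group(rows, min_n=3):
    """Average score per group, suppressing groups smaller than min_n so
    individual respondents cannot be identified."""
    grouped = {}
    for group, score in rows:
        grouped.setdefault(group, []).append(score)
    return {g: mean(s) for g, s in grouped.items() if len(s) >= min_n}


scores = mean_by_group(responses)
print(scores)
print("gap:", max(scores.values()) - min(scores.values()))
```

The absolute numbers matter less than the comparison: a persistent gap between groups on the same item, survey after survey, is the signal worth acting on.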

Related in the Toolkit

DEI is a property of an organisation's culture, so it sits alongside how that culture forms and what it stands for: the way culture forms and persists determines whether inclusion sticks or washes off, and the same design discipline shows up after a merger, when two cultures have to be integrated without one quietly excluding the other.

How organisational culture forms & persists: whether inclusion holds depends on the cultural machinery underneath it.
Defining & embedding values: equity and inclusion become real when they are embedded as values with teeth, not posters.
Belonging & engagement: inclusion is the input; felt belonging and engagement are the outputs you actually want.
Wellbeing & psychological health: psychological safety is the floor inclusion stands on; without it, diversity stays decorative.
Subcultures & cultural integration (esp. post-M&A): integrating cultures is an inclusion problem at organisational scale.
Leadership styles & models (situational, servant, transformational, adaptive): inclusive leadership is a behaviour the style models name and develop.
Onboarding & ramp: the first weeks decide whether a new hire from any background feels invited to dance.
Centralisation vs decentralisation: whether DEI is owned centrally or by each unit changes how consistently it is applied.

Where to go next

What Works: Gender Equality by Design, Iris Bohnet (Harvard University Press, 2016). The best single book on the topic: de-bias the system, not the person, with evidence behind every move.
"Why Diversity Programs Fail", Frank Dobbin & Alexandra Kalev, Harvard Business Review (2016). The data on why the standard toolkit backfires, and what works better.
"McKinsey's Diversity Matters/Delivers/Wins Results Revisited", Jeremiah Green & John Hand, Econ Journal Watch (2024). The replication paper every leader citing the business case should read first.
"Diversity is being invited to the party; Inclusion is being asked to dance", Vernā Myers (YouTube). A short, sharp talk on why getting people in the door is only half the job.