A bank installs an ATM. Tellers, the obvious story goes, lose their jobs. The obvious story is wrong: as cash-handling got automated, the cost of running a branch fell, banks opened more branches, and the work that remained (selling, advising, solving the awkward problems a machine can't) became the job. The number of tellers in the United States actually rose for decades after the ATM arrived. Same technology, two completely different outcomes, depending on one decision: did you use the machine to replace the person, or to amplify them?

The quick version

  • Automation hands a whole task to a machine and removes the human. Augmentation keeps the human in the loop and makes their judgement faster, broader, or sharper.
  • Jobs are bundles of tasks, not single things, so the real question is never "automate this job?" but "which tasks, and to what end?"
  • Automating the routine part of a role often raises the value of the parts left behind. That is where augmentation pays.
  • The trap is copying a human instead of complementing one: you get a worse worker and a more fragile org, not a better one.

The idea in depth: a job is a bundle of tasks

The single most useful move a leader can make here is to stop thinking in jobs and start thinking in tasks. That framing comes from economists David Autor, Frank Levy and Richard Murnane, whose 2003 paper "The Skill Content of Recent Technological Change" argued that technology doesn't substitute for people wholesale; it substitutes for specific tasks. They split work into the routine (rule-based, codifiable steps a machine can follow) and the non-routine (judgement, persuasion, dexterity, dealing with the unexpected). Computers, they showed, are very good at the first and weak at the second.

In practice that means: take any role on your team and write its tasks in two columns, with routine and rule-based tasks on the left and judgement and human-facing tasks on the right. Automation candidates live on the left. The right-hand column is where augmentation, and most of the value, lives. You almost never automate a job; you automate a column.

flowchart TD
    J(["A role = a bundle of tasks"]) --> R("Routine, rule-based tasks")
    J --> N("Non-routine judgement & human tasks")
    R --> A("Automate: hand the task to a machine")
    N --> G("Augment: give the person a better tool")
    A --> V("Freed-up time & lower cost")
    G --> V
    V --> O(["More valuable role, not a deleted one"])
Decompose the role before you decide. Automation and augmentation act on different tasks, and can compound. Leaders Loop
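The two-column exercise can be sketched in a few lines of Python. This is a minimal illustration, not a method from the Autor, Levy and Murnane paper: the Task class, the split_into_columns helper, and the teller task labels are all hypothetical, and the routine flag is a hand-made judgement call, not something the code works out for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    routine: bool  # True = rule-based and codifiable; False = judgement or human-facing


def split_into_columns(tasks):
    """Return (left, right): routine task names, then non-routine task names."""
    left = [t.name for t in tasks if t.routine]
    right = [t.name for t in tasks if not t.routine]
    return left, right


# Hypothetical bank-teller role, labelled by hand for illustration only.
teller_role = [
    Task("dispense cash", routine=True),
    Task("count deposits", routine=True),
    Task("advise a customer", routine=False),
    Task("solve an awkward problem", routine=False),
]

automation_candidates, augmentation_candidates = split_into_columns(teller_role)
print("Left column (automation candidates):", automation_candidates)
print("Right column (augmentation candidates):", augmentation_candidates)
```

The useful part is not the code but the discipline it forces: every task has to be named and placed in one column before anyone argues about the job as a whole.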

Why automating one task can make a person more valuable

Here is the counter-intuitive part. Autor's later work (his 2015 paper "Why Are There Still So Many Jobs?" and his TEDxCambridge talk) leans on what he calls the O-ring principle, after economist Michael Kremer's model, itself named for the cheap rubber seal whose failure destroyed the Challenger shuttle in 1986. The idea: when work is a chain of interlocking steps, every link has to hold for the whole thing to succeed. Automate one link to near-perfection and you don't make the others redundant; you make them the bottleneck, and therefore more valuable. Cheap, reliable cash dispensing made the teller's advice the scarce, valuable thing.
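A little arithmetic makes the O-ring claim concrete. The sketch below assumes a stripped-down version of the idea, in which the quality of the whole chain is simply the product of each link's success probability. The three steps and every number are hypothetical, chosen only to show the direction of the effect.

```python
from math import prod


def chain_quality(links):
    """Quality of the whole chain: the product of each link's success probability."""
    return prod(links)


def gain_from_improving(links, index, new_quality):
    """Increase in chain quality if the link at `index` is raised to `new_quality`."""
    improved = list(links)
    improved[index] = new_quality
    return chain_quality(improved) - chain_quality(links)


# Hypothetical three-step service: [cash handling, paperwork, advice].
before = [0.90, 0.90, 0.90]
after = [0.999, 0.90, 0.90]  # assumption: cash handling automated to near-perfection

# How much is it worth to raise the advice step (index 2) from 0.90 to 0.95?
gain_before = gain_from_improving(before, 2, 0.95)
gain_after = gain_from_improving(after, 2, 0.95)

print(f"Gain from better advice before automation: {gain_before:.4f}")
print(f"Gain from better advice after automation:  {gain_after:.4f}")
```

Under these assumptions the same improvement in the advice step is worth more once the cash-handling link is nearly perfect (roughly 0.045 against 0.040). The human link did not change; what changed is how much of the outcome now rides on it.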

The practical question, then, comes right after you automate a task: which remaining human task just became the constraint on quality? That task is your investment target: train for it, hire for it, design the role around it. This is the same logic behind Jobs-to-be-Done: customers don't want the task done, they want the outcome, and the outcome usually still routes through a person.

"Automate a link in the chain and you don't delete the others, you turn them into the bottleneck, and the bottleneck is where value concentrates."

This is also why the augmentation framing tends to pay better in practice. In a 2023 NBER study, "Generative AI at Work," Erik Brynjolfsson, Danielle Li and Lindsey Raymond watched a generative-AI assistant roll out across 5,179 customer-support agents. Productivity rose about 14% on average, but the gains were concentrated among novice and lower-skilled agents, who effectively absorbed the tacit know-how of their best colleagues. The AI didn't replace the agents; it raised the floor. A field experiment by Fabrizio Dell'Acqua, Ethan Mollick and colleagues with Boston Consulting Group, "Navigating the Jagged Technological Frontier" (2023), found consultants using GPT-4 completed 12.2% more tasks, 25.1% faster, at higher quality. Augmentation, working as advertised.

The honest limitation: the jagged frontier and the Turing trap

Now the part most AI pitches skip. The same BCG study found that on tasks outside the model's competence, beyond what the authors call the "jagged technological frontier", AI didn't just fail to help. Consultants using it were more likely to reach the wrong answer, because the tool produced confident, fluent, plausible nonsense and people trusted it. Augmentation can quietly become de-skilling if the human stops checking. This connects to model risk and explainability: a tool you can't interrogate is a tool you over-trust.

There's a strategic version of the same warning. Erik Brynjolfsson's 2022 essay "The Turing Trap" (in Daedalus) argues that we have been drawn, by habit since Turing, to build AI that imitates humans rather than AI that extends them. Imitation steers us toward substitution, which concentrates power and gains in whoever owns the machine. Complementary AI, which does what people can't, tends to spread value more widely. His point is that automation-versus-augmentation isn't only an efficiency choice; it's a choice about who benefits. That is an argument, not a settled empirical law, and worth flagging as such, but it reframes the decision usefully.

All of this turns "should the machine do this alone?" into a question about the cost of being wrong. Where errors are cheap and reversible, automate freely. Where they're expensive or hard to undo, keep a human in the loop and use AI to augment. It is a direct application of reversible vs irreversible decisions.

flowchart TD
    S(["A task you might give to AI"]) --> Q1{"Is it routine & rule-based?"}
    Q1 -->|"No"| AUG("Augment: human decides, AI assists")
    Q1 -->|"Yes"| Q2{"Are errors cheap & reversible?"}
    Q2 -->|"Yes"| AUTO("Automate: machine runs it, audit by sample")
    Q2 -->|"No"| HITL("Augment + checkpoint: AI drafts, human approves")
    AUG --> CHK(["Watch for the jagged frontier"])
    HITL --> CHK
A working heuristic. Routine-ness decides automate-vs-augment; reversibility decides how much human oversight to keep.
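The flowchart's two questions translate directly into a small function. This is a sketch of the heuristic as drawn, not a validated decision rule: the function names and the example tasks are hypothetical, and the two yes/no answers still have to come from someone who knows the work.

```python
def recommend(routine: bool, errors_cheap_and_reversible: bool) -> str:
    """Mirror the flowchart: routine-ness first, then reversibility."""
    if not routine:
        return "augment: human decides, AI assists"
    if errors_cheap_and_reversible:
        return "automate: machine runs it, audit by sample"
    return "augment + checkpoint: AI drafts, human approves"


def needs_frontier_watch(recommendation: str) -> bool:
    """Both augment branches lead to the 'watch for the jagged frontier' step."""
    return recommendation.startswith("augment")


# Hypothetical tasks; each pair is (routine?, errors cheap and reversible?).
examples = {
    "calculate a standard payout": (True, True),
    "check coverage against messy wording": (True, False),
    "phone a distressed customer": (False, False),
}

for task, (routine, cheap) in examples.items():
    decision = recommend(routine, cheap)
    flag = " [watch the frontier]" if needs_frontier_watch(decision) else ""
    print(f"{task} -> {decision}{flag}")
```

Note that reversibility is only consulted for routine tasks. A non-routine task goes to the human whatever an error would cost, which matches the "No" branch of the first question.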

A worked example: the claims team

Take an insurance claims team of ten handlers. (The numbers below are illustrative, to show the reasoning, not figures from a specific company.) Their day breaks into tasks: pulling documents, checking policy coverage, calculating routine payouts, spotting suspicious patterns, and the hard one, phoning a distressed customer who has just lost a house and explaining what happens next.

The lazy framing is "AI can do claims handling, cut the team." The task framing is sharper. Document retrieval and standard payout calculation are routine and rule-based: automate them, with sample audits. Coverage checks against messy policy wording sit on the jagged frontier: AI drafts an assessment and a handler approves it, because a wrong "not covered" is expensive and hard to reverse. Fraud-pattern detection is genuine augmentation: the model flags anomalies a tired human misses, and the human investigates. And the distressed-customer call? That's the O-ring. Once you've automated 60% of the routine load, the quality of that conversation becomes the thing the whole team's reputation rides on.

The result isn't ten handlers cut to four. It's ten handlers who now spend their day on coverage judgement, fraud, and customers (the high-value column), supported by tools on the low-value one. Borrow Davenport and Kirby's language from their 2015 HBR piece "Beyond Automation": some people step aside (lean into the human work machines can't do), some step up (orchestrate the augmented system), and some step in (manage and improve the automation itself). Same headcount, a far more valuable team.

Frequently asked questions

Is augmentation just automation with better PR?

No. The test is whether a human stays in the decision loop and ends up able to do more. Automation removes the person from a task; augmentation hands them a tool and keeps their judgement load-bearing. If your "augmentation" project has no human exercising judgement at the end, you've automated. Be honest about it, because the risks differ.

Doesn't augmentation just delay the inevitable job losses?

Sometimes tasks do disappear, and pretending otherwise is dishonest. But Autor's task framework and the ATM history show the common pattern is reallocation, not deletion: routine tasks fall away and the residual human tasks grow in value. The leadership job is to invest in that residual by reskilling toward the judgement-heavy column, not to assume the headcount simply shrinks.

How do I decide which tasks to automate first?

Score each task on two axes: how routine and rule-based it is, and how cheap and reversible an error would be. High on both: automate first. Routine but costly-if-wrong: automate with a human checkpoint. Non-routine: augment, don't automate. The second diagram above, the decision flowchart, is the short version.

What's the biggest mistake leaders make here?

Trusting fluent output on tasks beyond the tool's real competence: the jagged-frontier problem. AI is most dangerous exactly where it's most confident and least correct. Build in sampling, spot-checks, and a culture where "the AI said so" is never a complete answer for a consequential call.

Does this apply to small teams, or only big ones?

Especially small teams. When you have ten people, freeing each from routine load is the difference between firefighting and doing the work only humans can. Augmentation lets a small team punch at the scope of a much larger one, provided someone owns the checking.
