Open almost any team's backlog and you will find the same thing: a thousand-item graveyard where good ideas go to die of old age. Tickets from two reorgs ago. A feature request from a customer who has since churned. Three near-duplicate entries nobody dares delete in case they matter. The backlog has stopped being a tool for deciding and become a museum of everything that was once urgent.

Good backlog management is the opposite habit. It treats the list as a living, ordered statement of what to do next: short enough to read, ranked by value, and pruned often enough that the top is always trustworthy. The skill isn't adding things. Anyone can add things. The skill is ordering, refining and ruthlessly removing.

The quick version

  • A product backlog is an emergent, ordered list of what to do next, and there should be only one per product, not a backlog per team, label or stakeholder.
  • Health has a name: keep it DEEP (Detailed appropriately, Estimated, Emergent, Prioritised). The top is sharp; the bottom is deliberately fuzzy.
  • Order by value and cost of delay, not by who shouted loudest or who filed the ticket first.
  • The weekly work is refinement and deletion, not collection. A backlog you never prune isn't a plan; it's a wish-list.

The idea in depth

The word "backlog" carries some baggage. In ordinary English it means a pile of unfinished work, the implication being you've fallen behind. In product management it means something more deliberate, and the distinction is the whole game.

One ordered list, not a heap

The 2020 Scrum Guide defines the Product Backlog as "an emergent, ordered list of what is needed to improve the product," and "the single source of work undertaken by the Scrum Team." Two words there do the heavy lifting. Ordered means it is a sequence, not a set: there is a top, and the top is a commitment about what comes next. Single means one backlog per product, with no separate hideouts for bugs, tech debt or the VP's pet idea. Everything competes for the same engineering hours in the open.

So the move is to enforce the single ordered list as a discipline, not a formality. If your "backlog" is really five backlogs in a trench-coat (a bug tracker here, a stakeholder spreadsheet there, a "someday" tab nobody opens), then nothing is genuinely prioritised, because the trade-offs never meet on the same page. Merge them. Make every item compete against every other item for the same scarce capacity. That competition is prioritisation.
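A minimal sketch of that merge, assuming each source can be exported as a list of titled records. The `Item` type, the source labels and the crude title-matching rule are all hypothetical; real duplicates usually need a human eye as well.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    title: str
    source: str  # hypothetical label: where the item was found


def normalise(title: str) -> str:
    """Crude duplicate key: lowercase and collapse runs of whitespace."""
    return " ".join(title.lower().split())


def merge_backlogs(*sources: list[Item]) -> list[Item]:
    """Fold several lists into one, keeping the first copy of each title."""
    seen: set[str] = set()
    merged: list[Item] = []
    for source in sources:
        for item in source:
            key = normalise(item.title)
            if key not in seen:
                seen.add(key)
                merged.append(item)
    return merged


# Illustrative data only.
bugs = [Item("Export fails on large files", "bug tracker")]
ideas = [
    Item("CSV export", "spreadsheet"),
    Item("export fails on  large files", "spreadsheet"),  # near-duplicate
]
someday = [Item("Calendar redesign", "someday tab")]

backlog = merge_backlogs(bugs, ideas, someday)
print([item.title for item in backlog])
# ['Export fails on large files', 'CSV export', 'Calendar redesign']
```

The merged list is not yet ordered by value; it only guarantees that every item now sits in one place where it can be ranked against the others.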

Keep it DEEP: sharp at the top, fuzzy at the bottom

The most useful health check for a backlog is an acronym from Roman Pichler and Mike Cohn: DEEP, which stands for Detailed appropriately, Estimated, Emergent and Prioritised. The clever part is "appropriately." Items near the top, which you'll build soon, should be detailed and ready. Items further down should be deliberately vague. Pichler's DEEP write-up borrows a line from Ken Schwaber and Mike Beedle's Agile Software Development with Scrum that puts it well: "the lower the priority, the less detail, until you can barely make out the backlog item." Writing crisp acceptance criteria for something you won't touch for six months is waste: the world will have changed the requirement before you reach it.

This maps onto what Scrum calls refinement: the ongoing act of breaking items down and adding "a description, order, and size" so they're ready when their turn comes. Run it as a standing rhythm (a short, regular session where you sharpen the next few items, re-rank, and cut anything stale) rather than a panic the night before sprint planning. An item is "ready" when the team could pick it up and finish it within a sprint without going back to ask what it means.
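One way to make "detailed appropriately" checkable is to test only the top of the list for readiness and leave the rest alone. The sketch below assumes a hypothetical `BacklogItem` with a description and a rough size, and an arbitrary `top_n` of three; none of these names come from Scrum or from any tool.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BacklogItem:
    title: str
    description: str = ""
    size: int | None = None  # rough estimate, e.g. story points


def is_ready(item: BacklogItem) -> bool:
    """An item counts as ready here if it has a description and a size."""
    return bool(item.description.strip()) and item.size is not None


def unready_near_top(backlog: list[BacklogItem], top_n: int = 3) -> list[str]:
    """Titles among the first top_n items that still need refinement.

    Items below top_n are deliberately not checked: they may stay vague.
    """
    return [item.title for item in backlog[:top_n] if not is_ready(item)]


# Illustrative backlog, already in priority order.
backlog = [
    BacklogItem("CSV export", "Export bookings as CSV", 3),
    BacklogItem("Fix timezone bug", "Times shift by an hour", None),
    BacklogItem("Calendar redesign"),
    BacklogItem("Someday: mobile app"),
]

print(unready_near_top(backlog))
# ['Fix timezone bug', 'Calendar redesign']
```

The fourth item is just as vague as the third, but it is not flagged, which is the point: detail is only demanded where work is about to start.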


Order by value, not by volume

The hardest question in backlog management isn't what goes on the list; it's in what order. The weak answer is recency or volume: the newest request, or the one with the most upvotes, floats to the top. The stronger answer is economic. In The Principles of Product Development Flow (2009), Don Reinertsen argues that most product trade-offs collapse into one number you can actually compute: the cost of delay, meaning what it costs you, per week, not to have a given thing. His blunt advice: "If you only quantify one thing, quantify the cost of delay." From it comes Weighted Shortest Job First: do the work with the highest cost of delay per unit of effort first.
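Written as a formula, and assuming cost of delay is expressed per week and effort as an estimated duration in weeks (the symbols are chosen here for illustration), the score for item i is:

```latex
\mathrm{WSJF}_i = \frac{\mathrm{CoD}_i}{D_i}
```

Here CoD_i is the item's cost of delay per week and D_i is its estimated duration; items are worked in descending order of score. With made-up numbers: an item costing 10 units per week that takes 2 weeks scores 5, while one costing 12 units per week that takes 6 weeks scores 2, so the first goes first even though its cost of delay is lower.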

In practice that means dropping the gut-rank and asking, item by item, "what does waiting on this cost us?" You don't need a finance model. A rough, shared estimate (high, medium or low cost of delay, set against rough effort) already beats the loudest-voice heuristic most backlogs run on, and it gives you a defensible answer when a stakeholder asks why their request is at number forty.
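A rough sketch of that high / medium / low scoring, assuming the three buckets are mapped to 1, 2 and 3. The mapping and the item names are illustrative assumptions; only the relative order that comes out matters, not the numbers themselves.

```python
from __future__ import annotations

# Hypothetical mapping of the rough buckets to numbers.
LEVEL = {"low": 1, "medium": 2, "high": 3}


def score(cost_of_delay: str, effort: str) -> float:
    """Cost of delay per unit of effort, from rough high/medium/low labels."""
    return LEVEL[cost_of_delay] / LEVEL[effort]


def rank(items: list[tuple[str, str, str]]) -> list[str]:
    """Order (title, cost_of_delay, effort) tuples by descending score.

    sorted() is stable, so ties keep their existing relative order.
    """
    ordered = sorted(items, key=lambda it: score(it[1], it[2]), reverse=True)
    return [title for title, _, _ in ordered]


# Illustrative items: (title, cost of delay, effort).
items = [
    ("Calendar redesign", "low", "high"),
    ("CSV export", "high", "low"),
    ("Timezone bug fix", "medium", "medium"),
]

print(rank(items))
# ['CSV export', 'Timezone bug fix', 'Calendar redesign']
```

Each position in the result can be explained in one sentence from its two labels, which is what makes the order defensible when someone challenges it.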

An honest limitation. None of this is settled science. The Scrum Guide and DEEP are practitioner frameworks, not peer-reviewed findings, and cost-of-delay numbers are estimates dressed as precision: easy to game, and dangerous if treated as exact. Reinertsen's own point is that the discipline of estimating cost of delay matters more than the figure you land on. And a backlog ordered purely by near-term value will quietly starve the long-horizon bets that don't show an immediate payoff, which is why a backlog needs a strategy and vision sitting above it to protect the work that matters but never wins this week's economic argument. Use these tools as lenses, not laws.

flowchart TD
    I("Ideas, requests, bugs, insights") --> R("Refine: clarify, split, size the top items")
    R --> P("Prioritise by value & cost of delay")
    P --> T("Ready items at the top")
    T --> S("Sprint / next delivery cycle")
    S --> L("Learn from what shipped")
    L --> R
    P --> D("Delete or park the stale & low-value")
					
Backlog management as a loop, not an inbox: ideas in, refine and rank, ship the top, learn, and prune the rest. Leaders Loop

A worked example

The figures below are illustrative (a composite scenario, not a real company's data), but the mechanics are exactly how a healthy refinement session runs.

Maya runs product for a small B2B scheduling tool. Her backlog has crept to 380 items. Sprint planning has become an argument, because every stakeholder can point to "their" ticket sitting in the list as proof it's coming. Engineering quietly distrusts the order; sales over-promises against it.

She resets it over three sessions. First, the merge: the separate bug tracker, the "feature ideas" spreadsheet and the support team's requests all fold into one list. The 380 items become 410 for a day, then the duplicates surface and she cuts 90 outright. Second, the cull: anything untouched for nine months gets a hard question, "if this mattered, why has nobody fought for it?" Most go. The list lands at around 120, and only the top 15 are detailed enough to build.
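The nine-month cull can be sketched as a simple date filter. This assumes each ticket records when it was last touched and approximates nine months as 270 days; the `Ticket` type, the dates and the threshold are hypothetical, and the output is a list of candidates for the hard question rather than an automatic delete.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Ticket:
    title: str
    last_touched: date


def split_stale(
    tickets: list[Ticket], today: date, max_age_days: int = 270
) -> tuple[list[Ticket], list[Ticket]]:
    """Separate tickets into (live, stale) by days since last activity."""
    cutoff = today - timedelta(days=max_age_days)
    live = [t for t in tickets if t.last_touched >= cutoff]
    stale = [t for t in tickets if t.last_touched < cutoff]
    return live, stale


# Illustrative data only.
tickets = [
    Ticket("CSV export", date(2024, 5, 1)),
    Ticket("Dark mode", date(2023, 1, 15)),
    Ticket("Calendar redesign", date(2024, 2, 10)),
]

live, stale = split_stale(tickets, today=date(2024, 6, 1))
print([t.title for t in stale])
# ['Dark mode']
```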

Third, the re-rank. Instead of upvote counts, Maya scores the top thirty on rough cost of delay against rough effort. A small CSV-export feature that three churning accounts cited as their reason for leaving jumps from number 22 to number 3: high cost of delay, low effort. A glossy calendar redesign that everyone "loved" sinks, because nothing actually breaks if it waits.

The unlock isn't the new order. It's that the order is now explainable. When sales asks why the redesign slipped, Maya has a sentence, not a shrug: "It costs us almost nothing to wait, and the export is bleeding revenue now." A backlog you can defend in one sentence per item is a backlog your team will trust, and a trusted order is the entire point of keeping one. That weekly habit of prioritising by cost of delay is what stops the list sliding back into a 380-item graveyard.

flowchart LR
    A("380-item graveyard") -->|Merge into one list| B("Single ordered backlog")
    B -->|Cull stale & duplicate| C("~120 live items")
    C -->|Re-rank by cost of delay| D("Trusted top 15, defensible order")
					
Maya's reset (illustrative): merge, cull and re-rank, turning a museum of old tickets into a list the team believes.

Frequently asked questions

How big should a backlog be?

Small enough to read in one sitting. There's no magic number, but if you can't skim the whole thing and roughly recall what's there, it's too big to be useful for deciding. Pichler's guidance is to derive a focused backlog from a roadmap rather than hoard every idea; an oversized, over-detailed backlog becomes "difficult to use" and hard to evolve. If it's grown unreadable, the fix is deletion, not a better tool.

What's the difference between a product backlog and a sprint backlog?

The product backlog is the whole ordered list of what could be done next, owned by the product owner. The sprint backlog is the small slice the team has committed to for the current cycle, plus the plan to deliver it. One is the menu; the other is tonight's order. Keeping them distinct stops "everything we might ever do" from masquerading as "what we promised this week."

Who owns the backlog and its order?

One person is accountable for the order, typically the product owner or product manager, even though the whole team contributes items and helps refine them. Order can't be decided by committee or by vote count; that's how you get a backlog ranked by politics. The owner makes the call and, crucially, can explain it. Refinement is collaborative; prioritisation has a single throat to choke.

Should bugs and tech debt live in the same backlog as features?

Yes, that's the point of "single source of work." If defects and tech debt sit in a separate system, they never compete openly with features for the same capacity, and they either get ignored or silently jump the queue. One list forces the honest trade-off: this bug fix versus that new feature, ranked side by side on value and cost of delay.

Isn't this just prioritisation by another name?

Prioritisation is one move within it. Backlog management is the wider, ongoing discipline: keeping the list a single source, refining items to the right depth at the right time, ranking by value, and (the part most teams skip) deleting relentlessly. A perfectly prioritised list of 380 stale items is still a graveyard.
