North Star metric & outcome over output: a working guide

Two teams report the same quarter. One says: "We shipped fourteen features and closed forty-one tickets." The other says: "The share of new users who finished their first real task in week one went from 31% to 44%." Both worked hard. Only one of them can tell you whether the work mattered. That gap, between counting what you produced and measuring what changed for someone else, is the whole point of a North Star metric, and the reason so many of them quietly fail.

The quick version

A North Star metric is the single measure that best captures the value your product delivers to customers. The term was coined by growth specialist Sean Ellis around 2010 and built into a fuller framework by Amplitude.
It only works as an outcome (a change in customer behaviour or value received), not an output (features shipped, releases made). Outputs are things you control; outcomes are the things worth controlling for.
Pair the North Star with 3–5 input metrics your teams can actually move. The North Star says where you're going; the inputs are the levers.
One number can be gamed (Goodhart's Law) and can flatter you (vanity metrics). Treat it as a compass with guardrails, not a target to hit at any cost.

The idea in depth

The metaphor is older than the framework. Polaris, the pole star, sits almost directly above Earth's northern axis, so for centuries it gave travellers a fixed bearing in a moving sky. A product North Star does the same job for a team: one durable direction that doesn't change every time priorities wobble.

The term entered product vocabulary around 2010 through Sean Ellis, the marketer who coined "growth hacking" and ran early growth at Dropbox, LogMeIn and Eventbrite. The idea was later developed into a structured model, the North Star Framework, popularised from 2017 onward by the analytics company Amplitude and written up by its then product evangelist John Cutler in The North Star Playbook. Amplitude's version has two parts: the North Star Metric itself, and a set of three to five input metrics, the complementary levers that a team believes most directly move the North Star and that it can influence through the product.

Here's the test that does the work. Write your candidate metric down and ask one question of it: does this measure something that happens in the customer's world, or something that happens in ours? "Releases shipped this quarter" lives in your world. "Weekly users who complete a meaningful action" lives in theirs. Keep only the ones that live in theirs.
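To make the customer's-world test concrete, here is a minimal Python sketch of how "weekly users who complete a meaningful action" might be counted from an event log. The event names, the sample data and the choice of what counts as "meaningful" are all assumptions for illustration; your own product decides those.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log: (user_id, event_name, day). Illustrative data only.
EVENTS = [
    ("u1", "login", date(2024, 1, 1)),
    ("u1", "task_completed", date(2024, 1, 2)),
    ("u2", "login", date(2024, 1, 3)),
    ("u3", "task_completed", date(2024, 1, 4)),
    ("u3", "task_completed", date(2024, 1, 5)),
    ("u1", "task_completed", date(2024, 1, 9)),
]

# Assumption: which events count as "meaningful" is a product decision.
MEANINGFUL = {"task_completed"}


def weekly_value_users(events, meaningful=MEANINGFUL):
    """Map (ISO year, ISO week) -> number of distinct users with a meaningful action."""
    users_by_week = defaultdict(set)
    for user_id, name, day in events:
        if name in meaningful:
            iso = day.isocalendar()
            users_by_week[(iso[0], iso[1])].add(user_id)
    return {week: len(users) for week, users in users_by_week.items()}


print(weekly_value_users(EVENTS))  # {(2024, 1): 2, (2024, 2): 1}
```

Note that u2 logged in but never completed a task, so u2 is not counted: a login happens in your world, a completed task in theirs.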

Why "outcome over output" is the load-bearing wall

This is where the second half of the topic does the real work. Joshua Seiden, in Outcomes Over Output (2019), gives the cleanest definition going: an outcome is "a change in human behaviour that drives business results." Not the feature you launched, but the thing people now do because you launched it. Seiden boils delivery down to three questions a team should be able to answer: what customer behaviours drive results, how do we get more of them, and how do we know we're right.

It isn't a fringe view. Marty Cagan built the most recent edition of Inspired, the standard product-management text, around the same shift from output to outcome, and product-discovery coach Teresa Torres has argued the harder follow-on point: most teams should anchor on product outcomes (a behaviour the team can actually influence) rather than distant business outcomes like revenue, which the team only moves indirectly. Three independent practitioners, one direction of travel. That's about as close to consensus as this field offers, though it's practitioner consensus, not a controlled study, so weight it accordingly.

Which points to a practical instruction: phrase your North Star as a behaviour, not a deliverable. Spotify is widely cited for time spent listening; Airbnb for nights booked. Neither says "songs catalogued" or "listings added." Each measures the customer getting the thing they came for. Write yours the same way: a noun the customer would recognise as value.

flowchart TD
  O("Output: things we ship<br/>features, releases, tickets") --> Q{"Did a customer<br/>behaviour change?"}
  Q -->|"No"| W("Activity without proof<br/>busy, not better")
  Q -->|"Yes"| OUT("Outcome: a change in<br/>customer behaviour")
  OUT --> NSM(["North Star metric<br/>the value, measured"])
  NSM --> B("Business result<br/>revenue, retention, growth")

Output is the raw material; the outcome is the proof it mattered. The North Star measures the outcome, and the business result follows. Leaders Loop

The North Star tree: one direction, several levers

A single number can feel paralysing: if there's only one metric, what does any given team work on? Amplitude's answer is the North Star tree: the metric at the top, then three to five input metrics beneath it, each owned by people who can actually move it. Amplitude groups inputs along dimensions such as breadth (how many users get value), depth (how much value per user), frequency (how often they return) and efficiency (how quickly they reach value). Amplitude itself is a useful example of the framework in motion: its early North Star was weekly querying users, which it later retired in favour of weekly learning users, on the logic that a customer running queries all day without sharing an insight is a product failure, not a success. The metric followed the value, not the other way round.

The practical sequence: draw the tree before you assign any roadmap. Put the North Star at the top, hang three to five inputs off it, and check that every team can point to one input they own and can shift this quarter. If a team can't, your tree is missing a branch, or that team is working on something that doesn't ladder up to the value you claim to care about.

flowchart TD
  NS(["North Star metric<br/>e.g. weekly active value moments"]) --> I1("Breadth<br/>new users activated")
  NS --> I2("Depth<br/>actions per active user")
  NS --> I3("Frequency<br/>return cadence / retention")
  NS --> I4("Efficiency<br/>time to first value")
  I1 --> T1("Team / squad owns this")
  I2 --> T2("Team / squad owns this")
  I3 --> T3("Team / squad owns this")
  I4 --> T4("Team / squad owns this")

The North Star tree turns one shared direction into 3–5 levers different teams can pull.
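The "draw the tree, then check ownership" step can be sketched as a small data structure. This is one possible shape, not Amplitude's own tooling; the class names, team names and metric names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class InputMetric:
    name: str
    dimension: str                 # breadth, depth, frequency or efficiency
    owner: Optional[str] = None    # team that can move it this quarter


@dataclass
class NorthStarTree:
    north_star: str
    inputs: List[InputMetric] = field(default_factory=list)

    def problems(self, teams):
        """List gaps: wrong input count, unowned inputs, teams with no input."""
        found = []
        if not 3 <= len(self.inputs) <= 5:
            found.append(f"expected 3-5 inputs, got {len(self.inputs)}")
        for metric in self.inputs:
            if metric.owner is None:
                found.append(f"input '{metric.name}' has no owner")
        owners = {m.owner for m in self.inputs if m.owner}
        for team in teams:
            if team not in owners:
                found.append(f"team '{team}' owns no input")
        return found


# Hypothetical tree and teams, for illustration only.
tree = NorthStarTree(
    north_star="weekly active value moments",
    inputs=[
        InputMetric("new users activated", "breadth", owner="onboarding"),
        InputMetric("actions per active user", "depth", owner="core"),
        InputMetric("return cadence", "frequency"),
        InputMetric("time to first value", "efficiency", owner="onboarding"),
    ],
)
gaps = tree.problems(teams=["onboarding", "core", "platform"])
print(gaps)
# ["input 'return cadence' has no owner", "team 'platform' owns no input"]
```

Either gap is the signal described above: a missing branch, or a team whose work doesn't ladder up to the North Star.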

Where it breaks down (an honest limitation)

Any single metric pointed at hard enough will eventually mislead you. Goodhart's Law is the standing warning. Economist Charles Goodhart made the original observation in 1975 (in drier terms, about monetary targets); the crisp phrasing everyone quotes, "when a measure becomes a target, it ceases to be a good measure", was actually anthropologist Marilyn Strathern's, in 1997. Make "time spent listening" the only thing that matters and someone, somewhere, will find a way to inflate it that has nothing to do with delight: autoplay traps, dark patterns, dopamine over usefulness. The metric goes up; the value goes down. Vanity metrics are the gentler cousin of the same problem: numbers that rise reliably (sign-ups, page views) while telling you nothing about whether anyone got value.

So pair every North Star with a counter-metric or a quality guardrail: a number that should not get worse while the North Star climbs. Watch retention or satisfaction alongside engagement; watch churn alongside sign-ups. A compass is for steering, not for staring at while you walk into a wall.
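A guardrail check can be as simple as the sketch below. It assumes each metric is recorded as a before/after pair plus a flag for which direction is good; the function names and every figure are invented for illustration.

```python
from typing import Dict, List, Tuple

# (before, after, higher_is_better); all figures below are invented.
Reading = Tuple[float, float, bool]


def degraded(reading: Reading, tolerance: float = 0.0) -> bool:
    """True if the metric moved in its bad direction by more than `tolerance`."""
    before, after, higher_is_better = reading
    improvement = after - before if higher_is_better else before - after
    return improvement < -tolerance


def breached_guardrails(guardrails: Dict[str, Reading], tolerance: float = 0.0) -> List[str]:
    """Names of guardrails that got worse, sorted for stable output."""
    return sorted(name for name, r in guardrails.items() if degraded(r, tolerance))


def healthy_gain(north_star: Reading, guardrails: Dict[str, Reading]) -> bool:
    """A gain only counts if the North Star improved and no guardrail slipped."""
    before, after, higher_is_better = north_star
    improved = after > before if higher_is_better else after < before
    return improved and not breached_guardrails(guardrails)


north_star = (40.0, 46.0, True)              # e.g. minutes listened per user per week
guardrails = {
    "retention": (0.70, 0.65, True),         # fell: worse
    "refund_rate": (0.02, 0.02, False),      # flat: fine
}
print(breached_guardrails(guardrails))       # ['retention']
print(healthy_gain(north_star, guardrails))  # False
```

Here the North Star rose, but the review should still flag the quarter, because retention slipped while it did. The `tolerance` parameter is there so ordinary noise doesn't trip the alarm; what value to give it is a judgment call.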

A worked example

Take a small B2B scheduling tool, call it ShiftLine. (Illustrative figures throughout; the company is invented to show the mechanics.) The team has been reporting output for a year: "11 features shipped, 92% sprint completion." Leadership is happy until renewals come up flat and nobody can explain why.

They reset around an outcome. The value ShiftLine actually delivers is a manager filling a rota without a pile of back-and-forth messages. So the North Star becomes weekly rotas published without manual edits after send, a clean behaviour that means the tool did its job. Under it they hang four inputs: new accounts that publish a first rota within seven days (breadth), average shifts auto-filled per rota (depth), managers returning each week (frequency), and minutes from sign-up to first published rota (efficiency).

Now the quarter reads differently. "Auto-filled rotas rose from 38% to 52%; weekly returning managers held at 71%; churn did not rise." (Again, illustrative numbers.) Two squads each own an input they can move. And when someone proposes a feature that would bump a vanity number, say, total messages sent in-app, the tree makes the awkward question easy: which input does this move, and does the North Star care? If the honest answer is "none," it goes to the bottom of the list. That single screening question is the payoff of the whole exercise.
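The screening question can be sketched in a few lines. The input names below paraphrase ShiftLine's four inputs; the proposals, and the inputs each one is assumed to move, are hypothetical.

```python
# ShiftLine's four inputs from the worked example, keyed by a short label.
INPUTS = {
    "first rota within 7 days": "breadth",
    "shifts auto-filled per rota": "depth",
    "managers returning weekly": "frequency",
    "minutes to first published rota": "efficiency",
}


def screen(proposals):
    """Order proposals so those that move at least one tree input come first.

    `proposals` maps a proposal name to the metrics it is expected to move.
    """
    def moves_nothing(name):
        return not any(metric in INPUTS for metric in proposals[name])

    # sorted() is stable, so proposals keep their original order within each group.
    return sorted(proposals, key=moves_nothing)


# Hypothetical proposals, for illustration only.
proposals = {
    "in-app message counter": ["total messages sent"],
    "one-tap shift swap": ["shifts auto-filled per rota"],
    "guided first rota": ["first rota within 7 days", "minutes to first published rota"],
}
print(screen(proposals))
# ['one-tap shift swap', 'guided first rota', 'in-app message counter']
```

The code only does the sorting. The honest part, deciding which inputs a proposal really moves, still has to come from the team.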

A North Star you can't tie to a customer behaviour is just a slogan with a number bolted on.

Frequently asked questions

What's the difference between a North Star metric and an OKR?

They're complementary, not rivals. The North Star is a durable direction that holds for years; OKRs (Objectives and Key Results) are the time-boxed goals, usually quarterly, you set to move it. Amplitude's own guidance pairs the two: the North Star and its inputs tell you what to measure; OKRs tell you what to push on this quarter. Use the North Star to keep the OKRs honest.

Should every company have just one North Star metric?

One per product or business line is the discipline the framework is built on: the whole value of a North Star is the alignment that comes from a single shared direction. Large organisations with genuinely separate products may run more than one, but resist inventing several for one product to keep every team comfortable. If everything is a priority, nothing is. The 3–5 input metrics exist precisely so different teams get their own lever without fracturing the top-line direction.

Can revenue be a North Star metric?

It's usually a poor one. Revenue is a lagging business result, and most product teams only move it indirectly, which is exactly why Teresa Torres argues for product outcomes (a customer behaviour the team can influence) over business outcomes. A North Star should sit one step closer to the customer than the money, so that improving it reliably leads to revenue rather than just correlating with it.

How do I stop people gaming the metric?

Assume they eventually will (that's Goodhart's Law, not a character flaw) and design for it. Pair the North Star with a guardrail metric that must not degrade (retention, satisfaction, refund rate), review the metric as a conversation rather than a verdict, and never tie individual bonuses to a single number you've asked teams to optimise. The metric is a prompt for better questions, not a stick.

We're pre-product-market fit. Is it too early for this?

The full tree can wait, but the habit shouldn't. Even early, force yourself to name the one customer behaviour that would prove you're onto something, and measure it. That's outcome thinking at its simplest, and it's cheap insurance against shipping a year of output before discovering nobody changed what they do.

Related in the Toolkit

Product strategy & vision: your North Star should fall out of the strategy, not float free of it.
Product lifecycle (launch / grow / mature / exit): the right North Star shifts as a product moves from launch to maturity.
Roadmapping & prioritisation (RICE, MoSCoW, cost of delay): input metrics give roadmap bets something honest to ladder up to.
Discovery, validation & de-risking: how you learn which behaviours actually drive the metric.
MVP & iterative delivery: ship to move an outcome, not to tick off output.
Customer needs identification & latent needs: the value behind the metric starts with a real need.
Usability & guerrilla testing: cheap ways to check the behaviour you're counting on.
Sales process & pipeline management: the commercial pipeline has its own outputs-vs-outcomes trap.

Where to go next

The North Star Playbook (Amplitude, John Cutler): the free, practical source for the metric-plus-inputs model and the North Star tree.
Outcomes Over Output by Joshua Seiden (2019): short, sharp, and the clearest case for measuring behaviour change over shipped features.
Defining Product Outcomes by Teresa Torres: why teams should anchor on a behaviour they can move, not the revenue they can only nudge.
North Star Metric Clinic with John Cutler (LaunchNotes, video): a working session on choosing and pressure-testing a North Star, straight from the playbook's author.
Goodhart's Law: the one-page reminder of why no single metric should ever become a target you chase blindly.