You have probably sat in all three meetings without anyone naming which was which. A sprint planning session that drifted into arguing about quarterly priorities; a quarter-opener that produced a wish-list nobody costed; a cross-team event that mapped dependencies but never asked what success would look like. Sprint, PI and OKR planning are not the same activity at three sizes. They answer three different questions, on three different clocks, and the most common delivery failure is running one when you needed another.

The quick version

  • OKR planning sets the goal, usually quarterly. An Objective (where we're going) plus a few measurable Key Results (how we'll know we got there). It answers "what outcome are we chasing, and why?"
  • PI planning (Program Increment, from SAFe) aligns many teams for the next 8–12 weeks: a two-day event where everyone commits to a shared plan, surfaces dependencies, and names the risks. It answers "who is building what, and where do we collide?"
  • Sprint planning is the delivery loop: one short cycle (often two weeks) where a single team decides the goal, what it can finish, and how. It answers "what will we actually ship next, and how?"
  • They nest: a quarterly OKR sets direction, a PI coordinates the teams, sprints do the work. Run them as one blurred meeting and you get motion without a destination.

The idea in depth: three loops, three questions

Think of it as a set of nested loops, each a different length, each answering a different question. The outer loop is the goal; the middle loop is coordination; the inner loop is delivery. Confusion is what happens when you try to answer a goal question inside a delivery meeting, or never answer it at all and just keep delivering.

Start with the innermost and best-defined loop. Sprint planning comes from Scrum, and the 2020 Scrum Guide is precise about it. The event is timeboxed (a maximum of eight hours for a one-month sprint, usually shorter for shorter sprints) and addresses three topics in order: Why this sprint has value (the Sprint Goal), What can be finished, and How the work gets done. The 2020 revision deliberately promoted "Why" to the front; earlier versions led with "what" and "how," and teams routinely skipped the goal and went straight to filling the sprint with tasks. So the move is: don't let anyone touch the backlog until the team can say, in one sentence, what this sprint is for. If you can't write the Sprint Goal, you're assembling a to-do list, not planning a sprint.
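
One common rule of thumb scales that eight-hour cap linearly with sprint length. The sketch below shows the arithmetic under that assumption; the Scrum Guide itself only says the event is usually shorter for shorter sprints, and the function name and the four-week "month" are choices made for this example.

```python
def sprint_planning_timebox_hours(sprint_weeks: float) -> float:
    """Upper bound on Sprint Planning length, scaled from 8 hours per one-month sprint.

    Assumptions: a one-month sprint is treated as 4 weeks and the cap scales linearly.
    """
    if not 0 < sprint_weeks <= 4:
        raise ValueError("Scrum sprints are one month or less")
    return 8.0 * sprint_weeks / 4.0


for weeks in (1, 2, 4):
    hours = sprint_planning_timebox_hours(weeks)
    print(f"{weeks}-week sprint: at most {hours:g} hours")
```

For the common two-week sprint this gives a four-hour ceiling, which is a cap on the meeting, not a target to fill.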

The outermost loop is OKR planning, and it has the deepest evidence behind it. OKRs (Objectives and Key Results) were adapted at Intel by Andy Grove from Peter Drucker's Management by Objectives, documented in Grove's 1983 book High Output Management, and carried to Google in 1999 by John Doerr, who'd learned the method from Grove as a young Intel employee (Doerr later wrote the popular account, Measure What Matters, 2018). The reason a well-formed Objective beats a task list isn't folklore. It rests on one of the most replicated findings in organisational psychology: Edwin Locke and Gary Latham's goal-setting theory, summarised in their 2002 American Psychologist paper. Their finding: specific, difficult goals reliably outperform vague "do your best" goals, provided the person is committed, has the ability, and gets feedback on progress. That last clause is the whole game. So the move is: write Key Results as measurable outcomes ("cut median onboarding time from 9 days to 3"), not activities ("ship the onboarding redesign"), and check them often enough that the feedback loop actually exists.

A goal works because it's specific, hard, and you can see whether you're hitting it, not because it's written on a slide.

An honest limitation. The same research that makes OKRs work also marks their edge. Locke and Latham found that specific, hard goals can degrade performance on novel, complex tasks: when you don't yet know how to do the work, a rigid output target crowds out the learning you need first. Later critics (the 2009 Academy of Management Perspectives paper "Goals Gone Wild") catalogued the side effects of aggressive goals: tunnel vision, gaming the metric, ethical corner-cutting. So OKRs are a sharp instrument for work you broadly know how to do, and a blunt one for genuine discovery. For discovery work, set a learning goal ("find out whether X is feasible"), not a delivery number.

The middle loop: PI planning and the problem of many teams

Sprints work for one team; OKRs work at any altitude. The gap they leave is coordination: what happens when ten teams share a goal and trip over each other's dependencies. That is the problem PI planning exists to solve. It comes from the Scaled Agile Framework (SAFe), which defines PI planning as a cadence-based event for an entire Agile Release Train, typically 50 to 125 people across multiple teams, held every 8 to 12 weeks. Over two days, the teams plan the next increment together: each drafts its plan, the room surfaces cross-team dependencies, risks are named and owned, and teams commit to shared objectives. SAFe's stated benefits are mostly about visibility, face-to-face communication, matching demand to capacity, and "a holistic, transparent view of where and when value will be delivered."

The point worth holding onto is that PI planning is not really a planning technique; it is a synchronisation technique. Its value is the two days of teams in one room discovering, out loud, that Team A's February deliverable depends on a January deliverable from Team B, a link nobody had made. So the move is: even if you never adopt SAFe wholesale, borrow its best habit. Once a quarter, get the teams that share a goal into one room to map dependencies and name risks before the work starts, not when it's already late.

flowchart TD
  O(["Quarterly OKR<br/>the goal: outcome + key results"]) --> P(["PI planning · every 8–12 weeks<br/>align many teams, map dependencies"])
  P --> S1(["Sprint planning · ~2 weeks<br/>Why → What → How, one team"])
  P --> S2(["Sprint planning · ~2 weeks<br/>another team, shared goal"])
  S1 --> D(["Ship · review · adjust"])
  S2 --> D
  D -.->|"feedback"| O
Three nested loops: the goal sets direction, the PI coordinates the teams, sprints deliver, and what ships feeds back to the goal. Leaders Loop

An honest limitation. PI planning is expensive (two days of 50-plus people, every quarter), and SAFe itself is contested. Critics argue it reintroduces heavyweight, top-down planning under an agile label, and that the ceremony can become theatre: a choreographed two days producing a wall of sticky notes nobody revisits. The honest test is whether the dependencies you surface actually change the plan. If the same risks appear every quarter and nothing moves, you're paying for a ritual, not coordination. Shrink the event or fix the structural problem underneath it.

A worked example

Take a payments company, call it Tessellate, with four engineering teams and a goal to win mid-market customers. (Illustrative throughout; teaching example, not a real company.) Leadership opens the quarter with one Objective: "Make Tessellate the obvious choice for a 50–200-person business." Underneath it sit three Key Results, each a number with a baseline: cut time-to-first-payment from 9 days to 3; lift trial-to-paid conversion from 18% to 28%; reach 95% uptime on the new dashboard. Note what these are not: "redesign onboarding" or "improve reliability." They are outcomes the team can watch move, which is exactly what Locke and Latham's feedback condition requires.

Next, the four teams hold a one-day PI-style planning session (Tessellate is too small for SAFe's full two-day train, so it borrows the habit at its own scale). Within an hour the room finds the collision: the conversion KR depends on the onboarding team's work, but onboarding depends on the platform team finishing an authentication change first, a dependency nobody had drawn. They sequence it, name the platform lead as risk owner, and adjust the plan before a line of code is written.

flowchart LR
  A(["KR: trial→paid 18% → 28%"]) --> B{"depends on<br/>onboarding rebuild?"}
  B -->|"yes"| C(["but onboarding needs<br/>the auth change first"])
  C --> D(["sequence it · name a<br/>risk owner · re-plan"])
  D --> E(["sprint 1: auth change"])
  E --> F(["sprint 2–3: onboarding<br/>+ conversion experiment"])
The dependency the quarter would have tripped over, caught in an hour of teams-in-a-room, not in week six.
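
Sequencing work once the dependencies are on the wall is a topological sort. Here is a minimal sketch using Python's standard-library `graphlib`; the three work-item names are taken from the Tessellate example, and the dictionary layout (each item mapped to the items it waits on) is just one convenient way to record what the room discovered.

```python
from graphlib import TopologicalSorter

# Each key waits on every item in its set (illustrative names from the example).
depends_on = {
    "conversion experiment": {"onboarding rebuild"},
    "onboarding rebuild": {"auth change"},
    "auth change": set(),
}

# static_order() yields each item only after everything it depends on.
order = list(TopologicalSorter(depends_on).static_order())
print(order)
# ['auth change', 'onboarding rebuild', 'conversion experiment']
```

If two teams turn out to depend on each other, `static_order()` raises `graphlib.CycleError`, which is the code-level version of a risk that needs an owner before the quarter starts.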

Now the inner loop runs. Each fortnight, the onboarding team's sprint planning opens with a Sprint Goal ("a new user can take their first test payment without a support ticket") and only then fills the sprint with work that serves it. At review they measure against the KR baseline, see time-to-first-payment fall from 9 days to 5, and feed that back up: still short of 3, so the next sprint goal sharpens. Each loop does a distinct job: the OKR holds the destination steady, the PI keeps four teams from colliding, and the sprints convert the goal into shipped, measured work. Blur them into one weekly meeting and you'd have lost the dependency, the goal, or both.
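
The review arithmetic is worth making explicit: progress is the share of the baseline-to-target distance already covered. A minimal sketch with the numbers from the example; the `KeyResult` class and its `progress` method are hypothetical names for illustration, not part of any OKR tool.

```python
from dataclasses import dataclass


@dataclass
class KeyResult:
    name: str
    baseline: float
    target: float

    def progress(self, current: float) -> float:
        """Fraction of the baseline-to-target distance covered.

        Works for metrics that should fall (9 days -> 3) or rise (18% -> 28%).
        """
        return (current - self.baseline) / (self.target - self.baseline)


kr = KeyResult("time-to-first-payment (days)", baseline=9, target=3)
print(f"{kr.progress(5):.0%}")  # 67%: four of the six days have been cut
```

Recording the baseline alongside the target is what makes this computable at all; a Key Result written as an activity has no distance to measure.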

Frequently asked questions

Do I need all three? They sound like a lot of meetings.

No, and most teams shouldn't run all three by the book. A single team with no cross-team dependencies needs OKRs plus sprints and can skip PI planning entirely; the middle loop only earns its cost when several teams share a goal and collide. The error isn't having three loops; it's running ceremonies whose question you don't actually have. Keep the loop, drop the ritual you don't need.

What's the difference between a Key Result and a sprint task?

Altitude and type. A Key Result is a measurable outcome for the quarter ("conversion 18% → 28%"); a sprint task is a unit of work for the next two weeks ("build the new signup form"). The task is something you do; the Key Result is something that changes as a result. If your Key Results read like a task list, they've collapsed into the inner loop and lost the point of having an outer one.

Should OKRs be set top-down or bottom-up?

Both, deliberately. Doerr's account of OKRs at Google stresses that roughly half should originate from teams, not be handed down, and Locke and Latham found goal commitment is a precondition for goals working at all. A target imposed without buy-in is one people quietly ignore. In practice: leadership sets the direction and the boldest objectives; teams propose the Key Results they'll own, then both sides negotiate. Pure top-down kills commitment; pure bottom-up loses alignment.

How aggressive should OKRs be? I've heard you should aim to fail.

That's the Google "stretch" convention (aim so high that hitting ~70% is a strong result), and it's a style choice, not a law. It suits ambitious, well-understood work and motivated teams. It backfires on committed delivery (a regulatory deadline isn't a 70% target) and on novel work, where "Goals Gone Wild" warns aggressive numbers invite gaming and tunnel vision. Split them: stretch goals for growth bets, committed goals for things that simply have to land.
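
One way to keep the two kinds apart is to read the same score differently depending on the goal's type. The sketch below assumes progress is a 0-to-1 fraction, uses the ~70% bar mentioned above for stretch goals, and expects committed goals to land in full; the function name, labels and exact thresholds are illustrative choices, not a standard.

```python
def assess(progress: float, kind: str) -> str:
    """Interpret a Key Result score according to its goal type.

    Assumptions: stretch goals count as strong from 0.7 upward;
    committed goals only count when fully met.
    """
    if kind == "stretch":
        return "strong" if progress >= 0.7 else "short"
    if kind == "committed":
        return "landed" if progress >= 1.0 else "missed"
    raise ValueError(f"unknown goal kind: {kind!r}")


print(assess(0.7, "stretch"))    # strong
print(assess(0.7, "committed"))  # missed
```

The same 70% is a good quarter on a growth bet and a failure on a regulatory deadline, which is why the type has to be declared when the goal is set, not at review.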

Does this only work for software teams?

The mechanics generalise; the vocabulary doesn't always. Goal-setting theory is domain-agnostic: specific, hard, feedback-rich goals lift performance in factories and sales teams as much as in engineering, and OKRs are used well beyond software. Sprints and PI planning are more software-shaped (they assume iterative, decomposable work), but the underlying pattern (a goal loop, a coordination loop, a delivery loop) maps onto most operational work with several teams and a moving target.

Related in the Toolkit

These three loops are the planning layer that sits on top of a delivery methodology. Sprints assume Scrum or a Scrum-like cadence underneath, and they only pay off if you can read whether the work is actually flowing, which is the job of your delivery metrics.
