Stochastic vs deterministic models: deciding under uncertainty

Your finance team hands you a number: the project lands in eleven weeks and costs £240,000. It is precise, confident, and almost certainly the one thing you can be sure won't happen. Not because anyone lied, but because the model behind it was asked the wrong kind of question. It was built to produce a single answer to a future that has many.

The quick version

A deterministic model maps fixed inputs to one fixed output: same inputs, same answer, every time. A stochastic model treats uncertain inputs as ranges and gives you a distribution of outcomes with probabilities attached.
Most plans use deterministic models fed with averages, which is fine until the relationships turn non-linear. Then the average of the outcomes stops equalling the outcome of the averages. This is the "flaw of averages."
The practical fix has nothing to do with more precision. It's a change of question, from "what will the number be?" to "what's the spread, and how bad is the bad end?"
You don't need new software to start. Just stop committing to single-point forecasts and start asking for the range.

The idea in depth

The distinction is older and simpler than the jargon suggests. A deterministic system contains no randomness: give it the same inputs and it returns the same output, like a calculator. A stochastic system (from the Greek stókhos, "aim" or "guess") carries randomness inside it: the inputs are uncertain, so the output is a range of possibilities rather than one value. A mortgage table that says £1,420 a month is deterministic. A weather forecast that says "70% chance of rain" is stochastic: it has stopped pretending to know and started telling you the odds.

Neither is "better." The error is using one where the situation calls for the other, and the most common version of that error has a name. In The Flaw of Averages (Wiley, 2009), Stanford management-science professor Sam Savage describes what happens when someone "plugs a single number into a spreadsheet to represent an uncertain future quantity" and trusts the answer that drops out. His one-line verdict has become a kind of folk theorem among forecasters: plans based on average assumptions are wrong, on average.

"Plans based on average assumptions are wrong, on average." (Sam Savage, The Flaw of Averages)

Why the average lies

This isn't sloppiness; it's mathematics. If the relationship between your inputs and your outcome is a straight line, averaging the inputs is harmless. But almost nothing important in a business is a straight line. Deadlines compound. Costs hit thresholds. A queue that's fine at 80% utilisation falls apart at 95%. Whenever the curve bends, Jensen's inequality kicks in. The Danish mathematician Johan Jensen proved it back in 1906: for a curved function, the average of the outputs is not, in general, the output of the average. The two come apart, sometimes dramatically, and not in a friendly, symmetric way.
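To put rough numbers on the queue example, here is a minimal sketch. It assumes the textbook single-server (M/M/1) formula for the average number of jobs in the system, utilisation / (1 - utilisation), and a hypothetical utilisation that is equally likely to fall anywhere between 75% and 95%. Neither assumption comes from the article; they are there only to show the gap opening up.

```python
# Assumption: mean number of jobs in a simple M/M/1 queue is rho / (1 - rho).
def jobs_in_system(utilisation: float) -> float:
    return utilisation / (1.0 - utilisation)

# Hypothetical spread: utilisation equally likely anywhere from 75% to 95%.
# An evenly spaced grid stands in for random samples so the result is repeatable.
N = 10_000
utilisations = [0.75 + 0.20 * (i + 0.5) / N for i in range(N)]

average_utilisation = sum(utilisations) / N
outcome_of_average = jobs_in_system(average_utilisation)
average_of_outcomes = sum(jobs_in_system(u) for u in utilisations) / N

print(f"average utilisation:     {average_utilisation:.2f}")
print(f"outcome of the average:  {outcome_of_average:.2f} jobs waiting or in service")
print(f"average of the outcomes: {average_of_outcomes:.2f} jobs waiting or in service")
```

Under these assumptions the plan built on the average utilisation (85%) expects under six jobs in the system, while the true average across the range is about seven. The gap comes entirely from the curve bending upward near full utilisation.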

The classic illustration is the drunk on the highway. On average, he's standing on the centre line, a perfectly safe place to be. Plan around that average and he survives. But his position is uncertain, and the payoff (getting hit) is wildly non-linear at the edges. The average hides exactly the outcomes that matter. Projects behave the same way: a task whose duration averages out fine can still blow the deadline far more often than the single-number plan implies, because the late tail is longer than the early one.
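The same point can be sketched in a few lines of Python. The numbers are invented for illustration: a safe strip half a metre either side of the centre line, and a position that wanders around the centre with a standard deviation of three metres.

```python
import random

rng = random.Random(42)  # fixed seed so the run is repeatable

SAFE_HALF_WIDTH_M = 0.5  # hypothetical: width of the safe strip on each side of the centre line
WANDER_SD_M = 3.0        # hypothetical: how far he typically strays from the centre

def in_traffic(position_m: float) -> bool:
    """True when the position is outside the safe strip around the centre line."""
    return abs(position_m) > SAFE_HALF_WIDTH_M

positions = [rng.gauss(0.0, WANDER_SD_M) for _ in range(100_000)]

average_position = sum(positions) / len(positions)
share_in_traffic = sum(in_traffic(p) for p in positions) / len(positions)

print(f"average position: {average_position:+.2f} m from the centre line")
print(f"in traffic at the average position? {in_traffic(average_position)}")
print(f"share of moments actually spent in traffic: {share_in_traffic:.0%}")
```

With these made-up parameters the average position is safely on the centre line, yet most individual moments are spent in the traffic. The outcome at the average says nothing about the average outcome.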

flowchart LR
  A(["Uncertain inputs<br/>cost, duration, demand"]) --> B{Model type?}
  B -->|Deterministic| C(["Plug in the average<br/>of each input"])
  C --> D(["One clean number<br/>e.g. 11 weeks"])
  B -->|Stochastic| E(["Sample the full range<br/>run it many times"])
  E --> F(["A distribution<br/>P10 / P50 / P90"])
  D -.->|"hides the spread<br/>(flaw of averages)"| F

Same inputs, two questions: a single answer versus the shape of all the answers. Leaders Loop

So the move is to treat any single-number forecast as a claim about an average, and immediately ask the follow-up question it conceals: "Average of what range, and what does the bad tail look like?" A forecast of eleven weeks that could plausibly be nine or twenty is a different decision from one that could be ten or twelve, even though both "average" eleven.

The stochastic alternative: simulate, don't single-guess

If averaging the inputs is the trap, the escape is to keep the uncertainty alive all the way through the model instead of collapsing it at the start. You replace each fixed input with a distribution (a range of plausible values and how likely each is), and then run the model hundreds or thousands of times, drawing a fresh random value each pass. The output isn't one number; it's a histogram of what could happen.

This technique is Monte Carlo simulation, and it has a real pedigree. It was formalised by Nicholas Metropolis and Stanislaw Ulam in their 1949 paper "The Monte Carlo Method" in the Journal of the American Statistical Association, born out of nuclear-physics problems at Los Alamos that were too tangled to solve with a clean equation. The trick was to stop trying to compute the answer and instead sample the randomness directly. That is the same logic a casino runs on, which is where the name comes from. What once needed a national lab now runs in a spreadsheet over lunch.

The payoff is a vocabulary that single-point forecasts can't offer. Instead of "eleven weeks," you get the P50 (a coin-flip: half the runs finish faster, half slower), the P90 (you beat this date in 90% of futures), and the shape in between. Suddenly you can make a commitment at the P90 and hold an ambition at the P50, and everyone knows which is which.

flowchart TD
  A(["Pick the few inputs<br/>that actually drive the result"]) --> B(["Give each a range<br/>not a single number"])
  B --> C(["Run the model<br/>1,000+ times, sampling each input"])
  C --> D(["Read the distribution"])
  D --> E(["P50, the realistic middle"])
  D --> F(["P90, what you can commit to"])
  D --> G(["The tail, how bad is bad?"])

A Monte Carlo loop: pick the key inputs, range them, run it many times, read the spread.
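A minimal sketch of that loop, applied to the two eleven-week forecasts from earlier: one that could plausibly be nine or twenty weeks, and one that could be ten or twelve. The triangular shape, the choice of eleven as the most likely value in both, and the `percentile` helper are all assumptions made for illustration, not part of any particular tool.

```python
import math
import random

def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample with at least p% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]

rng = random.Random(7)  # fixed seed so the run is repeatable
RUNS = 10_000

# Assumed shapes, in weeks. random.triangular takes (low, high, mode).
wide = [rng.triangular(9, 20, 11) for _ in range(RUNS)]     # "could be nine or twenty"
narrow = [rng.triangular(10, 12, 11) for _ in range(RUNS)]  # "could be ten or twelve"

for name, samples in (("wide", wide), ("narrow", narrow)):
    print(f"{name:>6}: P50 = {percentile(samples, 50):.1f} weeks, "
          f"P90 = {percentile(samples, 90):.1f} weeks")
```

Both forecasts would be reported as "eleven weeks" in a single-number plan. Under these assumed shapes the narrow one has a P90 still inside twelve weeks, while the wide one has a P90 several weeks later, which is the difference the single number conceals.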

The honest limitation: a stochastic model is only as good as the distributions you feed it, and those are often guesses dressed as data. Worse, it only answers questions in what the economist Frank Knight called the realm of risk (outcomes whose odds you can estimate), not the realm of uncertainty, where you can't ("Risk, Uncertainty and Profit," 1921). A simulation will not warn you about the supplier going bust or the regulation that didn't exist when you built the model. It quantifies the futures you imagined; it is silent on the ones you didn't. Knowing the difference is its own discipline; see our Toolkit piece on risk vs uncertainty vs ambiguity. A stochastic model is a sharper lens, not a crystal ball.

A worked example

A regional operations lead, call her Priya, needs to commit a delivery date to the board for a system migration with four sequential stages. Her team gives her a tidy estimate (figures below are illustrative): stages of 2, 3, 2 and 3 weeks. She adds them up, gets ten weeks, and is about to promise it.

Then she asks for the range on each stage instead of the average. Each, it turns out, is "probably as estimated, but could run up to twice as long if a dependency slips." That asymmetry (stages can run long far more easily than they can run short) is exactly where the flaw of averages lives. The team builds a quick Monte Carlo model: each stage drawn from its own range, the four added up, the whole thing run a thousand times.

The deterministic answer was ten weeks. The simulation's P50 comes out around twelve, because the long tails of four stages stack up. Hitting ten weeks happens in maybe one run in five. To clear a 90% confidence bar, the kind you'd want before promising a board, she needs roughly fifteen. Same data, same team, no new pessimism. The difference is that Priya now commits to fifteen weeks as a date she'll hit, while privately driving toward twelve, rather than promising ten and explaining a slip later. The model didn't make her cautious; it made her honest about the odds.
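For readers who want to try this, here is a minimal sketch of the kind of model Priya's team might build. The article does not say what shape each stage's range takes, so the triangular distribution below (most likely value equal to the estimate, at best 40% faster, at worst twice as long) is an assumption. The exact percentiles move with that choice and will not necessarily match the illustrative figures above.

```python
import random

STAGE_ESTIMATES_WEEKS = [2, 3, 2, 3]  # the four sequential stages from the example
RUNS = 1_000                          # "run a thousand times"
rng = random.Random(2024)             # fixed seed so the run is repeatable

def simulate_total_weeks() -> float:
    """One pass: draw each stage's duration from its range and add them up."""
    # Assumed shape: triangular(low, high, mode) with mode = estimate,
    # low = 0.6 x estimate, high = 2 x estimate ("up to twice as long").
    return sum(rng.triangular(0.6 * e, 2.0 * e, e) for e in STAGE_ESTIMATES_WEEKS)

totals = sorted(simulate_total_weeks() for _ in range(RUNS))

deterministic_total = sum(STAGE_ESTIMATES_WEEKS)
p50 = totals[RUNS * 50 // 100 - 1]  # half the runs finish by this point
p90 = totals[RUNS * 90 // 100 - 1]  # 90% of the runs finish by this point
chance_of_ten = sum(t <= deterministic_total for t in totals) / RUNS

print(f"deterministic sum: {deterministic_total} weeks")
print(f"P50: {p50:.1f} weeks, P90: {p90:.1f} weeks")
print(f"share of runs finishing within {deterministic_total} weeks: {chance_of_ten:.0%}")
```

Even with this fairly generous assumption about finishing early, the median lands above the ten-week sum and the P90 later still, because the long side of each stage's range outweighs the short side. Widening or narrowing the assumed ranges changes the numbers, not the pattern.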

Frequently asked questions

Is a stochastic model just a deterministic model with randomness added?

Roughly, yes, and that's the useful way to think about it. You keep the same logic, swap fixed inputs for probability distributions, and run it many times. The structure doesn't change; the output does. Instead of one number you get a range of outcomes with frequencies, which is a far more honest description of a future you can't pin down.

When is a deterministic model the right choice?

When the inputs really are fixed (a loan repayment, a tax rate), when the relationships are close to linear, or when you just want a fast sense of scale before deciding whether deeper analysis is worth it. Deterministic models are cheaper, quicker and easier to explain. The danger is never the model; it's mistaking its single answer for a guarantee.

What exactly is the "flaw of averages"?

It's Sam Savage's name for the systematic error of feeding average inputs into a model and trusting the average output. Because most real relationships bend, the average of the outcomes usually isn't the outcome of the averages (this is Jensen's inequality at work). So plans built on averages are, in his phrase, wrong on average. They are typically optimistic too, because the painful tail is longer than the lucky one.

Do I need special software to build a stochastic model?

No. A Monte Carlo simulation is just "run the model many times with varied inputs and look at the spread," and a spreadsheet with a few hundred rows of random samples does it. Dedicated add-ins make it tidier and faster, but they're a convenience, not a requirement. The mindset shift matters more than the tooling.

Will a stochastic model tell me the right answer?

No model does. It gives you a better-calibrated picture of what could happen and how likely each outcome is, which is a real improvement over a confident single guess. But it's still bounded by the distributions you chose, and it's blind to risks you never imagined. It raises the quality of the decision, not the certainty of the result.

Related in the Toolkit

Risk vs uncertainty vs ambiguity: the line that tells you when a stochastic model can even help, and when you're past what any simulation can price.
Decision theory & expected value: what to do with a distribution once you have one; weighing outcomes by their odds.
Bayesian reasoning, priors & updating: how to set the input distributions in the first place, and revise them as evidence arrives.
Real options & preserving optionality: why a wide outcome range is an argument for staging commitments rather than betting it all up front.
Descriptive statistics (mean, median, mode, variance, SD): the mean is the input to the flaw of averages; variance is what it hides.
First principles vs heuristics vs analogical reasoning: choosing when to model from the ground up versus reason by rule of thumb.
Game theory & strategic interaction (zero-sum vs positive-sum): uncertainty that comes from other people's choices, not from random draws.
Macroeconomics (GDP, inflation, interest rates, the cycle): a live arena where deterministic forecasts routinely meet stochastic reality.

Where to go next

Sam Savage, The Flaw of Averages (Wiley, 2009): the seminal, readable book on why single-number plans mislead, written for managers rather than statisticians.
"Probability Management, A Cure for the Flaw of Averages" (Sam Savage, 2019): a talk that walks through the core examples on screen if you'd rather watch than read.
Metropolis & Ulam, "The Monte Carlo Method" (1949): the original paper that launched simulation; short, and a fascinating glimpse of where the idea came from.
Jensen's inequality (overview): the mathematics behind why averaging non-linear things misleads, with a clear statement of E[f(X)] ≥ f(E[X]) for convex functions.
Frank Knight, Risk, Uncertainty and Profit (1921, full text): the century-old source of the risk/uncertainty distinction that tells you the limits of any simulation.