Your support team's average ticket-resolution time is four hours. Reassuring, until you notice that half the tickets close within twenty minutes and a long tail drags on for two days. The "average" describes almost no real customer's experience. The number was never the problem; trusting it without looking at the spread was.
The quick version
- A distribution is the shape of your data: how values pile up, where they cluster, and how far the extremes reach. The average is one dot on that shape, not the shape itself.
- A percentile is a rank: the 90th percentile is the value below which 90% of your data falls. The median is just the 50th percentile.
- Quartiles are the three cut points (the 25th, 50th and 75th percentiles) that chop the data into four equal-sized groups, giving you a fast read on the typical case and the spread around it.
- When data is skewed or has outliers, the median and percentiles tell the truth; the mean quietly lies.
The idea in depth: shape beats summary
Every dataset has a distribution: a description of which values occur and how often. Picture every data point dropped into a bin by size; the heights of those bins are the shape. Some shapes are symmetric and bell-like (values cluster in the middle and thin out evenly on both sides). Many real business shapes are not: resolution times, deal sizes, page-load latency and salaries are usually right-skewed, with a hump of ordinary values and a long tail of large ones.
The skew is exactly where the mean betrays you. The mean adds everything up and divides, so a few giant values haul it upward. The median (the middle value when you line everyone up) ignores how extreme the extremes are and reports the genuine midpoint. In a right-skewed distribution the mean sits to the right of the median, and the gap between them is a quick tell that the shape is lopsided. So the move is simple: whenever someone quotes an average, ask for the median beside it. If the two diverge, the average is hiding a tail, and you should be looking at the whole shape, not the one number. (For the wider family of mean, median, mode and spread, see descriptive statistics.)
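To make the mean-versus-median check concrete, here is a minimal Python sketch. The ticket times are invented for illustration, and `mean_vs_median` is a hypothetical helper name, not part of any tool mentioned here.

```python
from statistics import mean, median

def mean_vs_median(values):
    """Return (mean, median, mean/median ratio) for a list of positive numbers."""
    avg = mean(values)
    mid = median(values)
    return avg, mid, avg / mid

# Hypothetical ticket-resolution times in minutes (illustrative, not real data).
symmetric = [18, 19, 20, 21, 22]
right_skewed = [15, 18, 20, 20, 22, 25, 30, 45, 60, 2880]  # one two-day ticket

for label, data in [("symmetric", symmetric), ("right-skewed", right_skewed)]:
    avg, mid, ratio = mean_vs_median(data)
    print(f"{label:>12}: mean={avg:.1f}  median={mid:.1f}  mean/median={ratio:.1f}")
```

On the symmetric list the two numbers agree. On the skewed list a single two-day ticket pulls the mean to more than ten times the median, which is the gap worth asking about.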
flowchart LR
A(["A single number
(the average)"]) --> B{"Does it describe
most real cases?"}
B -->|"Symmetric data"| C("Mean ≈ median
average is safe")
B -->|"Skewed / outliers"| D("Mean is pulled
by the tail")
D --> E("Read the median
and percentiles instead")
Percentiles: the rank that survives outliers
A percentile answers "where does this value rank?" The 90th percentile is the point below which 90% of observations fall. Because it is a position in the ordered data rather than an arithmetic average, a single absurd outlier can't drag it around, which makes percentiles the honest way to talk about the typical case and the worst case together. This is why engineering teams set service-level objectives on p95 or p99 latency rather than the mean. Jeffrey Dean and Luiz André Barroso laid out the maths in The Tail at Scale (Communications of the ACM, 2013): if a request fans out to 2,000 servers and each has just a 1-in-10,000 chance of a slow response, almost one request in five ends up waiting on at least one slow server. The rare hiccup, multiplied across the system, becomes the common experience, so the slow tail, not the comfortable average, is what users actually feel. The same logic applies well outside software. In delivery times, claims processing, or call-centre waits, your reputation is set by the 95th percentile, not the mean.
So the move: pick the percentile that matches the promise you're making. Reporting "average wait under 5 minutes" is meaningless if your p95 is 40 minutes. State the target as a percentile ("95% of calls answered within 5 minutes") and you've committed to the experience most customers actually get.
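The fan-out arithmetic can be checked in a few lines. This sketch assumes each server's slow responses are independent of the others, which is a simplification, and the function name is hypothetical.

```python
def share_of_slow_requests(servers: int, p_slow_per_server: float) -> float:
    """Probability that at least one of `servers` independent calls is slow."""
    # A request is fast only if every server it touches is fast.
    return 1 - (1 - p_slow_per_server) ** servers

# Figures from the paragraph above: 2,000 servers, 1-in-10,000 slow responses.
share = share_of_slow_requests(2_000, 1 / 10_000)
print(f"{share:.1%} of requests wait on at least one slow server")
```

Under that independence assumption the result is roughly 18%, which is the "almost one in five" quoted above.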
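As a rough illustration of how an average can pass while a percentile target fails, the sketch below uses invented call waits. It computes the p95 with the simple nearest-rank definition, which is only one of several conventions, and both helper names are hypothetical.

```python
import math
from statistics import mean

def share_within(values, limit):
    """Fraction of observations at or below `limit`."""
    return sum(v <= limit for v in values) / len(values)

def nearest_rank_percentile(values, pct):
    """Smallest observation with at least pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical call waits in minutes: 37 quick calls and 3 very slow ones.
waits = [1] * 20 + [2] * 17 + [40, 42, 45]

print("mean wait:", mean(waits))                       # under 5 minutes
print("p95 wait:", nearest_rank_percentile(waits, 95))
print("answered within 5 min:", share_within(waits, 5))
```

Here the mean sits below 5 minutes, yet only 92.5% of calls meet the 5-minute mark and the p95 is 40 minutes, so a "95% within 5 minutes" promise would be missed.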
Quartiles and the five-number summary
Quartiles are just three specific percentiles that carve the data into four equal-sized quarters: the lower quartile (Q1, the 25th percentile), the median (Q2, the 50th), and the upper quartile (Q3, the 75th). The distance between Q1 and Q3, the interquartile range (IQR), is the span of the "middle half" of your data, a robust measure of spread that ignores the extremes entirely.
This packaging comes from John Tukey, whose 1977 book Exploratory Data Analysis introduced the five-number summary (minimum, Q1, median, Q3, maximum) and the box plot that draws it. Tukey's project was to put the data front and centre and let its shape reveal itself before any modelling. A box plot renders the five numbers as a box (Q1 to Q3) with a line at the median and "whiskers" to the reasonable extremes, with stray points flagged as candidate outliers. It is still the fastest way to compare the shape of several groups (five regions, five reps, or five product lines) side by side on one chart. Next time you compare segments, ask for box plots rather than a bar chart of averages. The bars hide the spread; the boxes show it.
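The five-number summary is easy to compute with Python's standard library. The sketch below is illustrative only: the deal sizes are invented, the helper names are hypothetical, and the 1.5 × IQR cut-off for flagging candidate outliers is the usual box-plot convention rather than something stated above.

```python
from statistics import quantiles

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) using the 'inclusive' quantile method."""
    ordered = sorted(values)
    q1, q2, q3 = quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, q2, q3, ordered[-1]

def candidate_outliers(values, k=1.5):
    """Points more than k * IQR outside the box (assumed convention, k=1.5)."""
    _, q1, _, q3, _ = five_number_summary(values)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical deal sizes in thousands (illustrative only).
deals = [4, 5, 5, 6, 7, 8, 9, 10, 12, 14, 60]

print(five_number_summary(deals))   # (4, 5.5, 8.0, 11.0, 60)
print(candidate_outliers(deals))    # [60]
```

The middle half of deals spans 5.5 to 11, and the single 60 is flagged as a point to look at rather than silently averaged in.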
flowchart LR
A(["All data,
sorted low to high"]) --> Q1(["Q1, 25th pct"])
A --> M(["Median, 50th pct"])
A --> Q3(["Q3, 75th pct"])
Q1 --> IQR(["IQR = Q3 − Q1
the middle half"])
Q3 --> IQR
M --> S(["Five-number summary:
min · Q1 · median · Q3 · max"])
Q1 --> S
Q3 --> S
An honest limitation. There is no single, universal formula for a percentile. When a percentile falls between two data points, software has to interpolate. Rob Hyndman and Yanan Fan, in Sample Quantiles in Statistical Packages (The American Statistician, 1996), catalogued nine different definitions in common use and recommended one as the most defensible default. The practical upshot for a leader: on small samples, Excel, R, Python and your BI tool can each report a slightly different "75th percentile" from identical data. The differences are usually trivial and vanish as samples grow, but if two dashboards disagree by a hair, this, rather than a data error, is often why. Agree on one tool's method and move on.
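You can see the disagreement without leaving one tool. Python's `statistics.quantiles` offers two methods, "exclusive" (its default) and "inclusive"; the sketch below runs both on the same invented sample. The helper names are hypothetical, and measuring the gap as a share of the data range is just one way to express how much it matters.

```python
from statistics import quantiles

def quartiles_by_method(values):
    """Return the three quartiles under both methods statistics.quantiles offers."""
    return {
        "exclusive": quantiles(values, n=4, method="exclusive"),  # the default
        "inclusive": quantiles(values, n=4, method="inclusive"),
    }

def q3_gap_share(values):
    """Disagreement on Q3 between the two methods, as a share of the data range."""
    by_method = quartiles_by_method(values)
    gap = abs(by_method["exclusive"][2] - by_method["inclusive"][2])
    return gap / (max(values) - min(values))

small = list(range(1, 11))      # hypothetical sample: 1..10
large = list(range(1, 1001))    # hypothetical sample: 1..1000

print(quartiles_by_method(small))
# {'exclusive': [2.75, 5.5, 8.25], 'inclusive': [3.25, 5.5, 7.75]}
print(q3_gap_share(small), q3_gap_share(large))
```

Both methods agree on the median but not on Q1 or Q3, and the disagreement shrinks from about 5.6% of the range on ten points to about 0.05% on a thousand.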
A worked example: reading a team's cycle time
Suppose your engineering lead reports that the team's average delivery cycle time last quarter was 8 days (figures here are illustrative). It sounds healthy. Then you ask for the five-number summary and get: minimum 1 day, Q1 3 days, median 4 days, Q3 7 days, maximum 46 days.
Now the story changes. The median is 4 days: half of all work shipped within four. The mean of 8 is double that, dragged up by a handful of items near the 46-day maximum. The middle half of work (the IQR) lands between 3 and 7 days, which is genuinely tight. The team isn't slow; it has a tail problem. A few items are getting stuck, and those are what make stakeholders feel deliveries are unpredictable.
The average pointed you at the wrong intervention: "the team needs to speed up." The percentiles point you at the right one: leave the healthy middle alone and hunt the tail. What do the items past the 90th percentile have in common: a particular dependency, a review bottleneck, a class of ambiguous request? Fix the tail and your average improves dramatically without anyone working faster, because you removed the few outliers that were inflating it. That is the difference between managing the number and managing the shape. (Whether "tail items share a cause" is a real driver or a coincidence is a separate question; see correlation vs causation before you act.)
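For readers who want to replay this example, here is a sketch with an invented list of 21 cycle times, constructed only so that it reproduces the illustrative summary (mean 8; minimum 1, Q1 3, median 4, Q3 7, maximum 46). Capping the tail at the p90 value is a crude what-if, not a forecast.

```python
from statistics import mean, quantiles

# Hypothetical cycle times in days, built to match the illustrative summary.
cycle_days = [1, 2, 2, 2, 3, 3, 3, 3, 4, 4, 4, 4, 5, 5, 6, 7, 8, 9, 14, 33, 46]

q1, med, q3 = quantiles(cycle_days, n=4, method="inclusive")
p90 = quantiles(cycle_days, n=10, method="inclusive")[-1]

# The items past the 90th percentile are the ones worth investigating.
tail = [d for d in cycle_days if d > p90]

# What-if: the stuck items had shipped at the p90 value instead.
capped = [min(d, p90) for d in cycle_days]

print("mean:", mean(cycle_days), "median:", med, "middle half:", (q1, q3))
print("p90:", p90, "tail items:", tail)
print("mean with tail capped at p90:", round(mean(capped), 1))
```

In this made-up data, two items sit past the 90th percentile, and capping them alone drops the mean from 8 days to about 5.6 while the median stays at 4.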
Manage the shape, not the summary. The gap between the average and the median tells you a tail exists; the percentiles tell you where to cut.
Frequently asked questions
When should I use the median instead of the mean?
Whenever the data is skewed or has outliers, which is almost always the case for money, time, and counts. The median reports the genuine midpoint and shrugs off extremes. A fast rule: if the mean and median are close, either is fine; if they diverge, lead with the median and treat the gap as a signal that a tail is present.
What's the difference between a percentile and a percentage?
A percentage is a proportion (32% of customers churned). A percentile is a rank within a distribution (a score at the 32nd percentile means 32% of people scored lower). "I'm in the 90th percentile for sales" means you outsold 90% of peers, not that you hit 90% of a target.
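The distinction is easy to see with numbers. In this sketch the sales figures and the target are invented, and `percentile_rank` is a hypothetical helper that uses the "share who scored lower" definition from the answer above.

```python
def percentile_rank(scores, value):
    """Percentage of scores strictly below `value`."""
    below = sum(s < value for s in scores)
    return 100 * below / len(scores)

# Hypothetical quarterly sales for a group of ten reps (illustrative only).
sales = [40, 55, 60, 62, 70, 75, 80, 90, 95, 120]
target = 150      # hypothetical individual target
rep_sales = 120   # the top rep's figure

print(percentile_rank(sales, rep_sales))   # 90.0 -> rank within the group
print(100 * rep_sales / target)            # 80.0 -> percentage of target hit
```

The same rep is at the 90th percentile of the group and at 80% of target; the two numbers answer different questions.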
How big a sample do I need for percentiles to be meaningful?
The median and quartiles are stable on modest samples. Extreme percentiles need more data: a p99 from 50 data points is essentially one or two observations and will swing wildly week to week. As a rough guide, you want comfortably more observations than the reciprocal of the tail you're measuring. For a p99 the tail is 1%, so that means hundreds of points before the figure is worth quoting.
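A small sketch shows why a p99 on 50 points is fragile. The two "weeks" below are invented and identical except for their single largest value; the p99 uses the "inclusive" method of Python's `statistics.quantiles`, which is one convention among several.

```python
from statistics import median, quantiles

def p99(values):
    """99th percentile using the 'inclusive' method of statistics.quantiles."""
    # n=100 yields 99 cut points; the last one (index 98) is the 99th percentile.
    return quantiles(values, n=100, method="inclusive")[98]

# Hypothetical samples of 50 latencies (illustrative only).
week_1 = list(range(1, 51))            # 1, 2, ..., 50
week_2 = list(range(1, 50)) + [500]    # same, except the largest value

print("week 1: p99 =", p99(week_1), " median =", median(week_1))
print("week 2: p99 =", p99(week_2), " median =", median(week_2))
```

Changing one observation moves the p99 from about 49.5 to about 279, while the median stays at 25.5, because with 50 points the p99 is interpolated from the top two values only.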
Are quartiles and percentiles the same thing?
Quartiles are a subset of percentiles. Q1, Q2 and Q3 are simply the 25th, 50th and 75th percentiles, the three cuts that split data into four equal-sized groups. Percentiles let you cut at any point; quartiles are the everyday four-way cut.
My BI tool and a colleague's spreadsheet show different quartiles. Who's right?
Probably both. There are several legitimate interpolation methods for percentiles (Hyndman and Fan counted nine), and different tools default to different ones. On small samples this produces small disagreements. Pick one tool's method as the source of truth rather than debating the formula.
Related in the Toolkit
- Data types (discrete/continuous, categorical/ordinal): the kind of data you have decides whether a percentile even makes sense.
- Descriptive statistics (mean, median, mode, variance, SD): the wider family of summary numbers; percentiles are the robust members.
- Correlation vs causation: before you act on a pattern in the tail, check you've found a cause, not a coincidence.
- Regression (linear, non-linear, logistic): once you understand a distribution's shape, regression models the relationship behind it.
- Statistical significance (p-values, t-scores, chi-square): how to know whether a shift in the distribution is real or noise.
- First principles vs heuristics vs analogical reasoning: "always read the median next to the mean" is a heuristic worth keeping.
- Reversible vs irreversible decisions: how much certainty about the data you need depends on how hard the call is to undo.
- Jobs-to-be-Done & needs research: the tail of your distribution is often a different customer with a different job.
Where to go next
- Tukey, Exploratory Data Analysis (1977): the seminal source of the five-number summary and the box plot; still the clearest argument for looking at shape before modelling.
- Dean & Barroso, The Tail at Scale (Communications of the ACM, 2013): why the slow tail, measured in percentiles, dominates real-world experience at scale.
- Hyndman & Fan, Sample Quantiles in Statistical Packages (The American Statistician, 1996): the definitive account of why tools disagree on a percentile, and which method to prefer.
- The 68–95–99.7 (empirical) rule: for the special case of a normal distribution, how the spread maps to standard deviations, and a reminder that it only holds for bell shapes.
- John Rauser, Statistics Without the Agonizing Pain (Strata, 2014): a short, vivid talk on building statistical intuition from the data's shape rather than formulas.