Every organisation collects information about people: customers, staff, users, suppliers. The moment you can tie a record to a named human, you are holding something the law, and increasingly the public, treats as belonging to that person rather than to you. Data privacy is the discipline of holding it well; the General Data Protection Regulation (GDPR) is the most influential rulebook for doing so, and the template most other regimes now echo.
The quick version
- Personal data (often loosely called PII) is any information that can identify a living person, directly (a name, an email) or indirectly (an IP address, a device ID, a combination of small clues). If it points to a human, it counts.
- GDPR sets seven principles for handling it (lawfulness, purpose limitation, data minimisation, accuracy, storage limitation, integrity & confidentiality, and accountability), plus a short list of rights the person can exercise over their own data.
- You need a lawful basis to process personal data at all (consent is only one of six), you must report serious breaches within 72 hours, and the top fines reach the higher of €20m or 4% of global turnover.
- The leadership move is to collect less, name a purpose, and keep it for less time: most privacy failures are hoarding failures, not hacking failures.
The idea in depth: what the law is actually asking
Start with the thing being regulated. Under Article 4 of the GDPR, "personal data" means any information relating to an identified or identifiable natural person, and "identifiable" is read broadly, covering an identification number, location data, an online identifier, or factors specific to someone's physical, economic or social identity. Here is where people get caught out. An IP address, a cookie ID, or a pseudonymous customer reference can all be personal data, because they can be tied back to a person directly or in combination with other records. "We don't store names" is not the defence it sounds like.
So widen your own definition before a regulator does it for you. When you map what you hold, count the indirect identifiers (device fingerprints, analytics IDs, free-text notes), not just the obvious name-and-address fields. Most organisations discover they are processing far more personal data than their privacy notice admits.
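As a rough illustration of that mapping exercise, the sketch below walks a made-up schema and flags both direct and indirect identifiers. The identifier lists, the `classify_field` and `map_personal_data` helpers, and the table names are all assumptions invented for the example; a real inventory needs human review, not a lookup table.

```python
# Hypothetical identifier lists for illustration; a real inventory needs
# review by whoever owns privacy in your organisation.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "postal_address"}
INDIRECT_IDENTIFIERS = {
    "ip_address", "cookie_id", "device_fingerprint",
    "analytics_id", "customer_ref", "free_text_notes",
}


def classify_field(field_name: str) -> str:
    """Label a column as a direct identifier, an indirect one, or unreviewed."""
    if field_name in DIRECT_IDENTIFIERS:
        return "direct"
    if field_name in INDIRECT_IDENTIFIERS:
        return "indirect"
    # "unreviewed" means a human still has to look, not that the field is safe.
    return "unreviewed"


def map_personal_data(schema: dict[str, list[str]]) -> dict[str, dict[str, str]]:
    """For each table, return the fields that look like personal data."""
    inventory = {}
    for table, fields in schema.items():
        flagged = {f: classify_field(f) for f in fields
                   if classify_field(f) != "unreviewed"}
        if flagged:
            inventory[table] = flagged
    return inventory


# Made-up schema: the second table holds no names, yet still holds personal data.
schema = {
    "customers": ["name", "email", "loyalty_tier"],
    "web_events": ["cookie_id", "ip_address", "page_url"],
}
inventory = map_personal_data(schema)
```

Here `web_events` lands in the inventory even though it stores no names, which is the "we don't store names" trap in miniature.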
With the "what" defined, the GDPR's substance is mercifully compact. Article 5 sets out seven principles, and almost everything else in the regulation is detail hanging off them. Personal data must be processed lawfully, fairly and transparently; collected for specified, explicit and legitimate purposes (purpose limitation); limited to what is necessary (data minimisation); accurate and up to date; kept no longer than necessary (storage limitation); and held with integrity and confidentiality (security). The seventh is accountability: the controller must not only comply but be able to demonstrate it. That last one is the quiet revolution: good intentions don't count unless you can show your working.
flowchart TD
    A(["Personal data<br/>(anything identifying a person)"]) --> B(["1 · Lawful, fair, transparent"])
    A --> C(["2 · Purpose limitation"])
    A --> D(["3 · Data minimisation"])
    A --> E(["4 · Accuracy"])
    A --> F(["5 · Storage limitation"])
    A --> G(["6 · Integrity & confidentiality"])
    B --> H(["7 · Accountability<br/>be able to prove the other six"])
    C --> H
    D --> H
    E --> H
    F --> H
    G --> H
What follows from the principles is to make them defaults, not afterthoughts. Before any new product, feature or campaign touches personal data, ask the Article 5 questions in order: what exactly are we collecting, for which named purpose, is every field necessary, how long will we keep it, and could we prove all of that to someone who asked? This is the "data protection by design and by default" the law expects, and it is far cheaper to ask before the database is built than after.
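One way to make those questions a default is to encode them as a pre-launch check. The sketch below is a minimal, assumed design: the `FieldProposal` record and `review` function are hypothetical names, and the three checks cover only purpose, necessity and retention, not the whole of Article 5.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FieldProposal:
    """One field a team wants to collect (hypothetical structure)."""
    name: str
    purpose: Optional[str]         # the named purpose, or None if nobody can state one
    necessary: bool                # can the purpose be met without this field?
    retention_days: Optional[int]  # None means "keep indefinitely"


def review(fields: list[FieldProposal]) -> list[str]:
    """Return one finding per principle a proposed field fails."""
    findings = []
    for f in fields:
        if not f.purpose:
            findings.append(f"{f.name}: no named purpose (purpose limitation)")
        if not f.necessary:
            findings.append(f"{f.name}: not necessary (data minimisation)")
        if f.retention_days is None:
            findings.append(f"{f.name}: no retention period (storage limitation)")
    return findings


# Illustrative proposal: one defensible field, one that fails every check.
proposal = [
    FieldProposal("email", "account login", necessary=True, retention_days=730),
    FieldProposal("browsing_history", None, necessary=False, retention_days=None),
]
findings = review(proposal)
```

Keeping the output of a check like this alongside the design decision is also a cheap way to have something to show when accountability is tested.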
Lawful bases: consent is only one of six
The single most common misconception is that privacy law runs on consent: that if you got a tick-box, you are covered, and if you didn't, you are exposed. Both halves are wrong. Article 6 gives six lawful bases for processing personal data, and you must have at least one before you start: consent, contract (you need the data to deliver what someone signed up for), legal obligation, vital interests (life-or-death), public task, and legitimate interests (a genuine business need that doesn't override the person's rights). As the IAPP notes in its refresher on the six bases, they are not interchangeable; the right one depends on the actual context, and you fix it before collection, not after.
So write down the basis for each processing activity, deliberately, and stop reaching for consent reflexively. Consent is the weakest basis to lean on operationally: it must be freely given, specific and informed, and it can be withdrawn at any time, which means a service that depends on data has often chosen the most fragile footing for it. Paying your staff doesn't need their consent; it runs on contract and legal obligation. Defaulting everything to a cookie banner is not rigour; it is the absence of it.
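"Write down the basis" can be as plain as a register keyed by processing activity. The sketch below is illustrative only: the six enum members mirror the Article 6 list above, while the activity names and the two helper functions are assumptions made up for the example.

```python
from enum import Enum


class LawfulBasis(Enum):
    """The six lawful bases listed in Article 6."""
    CONSENT = "consent"
    CONTRACT = "contract"
    LEGAL_OBLIGATION = "legal obligation"
    VITAL_INTERESTS = "vital interests"
    PUBLIC_TASK = "public task"
    LEGITIMATE_INTERESTS = "legitimate interests"


# Hypothetical register: each processing activity and the bases recorded for it.
register: dict[str, set[LawfulBasis]] = {
    "payroll": {LawfulBasis.CONTRACT, LawfulBasis.LEGAL_OBLIGATION},
    "marketing_emails": {LawfulBasis.CONSENT},
    "web_analytics": set(),  # nobody has decided yet
}


def missing_basis(reg: dict[str, set[LawfulBasis]]) -> list[str]:
    """Activities with no recorded basis: these should not be running."""
    return sorted(name for name, bases in reg.items() if not bases)


def consent_dependent(reg: dict[str, set[LawfulBasis]]) -> list[str]:
    """Activities that must stop for a person once they withdraw consent."""
    return sorted(name for name, bases in reg.items()
                  if bases == {LawfulBasis.CONSENT})
```

Running `missing_basis(register)` surfaces the undecided activity, and `consent_dependent(register)` shows how much of the service sits on the most fragile footing.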
An honest limitation. The principles are clear; their application is not always. "Legitimate interests" in particular is a judgement call: it requires a balancing test between your need and the person's reasonable expectations, and reasonable lawyers disagree about where the line sits. The regulation is deliberately principles-based rather than a checklist, which makes it durable but leaves real grey areas. Treat the framework as a way to structure the judgement, not as something that makes the judgement for you, and get qualified advice for the close calls and for the rules in your specific jurisdiction.
Rights, breaches and the price of getting it wrong
Privacy law doesn't just constrain the holder; it arms the person. The GDPR gives data subjects a set of rights over their own information: to be informed, to access a copy of what you hold, to have inaccurate data corrected, to erasure (the "right to be forgotten"), to restrict or object to processing, and to data portability. In practice the right to access and the right to erasure generate the most operational work, because they force you to be able to find every copy of a person's data on demand, which you can only do if you knew where it was in the first place. The rights are, quietly, an audit of your own data hygiene.
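To see why these requests depend on knowing where data lives, consider a toy sketch in which every data store is registered in one place and keyed by the same `subject_id`. Both of those are assumptions (real systems rarely share a clean key), the store names are invented, and the erasure helper ignores any legal reason you might have to retain a record.

```python
Record = dict[str, str]
Stores = dict[str, list[Record]]


def find_subject_data(stores: Stores, subject_id: str) -> dict[str, list[Record]]:
    """Access request: collect every record about one person, per store."""
    hits = {}
    for store_name, records in stores.items():
        matches = [r for r in records if r.get("subject_id") == subject_id]
        if matches:
            hits[store_name] = matches
    return hits


def erase_subject_data(stores: Stores, subject_id: str) -> int:
    """Erasure request: drop the person's records and return how many went."""
    removed = 0
    for store_name in list(stores):
        records = stores[store_name]
        kept = [r for r in records if r.get("subject_id") != subject_id]
        removed += len(records) - len(kept)
        stores[store_name] = kept
    return removed


# Made-up stores. A store missing from this registry would never be searched.
stores: Stores = {
    "crm": [{"subject_id": "u1", "email": "a@example.com"},
            {"subject_id": "u2", "email": "b@example.com"}],
    "support_tickets": [{"subject_id": "u1", "note": "asked about delivery"}],
    "newsletter": [{"subject_id": "u2", "email": "b@example.com"}],
}
copies = find_subject_data(stores, "u1")
```

The functions are trivial; the registry is the hard part. Any spreadsheet, backup or analytics export that never made it into `stores` is a copy you cannot return or erase.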
When something goes wrong, the clock is short. Under Article 33, a controller must notify the relevant supervisory authority of a personal data breach without undue delay and, where feasible, within 72 hours of becoming aware of it, unless the breach is unlikely to risk people's rights and freedoms. The deadline starts at awareness, not at the end of your investigation, which is why you rehearse the response before you need it: a named decision-maker, a simple severity test, and a pre-drafted notification template. Organisations that miss the window almost always miss it because nobody knew who was allowed to make the call at 9pm on a Friday.
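The arithmetic of the deadline is simple, which is the point: the clock runs from awareness, weekends included. A minimal sketch, with hypothetical function names and an assumed requirement that timestamps carry a time zone:

```python
from datetime import datetime, timedelta, timezone

NOTIFICATION_WINDOW = timedelta(hours=72)


def notification_deadline(aware_at: datetime) -> datetime:
    """Latest notification time, counted from when the breach became known."""
    if aware_at.tzinfo is None:
        raise ValueError("aware_at must be timezone-aware")
    return aware_at + NOTIFICATION_WINDOW


def hours_remaining(aware_at: datetime, now: datetime) -> float:
    """Hours left on the clock; negative once the window has passed."""
    return (notification_deadline(aware_at) - now).total_seconds() / 3600


# Awareness at 9pm on Friday 1 March 2024 (UTC) puts the deadline at
# 9pm on Monday 4 March, with a whole weekend inside the window.
aware_at = datetime(2024, 3, 1, 21, 0, tzinfo=timezone.utc)
deadline = notification_deadline(aware_at)
```

By 9am on the Monday, `hours_remaining` would report 12 hours left, which is not much if nobody has yet decided who makes the call.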
Most privacy failures are hoarding failures, not hacking failures: you can't lose data you never kept.
And the stakes are real. Article 83 sets two tiers of administrative fine: up to €10m or 2% of total worldwide annual turnover, whichever is higher, for the lower tier (record-keeping, breach-notification and similar failures), and up to €20m or 4% of global turnover, whichever is higher, for breaching the core principles, the lawful-basis rules, data-subject rights, or the international-transfer rules. These are not theoretical ceilings: in May 2023 the Irish Data Protection Commission fined Meta €1.2 billion, the largest GDPR penalty to date, for unlawfully transferring EU users' data to the United States, as reported widely at the time. The deeper signal is reputational: the fine makes the headline, but the lost trust is the lasting cost. (Enforcement also evolves: some large fines are appealed and revised, so treat any single figure as a marker of intent, not a fixed tariff.)
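The "whichever is higher" rule is easy to misread as a flat cap, so here is the arithmetic as a small sketch. The function name and the example turnovers are made up; the result is only the statutory ceiling for an undertaking, not a prediction of any actual fine.

```python
def fine_ceiling_eur(annual_turnover_eur: int, upper_tier: bool) -> int:
    """Maximum administrative fine: the fixed amount or the turnover share,
    whichever is higher. Whole euros, to avoid floating-point rounding."""
    fixed_eur, percent = (20_000_000, 4) if upper_tier else (10_000_000, 2)
    return max(fixed_eur, annual_turnover_eur * percent // 100)


# For a smaller firm the fixed amount is the ceiling; for a large one
# the percentage takes over.
small_firm = fine_ceiling_eur(100_000_000, upper_tier=True)     # 20_000_000
large_firm = fine_ceiling_eur(2_000_000_000, upper_tier=True)   # 80_000_000
```

The crossover sits at €500m of turnover for both tiers, since 4% of €500m is €20m and 2% of €500m is €10m.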
This is also where privacy meets the rest of your security posture: the "integrity and confidentiality" principle is, in plain terms, an instruction to protect the data you hold, which makes good access control a privacy obligation and not merely an IT one.
A worked example
Take a mid-sized online retailer, call it Harbour & Co. (Illustrative scenario; figures and details are a teaching example, not a real company.) Marketing wants to launch a loyalty programme and, in their enthusiasm, proposes collecting date of birth, full address, browsing history and a free-text "tell us about yourself" field, storing it all indefinitely "in case it's useful later."
Run that through the Article 5 questions and the plan thins out fast. Purpose limitation: a birthday offer justifies date of birth, so it earns its place, but indefinite browsing history does not. Data minimisation: the free-text field is a magnet for data you never wanted (someone will type a health condition into it), so it goes. Storage limitation: "indefinitely" becomes "for the life of the membership plus a defined period," written down. Lawful basis: the loyalty account itself sits on contract; the marketing emails on top of it need consent, captured separately so it can be withdrawn without cancelling the membership.
flowchart TD
    A(["Proposed: collect everything,<br/>keep forever"]) --> B{"Article 5 review"}
    B --> C(["Purpose: birthday offer ✓<br/>indefinite browsing ✗"])
    B --> D(["Minimise: drop free-text<br/>'about you' field"])
    B --> E(["Storage: membership life<br/>+ defined period, written down"])
    B --> F(["Basis: account = contract;<br/>marketing = separate consent"])
    C --> G(["Shipped: less data,<br/>named purpose, defensible"])
    D --> G
    E --> G
    F --> G
The result is a programme that collects less, says clearly what each field is for, keeps it for a defined time, and rests on the right legal footing. It is also genuinely easier to operate, because an access or erasure request has far fewer places to reach. The privacy review didn't kill the idea. It made it the version you would actually want to defend.
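"For the life of the membership plus a defined period" only works as a rule if something computes the date. A minimal sketch for the Harbour & Co. scenario follows; the 365-day period, the member IDs and the function names are all invented for illustration, and the scenario itself does not fix a number.

```python
from datetime import date, timedelta
from typing import Optional

# Hypothetical "defined period" after a membership closes.
RETENTION_AFTER_CLOSURE = timedelta(days=365)


def delete_after(closed_on: Optional[date]) -> Optional[date]:
    """Date from which a member's data should be gone; None while active."""
    if closed_on is None:
        return None
    return closed_on + RETENTION_AFTER_CLOSURE


def due_for_deletion(members: dict[str, Optional[date]], today: date) -> list[str]:
    """Member IDs whose retention period has run out as of today."""
    due = []
    for member_id, closed_on in members.items():
        limit = delete_after(closed_on)
        if limit is not None and today >= limit:
            due.append(member_id)
    return sorted(due)


# Made-up members: one active, one closed long ago, one closed recently.
members = {
    "m-001": None,
    "m-002": date(2023, 1, 10),
    "m-003": date(2024, 5, 1),
}
overdue = due_for_deletion(members, today=date(2024, 6, 1))
```

Run on a schedule, a check like this turns the written retention rule into something you can show was actually applied.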
Frequently asked questions
Does GDPR apply to us if we're not in the EU?
Possibly yes. The GDPR has extra-territorial reach: it applies to any organisation, wherever based, that offers goods or services to people in the EU/EEA or monitors their behaviour. A business in Australia, the US or the UK with EU customers can fall squarely within it. Beyond the EU, most modern privacy regimes (the UK GDPR, Brazil's LGPD, and many US state laws among them) share the same DNA, so building to GDPR principles tends to travel well. Check the specific laws of the markets you operate in.
What's the difference between PII and personal data?
They overlap heavily but aren't identical. "PII" (personally identifiable information) is an older, US-rooted term that often focuses on direct identifiers like name, social-security number or email. GDPR's "personal data" is broader: it captures anything that can identify a person directly or indirectly, including online identifiers and pseudonymous IDs. If you only protect the narrow PII set, you will under-scope what GDPR actually covers.
Is getting consent always the safest option?
No, and treating it as the default is a common mistake. Consent under GDPR must be freely given, specific, informed and as easy to withdraw as to give, and once withdrawn you must stop. For processing a service genuinely depends on, contract or legitimate interests is often the sounder basis. Pick the basis that fits the actual purpose, document it, and reserve consent for things people can realistically say no to without breaking the service.
Do we really have to report a breach in 72 hours?
For breaches likely to risk people's rights and freedoms, yes: you must notify the supervisory authority without undue delay and, where feasible, within 72 hours of becoming aware, even if your investigation isn't finished. Low-risk breaches may not need reporting, but that is a judgement you should make deliberately and record. The practical answer is to decide in advance who owns the call and to have a template ready, so the clock isn't eaten by confusion.
Isn't all this just expensive box-ticking?
It becomes box-ticking when treated as a compliance veneer over bad habits. Read as a design brief it does the opposite: collecting less data lowers your storage cost, your breach exposure and your attack surface at once. The organisations that resent privacy law are usually the ones hoarding data they can't justify; the ones that internalise it tend to run leaner systems and earn more trust. The principles are good engineering, not just good law.
Related in the Toolkit
- Security fundamentals & threat modelling: the "integrity and confidentiality" principle is a security obligation, so privacy and security share the same controls.
- Identity & access management: who can see which personal data, and proving it, is half of accountability in practice.
- Data retention, residency & sovereignty: storage limitation and the cross-border transfer rules that the Meta fine turned on.
- Product & data risk: building data protection by design into what you ship, before the database exists.
- Cyber risk & incident response: the breach-response rehearsal that makes the 72-hour rule survivable.
- Financial statements (P&L, balance sheet, cash flow): why turnover-based fines make privacy a board-level financial risk, not just a legal one.
- Lean, Six Sigma, Kaizen & continuous improvement: data minimisation is waste removal applied to information, not just process.
- Hosting & cloud architecture: where data physically lives shapes which privacy and transfer rules apply to it.
Where to go next
- The GDPR, Article 5 (gdpr-info.eu): the seven principles in the regulation's own words; the shortest worthwhile thing you can read on the subject.
- UK ICO, Guide to the UK GDPR: the regulator's plain-English guidance on principles, lawful bases, rights and breach reporting; practical and authoritative.
- IAPP, "Refresher: the GDPR's six legal bases for data processing": a clear walk-through of when to use which basis, from the main professional body for privacy practitioners.
- "GDPR, explained", Vox (YouTube, 2018): a short, well-made primer on why the law exists and what it changed for the big platforms; good for sharing with a non-technical team.