A board asks, "Is our customer data safe?" and someone answers, "Yes, it's all hosted in the EU." That answer feels complete and isn't. Where data sits is one of three separate questions, and the other two (how long you keep it, and whose laws can reach it) are the ones that more often go wrong. Pull them apart and each becomes a decision you can actually make.

The quick version

  • Retention is how long you keep data. The rule of thumb in modern privacy law: only as long as you genuinely need it, then delete or anonymise it. Keeping data "just in case" is a liability, not an asset.
  • Residency is where the data physically sits: which country's data centres store it. You often choose this; sometimes a law or contract requires it.
  • Sovereignty is whose laws govern the data and can compel access to it. That can differ from where the data physically lives, because some laws follow the company, not the disk.
  • The trap is assuming residency buys you sovereignty. Data stored in Frankfurt by a US-owned provider can still be reachable under US law. They are different problems with different fixes.

The idea in depth

Start with retention, because it is the one you control most directly and the one that quietly accumulates the most risk. Under the EU's General Data Protection Regulation, the storage limitation principle (Article 5(1)(e)) requires that personal data be "kept in a form which permits identification of data subjects for no longer than is necessary." The law deliberately sets no universal number of months; instead it makes you define a retention period per purpose, document it, and actually enforce deletion. The UK's regulator, the ICO, puts the same point bluntly in its storage-limitation guidance: you should not keep personal data indefinitely "just in case," and you must be able to justify how long you hold it.

The practical move is to treat every store of data as having an expiry, not a default of forever. Write a retention schedule (a table of data categories, the purpose each serves, how long you keep it, and what triggers deletion) and wire automatic deletion into the systems that hold the data, rather than relying on someone to remember. Every record you hold past its usefulness is something an attacker can steal, a regulator can fault, and a subject-access request can force you to dig up. Old data is rarely an asset; it is almost always a stored liability.
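A retention schedule of this kind can be held as data that a deletion job reads, rather than as a document nobody consults. The following is a minimal Python sketch, not a prescribed format: the category names, purposes, triggers and periods are all hypothetical, and none of the figures comes from a law or regulator.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class RetentionRule:
    category: str        # what kind of data this is
    purpose: str         # why it is held at all
    keep_for: timedelta  # how long after the trigger it may be kept
    trigger: str         # the event that starts the clock


# Hypothetical schedule; every period here is an illustrative assumption.
SCHEDULE = {
    "support_transcript": RetentionRule(
        "support_transcript", "resolve customer issues",
        timedelta(days=365), "ticket closed"),
    "closed_account_profile": RetentionRule(
        "closed_account_profile", "honour account deletion",
        timedelta(days=30), "account deleted"),
}


def is_expired(category: str, trigger_date: date, today: date) -> bool:
    """Return True once a record has outlived its retention period."""
    rule = SCHEDULE[category]
    return today > trigger_date + rule.keep_for
```

Keeping the purpose and trigger next to the period means the same table can serve as the documentation a regulator would ask for and as the input to automated deletion.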

Data you no longer need is not an asset you are keeping. It is a liability you are storing.

An honest limitation. "Only as long as necessary" sounds clean and gets muddy fast, because other laws pull the opposite way. Tax, employment, anti-money-laundering and sector rules often require you to keep specific records for fixed minimum periods, and those minimums vary by jurisdiction and data type. Storage limitation is a ceiling shaped by purpose; statutory retention is a floor. Real retention schedules live in the gap between them, which is exactly why "delete everything you can" is a principle, not a policy you can copy from someone else.
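The floor-and-ceiling relationship can be written down as a small rule. This is a simplified sketch under stated assumptions: `resolve_retention` is a hypothetical helper, the inputs are periods your own legal review has already determined, and the function does not know any real statutory figure.

```python
from datetime import timedelta
from typing import Optional


def resolve_retention(purpose_need: timedelta,
                      statutory_minimum: Optional[timedelta] = None) -> timedelta:
    """Pick a retention period from a purpose-based need and a legal minimum.

    purpose_need: the longest period you can justify for the purpose (ceiling).
    statutory_minimum: a legally required minimum, if one applies (floor).
    """
    if statutory_minimum is None:
        return purpose_need
    # Where the law requires keeping a record longer than the purpose
    # needs it, the legal obligation sets the period instead.
    return max(purpose_need, statutory_minimum)
```

For example, with an assumed one-year business need and an assumed six-year legal minimum, the function returns the six-year period; with no legal minimum it returns the business need unchanged. The inputs still have to be worked out per jurisdiction and data type, which is the part that cannot be copied from someone else.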

Residency and sovereignty are not the same thing

Here is the distinction that catches experienced teams out. Residency is geography: the physical or contractual location where your data is stored and processed (for example, "all data stays in EU data centres"). Sovereignty is jurisdiction: which government's laws can compel access to that data, and under what process. The intuition that picking an EU region gives you both is wrong, and the reason is a single US statute.

The US CLOUD Act of 2018 (Clarifying Lawful Overseas Use of Data) amended the Stored Communications Act so that providers subject to US jurisdiction must produce data in their "possession, custody, or control" in response to a valid US legal order, regardless of where in the world that data is stored (Congressional Research Service explainer). It was passed to resolve the long-running United States v. Microsoft dispute, in which Microsoft refused a US warrant for email it held in Ireland. The practical consequence: jurisdiction follows corporate control, not the location of the disk. Data resident in Frankfurt, but held by a US-headquartered cloud provider, can sit within reach of US legal process. That creates a tension with the GDPR that providers and regulators are still working through (see this CMS white paper for a measured legal walk-through).

flowchart TD
  Q(["A government wants your data: can it get it?"]) --> R{"Where does the data physically sit?"}
  R -->|"Residency answers this"| S{"Who controls the company holding it?"}
  S -->|"Sovereignty answers this"| T(["Whose laws can compel access"])
  R -.->|"EU data centre, US-owned provider"| U(["Resident in the EU, still reachable under US law (CLOUD Act)"])
Residency asks where; sovereignty asks whose laws. The two can point to different countries. Leaders Loop

So the move is to stop asking your cloud provider only "where will the data live?" and start asking "whose legal orders can compel you to hand it over, and what notice and challenge rights do we get if they do?" If genuine sovereignty matters for a workload (government, health, defence, or regulated data with explicit localisation rules), residency alone won't deliver it. You are then in the territory of sovereign-cloud arrangements, locally-controlled operators, or encryption schemes where you, not the provider, hold the keys, so that even a compelled provider has nothing readable to surrender. None of these is free, which is why this is a risk decision, not a default.
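One way to keep the two questions apart in a risk register is to record them as separate fields per workload. The sketch below is a deliberately coarse model, not legal analysis: the `Workload` fields and both functions are hypothetical, and real reach depends on the specific statute, order and corporate structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    name: str
    storage_country: str        # residency: where the disks are
    provider_home_country: str  # who ultimately controls the provider
    customer_holds_keys: bool   # True if only the customer can decrypt


def compelling_jurisdictions(w: Workload) -> set[str]:
    """Jurisdictions whose legal orders could plausibly reach the provider."""
    return {w.storage_country, w.provider_home_country}


def readable_if_provider_compelled(w: Workload) -> bool:
    """Would a compelled provider be able to hand over readable data?"""
    return not w.customer_holds_keys
```

A workload stored in Germany with a US-controlled provider yields two jurisdictions from `compelling_jurisdictions`, even though its residency is a single country. Setting `customer_holds_keys` changes what a compelled provider could surrender, not who can compel it.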

An honest limitation. Sovereignty is not absolute and the providers know it; "EU sovereign cloud" offerings are improving precisely because the demand is real, but the marketing runs ahead of the guarantees. Read past the brochure to the operating model: who legally controls the entity, who holds the encryption keys, and what happens to support and updates if cross-border access is cut off. Sovereignty you cannot operate is theatre.

Cross-border transfers: the moving floor under all of this

Retention and residency are largely your decisions. Cross-border transfer rules are not: they are set by regulators and courts, and they move. The defining moment was the Court of Justice of the EU's July 2020 Schrems II judgment, which struck down the EU–US "Privacy Shield" because US surveillance law did not give people in the EU protection essentially equivalent to the GDPR (a clear summary sits in this European Parliament briefing). Transfers didn't stop, but the burden shifted: organisations relying on Standard Contractual Clauses had to assess, case by case, whether the destination country actually protected the data, and add safeguards where it didn't.

The successor arrangement, the EU–US Data Privacy Framework, took effect with the European Commission's adequacy decision on 10 July 2023, restoring a route for certified US firms to receive EU data without extra paperwork (OneTrust summary). In September 2025 the EU's General Court dismissed the first challenge to it (the Latombe case), but that ruling has been appealed to the Court of Justice, so the framework's long-term stability is, candidly, not settled.

Treat your transfer mechanism, then, as a dependency that can fail, not a box you tick once. Keep a current record of where personal data flows across borders and on what legal basis, and avoid hard-wiring your architecture so tightly to one framework that a court ruling forces an emergency migration. The leaders who weathered Schrems II best already knew which data went where, and why.
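A transfer record becomes useful the moment you can query it by legal basis. The following minimal sketch assumes a hypothetical register layout and made-up entries; the mechanism labels ("DPF", "SCC") are shorthand chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TransferRecord:
    data_category: str
    origin: str
    destination: str
    mechanism: str  # the legal basis relied on for the transfer


# Hypothetical register entries, for illustration only.
REGISTER = [
    TransferRecord("support_transcripts", "EU", "US", "DPF"),
    TransferRecord("payroll", "EU", "US", "SCC"),
    TransferRecord("analytics_events", "EU", "EU", "none_needed"),
]


def affected_by(mechanism: str,
                register: list[TransferRecord]) -> list[TransferRecord]:
    """List the transfers that would lose their legal basis if `mechanism` fell."""
    return [r for r in register if r.mechanism == mechanism]
```

Calling `affected_by("DPF", REGISTER)` returns only the flows that depend on that framework, which is the list a team would need on the day a court ruling lands.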

A worked example

Take a UK scale-up, call it Mersey Health, running a wellbeing app. (Illustrative figures and choices throughout; this is a teaching example, not legal advice or a real company.) It collects sign-up details, sensitive health logs, and support-chat transcripts, all stored with a large US-owned cloud provider in that provider's EU region. The team's mental model is "we're EU-hosted, so we're fine." Pulling the three questions apart shows three different gaps.

Retention: support transcripts and the logs of users who deleted their accounts two years ago are still sitting in the database, because nothing ever deletes them. That is a storage-limitation problem under Article 5(1)(e) and a breach-radius problem: data with no business purpose, waiting to be stolen. The fix costs nothing but discipline: a retention schedule (say, transcripts purged after 12 months, deleted-account data wiped within 30 days) enforced by a scheduled job.
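The scheduled job in this example could look roughly like the sketch below. It assumes a hypothetical schema (a `transcripts` table with `created_at`, an `accounts` table with a nullable `deleted_at`, both holding Unix timestamps in seconds) and uses SQLite only so the example is self-contained; the two periods are the illustrative ones from the example.

```python
import sqlite3
from datetime import datetime, timedelta, timezone

TRANSCRIPT_MAX_AGE = timedelta(days=365)    # transcripts purged after 12 months
DELETED_ACCOUNT_GRACE = timedelta(days=30)  # deleted-account data wiped in 30 days


def purge_expired(conn: sqlite3.Connection, now: datetime) -> tuple[int, int]:
    """Delete rows past retention; return (transcripts, accounts) removed."""
    transcript_cutoff = int((now - TRANSCRIPT_MAX_AGE).timestamp())
    account_cutoff = int((now - DELETED_ACCOUNT_GRACE).timestamp())
    with conn:  # commit both deletions together, or roll back on error
        transcripts = conn.execute(
            "DELETE FROM transcripts WHERE created_at < ?",
            (transcript_cutoff,),
        ).rowcount
        accounts = conn.execute(
            "DELETE FROM accounts "
            "WHERE deleted_at IS NOT NULL AND deleted_at < ?",
            (account_cutoff,),
        ).rowcount
    return transcripts, accounts


# Example call from a scheduler (assumed to run daily):
# purge_expired(conn, datetime.now(timezone.utc))
```

Returning the row counts gives the job something to log, so there is evidence that deletion actually ran. A production version would also need to cover backups and any copies held in downstream systems, which this sketch does not attempt.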

Residency: the EU region is fine for most data, but the health logs are the crown jewels. The team decides those stay in-region and are encrypted with keys Mersey controls, not the provider.

Sovereignty: because the provider is US-owned, the encrypted health logs could in principle fall under a CLOUD Act order. Holding the keys themselves means that even a compelled provider hands over ciphertext it cannot read, turning a sovereignty exposure into a manageable one without abandoning a provider that works.

flowchart LR
  A(["Customer data at Mersey Health"]) --> B{"Three questions"}
  B --> C(["Retention: schedule + auto-delete stale records"])
  B --> D(["Residency: keep health logs in-region"])
  B --> E(["Sovereignty: hold our own keys vs CLOUD Act reach"])
One data set, three decisions, each with a different owner, cost and fix.

Notice what the split bought them. "Are we EU-hosted?" had a single, falsely reassuring answer. The three questions produced three concrete actions (a deletion job, a residency choice for one data class, and a key-management decision), none of which required leaving the provider, and all of which a normal team can do this quarter. That is the payoff of refusing to let one word stand in for three.

Frequently asked questions

What's the actual difference between data residency and data sovereignty?

Residency is where data physically lives: the country whose data centres store it. Sovereignty is whose laws govern it and can compel access. They usually overlap, but not always: data resident in the EU but held by a US-controlled company can still be reachable under US law such as the CLOUD Act. If a workload truly needs legal independence, choosing an EU region is necessary but not sufficient.

How long are we actually allowed to keep personal data?

There is no single number. The GDPR and UK GDPR require you to keep personal data no longer than necessary for the purpose you collected it, and to justify the period you choose, while other laws (tax, employment, anti-money-laundering) often set minimum periods for which you must keep certain records. So you set a retention period per data category, at or above any statutory minimum, document it, and enforce deletion. "We keep everything forever" is not a defensible position.

Is storing data in the EU enough to satisfy the GDPR?

Location is one factor, not the whole picture. The GDPR governs how you handle personal data wherever it sits, and separately restricts transferring it outside the EEA without an approved mechanism. EU residency can help with transfer concerns, but it does not by itself address retention, security, lawful basis, or the sovereignty question of who can compel access. Treat "hosted in the EU" as one tick on a longer list.

Does the CLOUD Act mean we can't use US cloud providers?

No. Most organisations use them lawfully every day. It means you should understand the exposure and decide deliberately. For ordinary data the risk is often acceptable; for highly sensitive or regulated data you can reduce it by keeping the keys yourself, choosing locally-controlled or sovereign-cloud options, or minimising what you store at all. The point is to make it a conscious risk decision rather than an accidental one.

Our cross-border transfer is covered by the Data Privacy Framework. Are we permanently safe?

You are compliant today, which is what matters operationally, but the framework's history counsels humility. Its predecessor (Privacy Shield) was struck down in 2020, and the current framework, though upheld by the General Court in 2025, is under appeal. The resilient posture is to know exactly which data crosses which borders on which basis, so that if the legal ground shifts again you can adapt deliberately instead of scrambling.

Related in the Toolkit

These three questions sit inside a wider security and privacy picture: deciding what could go wrong and what is worth protecting is the work of threat modelling, while the day-to-day mechanics of handling personal data lawfully belong to data privacy & PII handling.
