Most AI agent data leaks are not the model secretly memorizing your company. They come from configuration and architecture mistakes you can name and fix. The big ones: running on a consumer-tier account whose defaults train on your inputs, letting an agent inherit one person's broad permissions, wiring a public-facing agent into internal systems, and leaving the "lethal trifecta" intact (private data, untrusted content, and an external send path) so a single prompt injection can read your data and mail it out. That last pattern is exactly how EchoLeak (CVE-2025-32711) exposed sensitive data through Microsoft 365 Copilot from a crafted email, with no user click. The fixes are not exotic, and almost all of them are decisions made before the agent launches, not a checklist handed to you after something leaks.

This is the list we work through before we put an agent we built inside another company's stack, written so a non-technical owner can spot the same mistakes in their own setup or a vendor's. If you would rather have us hold that boundary for you, see how we run responsible AI governance and risk. Everything below is yours to use either way.

First, a distinction that almost every article on this topic blurs, because getting it wrong is mistake zero. There are two completely different ways data leaves through an agent, and they have completely different fixes:

  • Provider data handling. Does the model provider store or train on your inputs? This is a contract you sign and an account tier you choose. It is solved with terms and configuration.
  • Runtime exfiltration. Can the agent be tricked, at the moment it runs, into sending your data somewhere it should not go? This is solved with architecture, not a contract.

The survey numbers say buyers feel this even when they cannot name it. In Cloudera's survey of nearly 1,500 senior IT leaders across 14 countries, 96% plan to expand AI agent use over the next year, yet 53% name data privacy as their primary adoption obstacle. The seven mistakes below are where that fear becomes a real leak, and where the fix lives.

Mistake 1: running production agents on consumer-tier accounts

This is the cheapest mistake to make and the easiest to fix. Someone wires a workflow to a personal ChatGPT or consumer Gemini account because it is already there, and now your business data is governed by consumer terms that were never meant for it.

The gap between consumer and enterprise tiers is sharp. On the enterprise side, the pattern is consistent across providers: no training on your data by default, a short retention window for abuse monitoring, and stronger options on request. The OpenAI API retains data for 30 days for abuse monitoring, then deletes it, and does not train on it by default. As of September 14, 2025, Anthropic cut API log retention from 30 days to 7 days, and API inputs and outputs are never used for training. Google's Vertex AI (the enterprise tier) offers configurable retention with no training use. On the consumer side the defaults flip: consumer ChatGPT stores conversations indefinitely unless you delete them, and consumer Gemini retains data for up to 36 months and may use it to improve products.

The fix: put every production agent on an enterprise API or business account, never a personal login. The technology is identical. The terms are not, and the terms are the entire difference between data that stays governed and data that quietly becomes training material.

Mistake 2: over-broad permissions the agent inherits

The most common runtime leak is not clever. An agent is handed one employee's access, or worse, an admin key, so it can technically see everything that person can see. Then it is asked a question, or tricked into one, that reaches far beyond its job.

Microsoft's Cloud Adoption Framework states the rule plainly: "Grant agents access only to the specific data sources required for their function. Don't provide broad access to all organizational data." A support agent does not need your payroll system. A scheduling agent does not need the customer database. When an agent acts on behalf of a person, it should inherit that person's permissions by passing their identity securely, so a helpdesk agent shows an employee only their own HR record, not everyone's.

The fix: give each agent its own identity, scoped to least privilege, and have it inherit the acting user's permissions rather than carrying a standing superset. The test is simple. If the agent is tricked tomorrow, the damage is bounded by the narrow set it was granted, not by everything one over-permissioned account could reach.
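One way to picture that rule is a minimal sketch, assuming permissions can be modeled as plain sets of resource labels. The function names, `HELPDESK_SCOPE`, and the labels themselves are hypothetical and not taken from any identity product. The agent's reach for a request is the overlap between its own narrow scope and what the acting user could already see.

```python
def effective_permissions(agent_scope: frozenset, user_permissions: frozenset) -> frozenset:
    """Resources the agent may touch for this request: only those that are
    both inside the agent's own scope and already readable by the acting user."""
    return agent_scope & user_permissions


def can_access(resource: str, agent_scope: frozenset, user_permissions: frozenset) -> bool:
    """True only if the resource survives the intersection above."""
    return resource in effective_permissions(agent_scope, user_permissions)


# Hypothetical helpdesk agent: scoped to HR self-service records and the knowledge base.
HELPDESK_SCOPE = frozenset({"hr:record:alice", "hr:record:bob", "kb:articles"})

# Hypothetical user: Alice can read her own HR record, the knowledge base, and payroll.
alice = frozenset({"hr:record:alice", "kb:articles", "payroll:all"})

print(can_access("hr:record:alice", HELPDESK_SCOPE, alice))  # True
print(can_access("hr:record:bob", HELPDESK_SCOPE, alice))    # False: not Alice's record
print(can_access("payroll:all", HELPDESK_SCOPE, alice))      # False: outside the agent's job
```

The design choice is that neither side alone decides: a tricked agent cannot exceed its scope, and a privileged user cannot stretch the agent beyond its job.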

Mistake 3: public-facing agents wired into internal data

A chatbot on your website and an agent that drafts internal reports are two different security worlds, and the leak happens the moment someone lets them share a backend. A public agent takes input from anyone on the internet. If that same agent can also query internal business data, you have built a door from the public web straight into your private systems.

Microsoft's framework draws the hardest line here: "Public-facing agents must not access internal business data." Keep confidential and public separated by a physical or logical boundary (their example uses separate "corp" and "online" management groups). The point is that the boundary is structural, not a hope that the prompt will hold.

The fix: put a real boundary between agents that take untrusted public input and agents that touch confidential data. If a single agent genuinely must do both, that is the highest-risk design you can choose, and it needs the trifecta-breaking controls in the next mistakes, not just a careful system prompt.
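A structural boundary can also be checked mechanically before launch. The sketch below assumes you keep a simple inventory of agents and data sources. `AgentConfig`, `INTERNAL_SOURCES`, and the source names are hypothetical illustrations, not part of Microsoft's framework or any product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentConfig:
    name: str
    public_facing: bool
    data_sources: tuple = ()


# Hypothetical inventory of data sources classified as internal.
INTERNAL_SOURCES = frozenset({"crm", "payroll", "internal_wiki"})


def boundary_violations(agent: AgentConfig) -> list:
    """Return the internal sources a public-facing agent is wired to.
    An empty list means the boundary holds for this agent."""
    if not agent.public_facing:
        return []
    return sorted(s for s in agent.data_sources if s in INTERNAL_SOURCES)


website_bot = AgentConfig("website_bot", public_facing=True, data_sources=("public_faq", "crm"))
report_drafter = AgentConfig("report_drafter", public_facing=False, data_sources=("crm", "internal_wiki"))

print(boundary_violations(website_bot))     # ['crm']
print(boundary_violations(report_drafter))  # []
```

A check like this only reports what the inventory says. The real separation still has to be enforced in the infrastructure itself.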

Mistake 4: ignoring the lethal trifecta

This is the mistake that turns the first three into an actual exfiltration. Simon Willison, the independent expert who coined the term "prompt injection," named the "lethal trifecta" for AI agents: three conditions that, when combined, let a single injected instruction steal your data.

| The three legs | What it means | Example |
| --- | --- | --- |
| Access to private data | The agent can read sensitive information | Inbox, CRM, internal docs |
| Exposure to untrusted content | It processes text it did not author | An incoming email, a web page, a file |
| External communication | It can send data outward | A reply, an API call, a fetched URL |

When all three are present, an attacker hides an instruction inside the untrusted content. The model, in Willison's words, "will happily follow any instructions that make it to the model, whether or not they came from their operator or from some other source." It reads your private data and sends it out, and nothing in the prompt told it not to trust the email.

The blunt part, and the reason this is architecture and not a prompt-engineering problem: "we still don't know how to 100% reliably prevent this from happening." Documented exploits have hit Microsoft 365 Copilot, GitHub's MCP server, and GitLab Duo. In one week in January 2026, indirect prompt-injection vulnerabilities were disclosed in four AI productivity tools, all the same trifecta pattern.

The fix: break one leg. Deny the external send path so stolen data has nowhere to go, or never let an agent that reads untrusted content also hold private-data access in the same context. You do not need to win the unwinnable fight of catching every injection. You need to make sure that even a successful injection has no route out.
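The trifecta is simple enough to audit as three yes-or-no questions per agent. This is a minimal sketch under that assumption. The `Capabilities` class and its field names are our own illustration, not terminology from Willison's write-up.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Capabilities:
    reads_private_data: bool
    reads_untrusted_content: bool
    can_send_externally: bool


def trifecta_legs(c: Capabilities) -> list:
    """List which of the three legs this agent currently has."""
    legs = []
    if c.reads_private_data:
        legs.append("private data")
    if c.reads_untrusted_content:
        legs.append("untrusted content")
    if c.can_send_externally:
        legs.append("external communication")
    return legs


def has_lethal_trifecta(c: Capabilities) -> bool:
    """True only when all three legs are present in the same agent."""
    return len(trifecta_legs(c)) == 3


# Hypothetical email assistant: reads the inbox, processes inbound mail, can reply.
email_assistant = Capabilities(True, True, True)
print(has_lethal_trifecta(email_assistant))  # True

# Break one leg: remove the external send path.
fenced = replace(email_assistant, can_send_externally=False)
print(has_lethal_trifecta(fenced))  # False
print(trifecta_legs(fenced))        # ['private data', 'untrusted content']
```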

Mistake 5: treating EchoLeak as someone else's problem

It is worth naming one real incident, because "the lethal trifecta" sounds abstract until you see it executed. EchoLeak (CVE-2025-32711) was a zero-click attack on Microsoft 365 Copilot. An attacker sent a crafted email. Copilot, doing its normal job, processed that email (untrusted content), had access to the user's internal data (private data), and had a way to send information out (external communication). The hidden instruction exposed sensitive data with no user interaction at all. Nobody clicked anything.

The mistake here is assuming a leak requires a careless employee. EchoLeak required none. The defense that mattered was not "train staff to spot phishing," because there was nothing for a human to spot. The defense was architectural: an agent that cannot both read attacker-controlled content and exfiltrate cannot be turned against you this way.

The fix: assume zero-click is the threat model, not the exception. If your agent reads anything an outsider can influence (email, web content, uploaded files, tickets) and it can also send data outward, you have an EchoLeak-shaped hole until you close one of those two capabilities or hard-fence the send path.
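Hard-fencing the send path usually means a deny-by-default allowlist of destinations. The sketch below is one application-level version of that idea, using Python's standard `urllib.parse`. `ALLOWED_HOSTS` and the host names are hypothetical, and in practice the same rule should also be enforced at the network layer, where the agent cannot route around it.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist: the only hosts this agent may ever contact.
ALLOWED_HOSTS = frozenset({"api.internal.example.com"})


def is_allowed_destination(url: str) -> bool:
    """Deny by default: allow only HTTPS URLs whose exact host is allowlisted."""
    parts = urlsplit(url)
    if parts.scheme != "https":
        return False
    host = (parts.hostname or "").lower()
    return host in ALLOWED_HOSTS


print(is_allowed_destination("https://api.internal.example.com/v1/tickets"))      # True
print(is_allowed_destination("https://attacker.example/?q=stolen"))               # False
print(is_allowed_destination("https://api.internal.example.com.attacker.example/"))  # False
print(is_allowed_destination("https://api.internal.example.com@attacker.example/"))  # False
```

Matching the exact host, rather than a prefix or substring, is what defeats the look-alike URLs in the last two lines.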

Mistake 6: no data residency or retention boundary

Some leaks are not dramatic exfiltrations; they are slow drift. Data ends up stored in a region you never agreed to, or sitting in logs longer than your policy allows, simply because nobody set the boundary. Microsoft's framework lists this as core governance: identify the location of each data source, agent runtime, and output storage, keep data in regions or on-prem locations that match your residency policy, and keep it encrypted at rest.

The reassuring news, which almost no article states plainly, is how much you can pin down. The enterprise pattern across providers is no-training-by-default plus a short retention window plus stronger guarantees on request. Anthropic offers qualifying enterprises a Zero Data Retention (ZDR) agreement, under which inputs and outputs are not stored beyond what is needed to screen for abuse. OpenAI offers ZDR via a negotiated agreement, and also offers data residency. Both let you choose where data lives. Self-hosting an open model (for example with Ollama) keeps data fully local with zero retention by third parties.

The fix: decide and document the boundary before launch. That means which region holds the data, how long logs live, whether you need ZDR, and whether sensitive retrieval should run in your own VPC or on-prem. ZDR is usually API-only and negotiated, and it does not automatically cover every product unless you add it, so the detail matters.
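A documented boundary is most useful when it can be compared against what is actually deployed. Here is a minimal sketch of that comparison. `DataBoundary`, `Deployment`, the region names, and the numbers are all hypothetical placeholders for your own policy, not values from any provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataBoundary:
    """The policy you decided on before launch."""
    allowed_regions: frozenset
    max_log_retention_days: int
    zdr_required: bool


@dataclass(frozen=True)
class Deployment:
    """What a given agent deployment actually does."""
    name: str
    region: str
    log_retention_days: int
    zdr_enabled: bool


def boundary_gaps(policy: DataBoundary, d: Deployment) -> list:
    """Return a human-readable list of places the deployment drifts from policy."""
    gaps = []
    if d.region not in policy.allowed_regions:
        gaps.append(f"{d.name}: region {d.region} is outside the residency policy")
    if d.log_retention_days > policy.max_log_retention_days:
        gaps.append(f"{d.name}: logs kept {d.log_retention_days} days, "
                    f"limit is {policy.max_log_retention_days}")
    if policy.zdr_required and not d.zdr_enabled:
        gaps.append(f"{d.name}: ZDR is required but not enabled")
    return gaps


policy = DataBoundary(frozenset({"eu-west"}), max_log_retention_days=30, zdr_required=True)
drifted = Deployment("support_agent", region="us-east", log_retention_days=90, zdr_enabled=False)

for gap in boundary_gaps(policy, drifted):
    print(gap)
```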

Mistake 7: treating privacy as a checklist you hand over

The last mistake is structural. Most guidance, including the excellent Microsoft framework, assumes you will build and govern the agent yourself, so it hands you a 3,000-word stack of Entra identities, Purview labels, and management groups. That is the right answer for a large IT team. For most businesses it is a checklist nobody has the time or specialist skill to fully implement, so it half-ships, and the gaps are exactly where data leaks.

The mistake is treating the controls as documentation rather than as an architecture someone owns end to end. A checklist that lists "isolate confidential data" and "restrict external integrations to trusted MCP servers" is only as good as the person wiring it in, validating every external communication, and standing up the incident plan to disable an agent fast when something goes wrong.

The fix: make sure one party actually owns the boundary, not just the document. Either you build the in-house capability to implement and maintain the full stack, or you have an operator architect these failure modes out and run them, with a published boundary you can point to. The wrong middle ground is a list of controls with nobody accountable for making them hold.

How the seven mistakes and fixes line up

Here are all seven in one place, sorted by whether the fix is a contract you sign or an architecture you build. That split is the fastest way to know which team owns each one.

| # | Mistake | Fix | Type |
| --- | --- | --- | --- |
| 1 | Consumer-tier accounts in production | Enterprise API or business account, no training | Contract |
| 2 | Over-broad inherited permissions | Least-privilege identity, inherit the user's access | Architecture |
| 3 | Public agents touching internal data | Hard boundary between public and confidential | Architecture |
| 4 | Ignoring the lethal trifecta | Break one leg so untrusted content, private data, and a send path never coexist | Architecture |
| 5 | Assuming a leak needs a careless click | Design for zero-click; fence the exfiltration path | Architecture |
| 6 | No residency or retention boundary | Pin region, set retention, sign ZDR, redact at the boundary | Contract |
| 7 | Privacy as a handed-over checklist | One owner accountable for the boundary holding | Ownership |

Notice that only two of the seven are solved by a contract. The other five are architecture and ownership, which is the part a PDF of best practices cannot do for you. It is also the honest reason the authoritative sources leave a gap: they tell you what to build, not who will build it correctly inside your specific systems.

What actually never leaves, when the boundary is built right

It is easy to read seven mistakes and conclude AI agents are a privacy minefield. They are not, when the line is drawn deliberately. On enterprise terms, your inputs are not training data, and retention is days, not forever (7 for the Anthropic API, 30 for OpenAI, deleted after). With ZDR, they are not stored beyond abuse screening. With data residency pinned, they stay in your region. With VPC or on-prem retrieval, sensitive data never leaves your perimeter at all. With redaction at the boundary, the fields that should never travel are stripped before anything is sent. With least-privilege scoping that inherits the user's permissions, the agent can only ever reach what the acting person could already reach.

That is a specific, defensible boundary, and being able to state it plainly is the whole point. The danger was never the model quietly absorbing your company. It is the seven configuration and architecture mistakes above, every one of which is preventable before the first agent goes live.
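Of the controls named in this section, redaction at the boundary is the easiest to show in a few lines. The sketch below assumes pattern-based redaction of two field types before text is sent to a model. The `PATTERNS` table and `redact` function are hypothetical, and simple regular expressions like these will miss variants that a production redaction service would need to cover.

```python
import re

# Hypothetical patterns for fields that should never travel to the provider.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


print(redact("Contact jane.doe@example.com, SSN 123-45-6789."))
# Contact [EMAIL], SSN [SSN].
```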

This is the work we do inside other companies' stacks: choosing the terms, scoping the identities, isolating the public from the confidential, breaking the trifecta, and owning the boundary so it holds. If you want these mistakes architected out of your agents before they touch anything real, book a free consultation below and we will map your boundary together.