Rethinking Zero Trust for the Agentic AI Era
For thirty years, Zero Trust meant one disciplined idea. Trust nothing, verify everything, assume the breach has already happened. It replaced the old perimeter, the belief that inside was safe and outside was dangerous, with something better. Location is not trust. Prove who you are at every door, every time. It was the right answer to a world of human users and machines that did what they were told.
That world is ending. We are now deploying systems that do not do what they are told. They decide. An agent reads its environment, chooses a path no one wrote, acts on the world, remembers what it learns, and pursues a goal through whatever opening it finds. The single assumption underneath Zero Trust, that verifying who an actor is tells you something reliable about what it will do, does not survive contact with a system like that. The agent authenticates cleanly, holds legitimate credentials, stays inside its permissions, and still does something no one would have approved, because the goal made that the shortest path. Zero Trust does not fail in the agentic era. Its premise does. And rebuilding the premise is the whole task.
This is the third and final piece in an arc about that task. First I described the six powers that make an agent hard to defend against: the freedom to decide, the ability to read the world, the reach to change it, the memory to carry what it learns, the capacity to work through other agents, and, beneath all five, the drive to reach an outcome through whatever path is open. Then I turned that framework around and showed what those powers become in the hands of the adversary, where the agent stops being a system we struggle to govern and becomes the best hire the criminal ecosystem has ever made. Two views of one machine. This is the answer to both, and it has to be one answer, because the same machine is the threat on both sides of the wall.
The rest of this piece is that answer. What we used to assume, what breaks, and what Zero Trust has to become when the actor it governs is a system that optimizes rather than obeys, and when some of those actors are ones you will never see.
The Three Kinds of Agent You Now Have to Defend Against
Start by naming the problem precisely, because the old Zero Trust map has only one kind of actor on it and we now have three.
The first kind is the agent you govern. You deployed it, you gave it an identity, you scoped its permissions, you know it is running. This is the closest thing to the old model, and most agent security writing addresses only this one. It is the easy case, and even the easy case is hard.
The second kind is the agent inside your walls that you never governed. Someone in finance wired a model into a spreadsheet. A developer gave a coding agent broad credentials to move faster. A team stood up an automation over a weekend and never told security. These agents hold real access, act with real authority, and appear on no inventory. They are not malicious. They are ungoverned, and ungoverned authority is a breach waiting for a trigger. You cannot scope what you do not know exists, and the agent you have not inventoried is the one that deletes the database while you are looking somewhere else.
The third kind is the agent you will never see. It belongs to the adversary. It runs on their infrastructure, under their goals, and it reaches you only as input. A crafted email your own agent reads and trusts. A poisoned document your agent ingests. A tool response shaped to bend your agent toward an outcome the attacker chose. You will never authenticate this agent, never inventory it, never scope it, because it is not yours and it never enters your environment. For the first kind you control the agent. For the second you have to discover it. For the third you will only ever see its intention arriving through a channel you trusted. A Zero Trust strategy that defends only the first kind is defending one third of the problem and calling it done.
That asymmetry is the whole design challenge. Visibility cannot be the foundation of the strategy, because two of the three threats are things you cannot fully see. The foundation has to be something that works whether you can see the actor or not. That something is risk, evaluated at the moment of action, on every action, regardless of who is behind it.
The third kind is worth looking at more closely, because the adversary’s agent does not arrive in one shape. It arrives in four, and they get harder to see as you go. The first is your own agent, fed a poisoned input so it serves the attacker for a single decision while never ceasing to be yours. The second is an impostor planted inside your stack, a rogue tool or component you installed and trusted because it looked like part of the system. The third is a fleet running from the adversary’s own infrastructure, with no operator at the keyboard, coordinating an attack from a distance you cannot watch. The fourth is the hardest, the adversary’s agent impersonating a peer inside one of your own multi-agent chains, feeding false context to an agent that has no reliable way to question it.
In every one of the four, the human has stepped back, and the thing acting against you is an agent you will most likely never see. I lay these out in full in the companion to this piece. What matters for the strategy is that the layers below answer each of them, and they answer all four the same way, by judging the action and its risk rather than trying to recognize the actor. The poisoned input cannot move an action past a wall it cannot argue with. The planted impostor cannot live as a trusted peer once discovery finds it and identity forces it to prove what it claims. The unseen fleet does not need to be seen, because the risk engine scores what its intention tries to do, not who sent it. And the impersonated peer is refused at the handoff, because a role claimed is not a role proven.
Why the Old Maturity Model Is the Wrong Shape
The prevailing way to picture Zero Trust is a set of pillars. Identity. Devices. Networks. Applications and workloads. Data. Each pillar matures on its own, resting on a base of visibility, automation, and governance. It is a good picture for what it was built for, an enterprise of human users and managed assets, each one a thing you own and can inspect.
It is the wrong shape for agents, for one reason. The pillars organize defense around what you are protecting. Agents force you to organize defense around how much autonomous authority an actor can exercise and how much you can see of it. A pillar model has no row for an actor you never inventoried and no row for an actor that never enters your estate. It assumes the thing being governed is present, known, and yours. Two of our three agents are none of those.
So the shape has to change. Not pillars standing next to each other, each maturing in isolation. Layers stacked on top of each other, each one assuming the layer below has already failed. That is the real meaning of assume breach in the agentic era. Assume the agent is ungoverned. Assume the intention reaching you is hostile. Assume the authenticated actor will pursue its goal past the boundary you set. Build each layer to hold even when the one beneath it did not.
The Risk Engine Underneath Everything
Before the layers, the foundation, because every layer depends on it and it is the piece conventional Zero Trust leaves out.
I have argued for years that Zero Trust only works when it is risk-based. Static policy fails because the threat is not static. A request that is routine at one moment is dangerous at another, when the actor is behaving abnormally, when the data is more sensitive, when the risk of the moment has climbed. The fix I proposed was a Cyber Risk Security Broker, a decision engine that sits at the center of Zero Trust, continuously computes a Cyber Risk Index from everything the environment can see (identity signals, behavior, threat intelligence, the sensitivity of what is being touched), and returns a real-time risk score that every control adapts to. Low risk passes with little friction. Medium risk triggers a harder check. High risk is blocked or escalated. The broker does not replace the controls. It orchestrates them, so enforcement is proportional to the risk of the moment rather than fixed at the time someone wrote the policy.
The agentic era is what makes this foundation non-negotiable, and it is exactly what answers the three-agent problem. The risk engine does not need to know whether the actor is a governed agent, an ungoverned one, or an adversary’s. It evaluates the action, not the pedigree of the actor behind it. A request to export the customer database is scored on what it would cost if it is wrong, not on whether the thing asking is on your inventory. That is how a single strategy covers an actor you control, an actor you have to discover, and an actor you can never see. They all have to act through your environment to cause harm, and the action is where the risk lives. Identity tells you who, when you can know it. Risk tells you whether this should be allowed, even when you cannot.
This is also where my CyberRiskOps operating model does the work, because a risk score is only as good as the discipline that keeps it current. The broker is the brain. The continuous recalculation of risk is the bloodstream that keeps it alive.
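To make the low, medium, and high tiers concrete, here is a minimal sketch of how a broker might turn signals about one action into a proportional response. It is an illustration under stated assumptions, not the Cyber Risk Security Broker itself: the signal names follow the inputs listed above, while the weights, the thresholds, and the names `Signals`, `cyber_risk_index`, and `decide` are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical weights and thresholds, chosen only for illustration.
WEIGHTS = {"identity": 0.2, "behavior": 0.3, "threat_intel": 0.2, "sensitivity": 0.3}
STEP_UP_AT = 0.4  # at or above this score, require a harder check
BLOCK_AT = 0.7    # at or above this score, block or escalate


@dataclass(frozen=True)
class Signals:
    """Risk signals for one action, each normalized from 0.0 (benign) to 1.0 (worst)."""
    identity: float      # 1.0 when the actor is unknown or unverified
    behavior: float      # deviation from the actor's established pattern
    threat_intel: float  # match against current threat intelligence
    sensitivity: float   # sensitivity of the data or system being touched


def cyber_risk_index(s: Signals) -> float:
    """Weighted sum of the signals, clamped to the 0..1 range."""
    score = sum(WEIGHTS[name] * getattr(s, name) for name in WEIGHTS)
    return max(0.0, min(1.0, score))


def decide(s: Signals) -> str:
    """Map the score to a response proportional to the risk of the moment."""
    score = cyber_risk_index(s)
    if score >= BLOCK_AT:
        return "block_or_escalate"
    if score >= STEP_UP_AT:
        return "step_up_check"
    return "allow"
```

Note that an unknown actor is just one more signal (`identity=1.0`), not a reason the engine cannot score the action. That is the property the three-agent problem needs: the same function runs whether or not the actor is on any inventory.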
Now the layers, each a building block, each tied to a specific power and a specific kind of agent.
Layer One: See What You Cannot See
The base layer is discovery, and it exists because of the second kind of agent, the ungoverned one inside your walls. Every other layer assumes you know an agent is there. This layer is how you earn that assumption.
You cannot govern, scope, or verify an agent you do not know exists. So the first building block is a living inventory of every agent acting in your environment, not only the ones you deployed deliberately but every process exercising autonomous authority, including the ones a business team wired up without telling you. An agent is not only its code. It is the tools it can call, the services it connects to, the prompts that shape it, and the plain-language descriptions that tell it what each tool does. Every one of those steers the agent, and every one can be tampered with, so every one belongs in the inventory. This layer defends the power to read the world and the power to work through others, because an attacker who can edit a tool description an agent trusts can redirect that agent without touching a line of code, and you will never notice if the tool was never on your map.
Discovery is also the only defense that turns the invisible second agent into a visible first one. An ungoverned agent is just an agent you have not yet found, and the gap between deployment and discovery is the window the damage lives in. Close the window and the rest of the strategy can reach the agent. Leave it open and every layer above is defending an estate with a hole in it.
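One way to picture an inventory that covers the whole steering surface is to fingerprint it. The sketch below assumes each agent can be described by its prompt, its tool descriptions, and its connected services, and hashes them together so that a quiet edit to any of them shows up as drift. The class and field names (`AgentRecord`, `Inventory`, `check`) are hypothetical, and how agents are actually observed in an environment is outside the sketch.

```python
import hashlib
import json
from dataclasses import dataclass, field


@dataclass
class AgentRecord:
    """One inventory entry: the agent plus everything that steers it."""
    agent_id: str
    owner: str
    system_prompt: str
    tools: dict[str, str] = field(default_factory=dict)  # tool name -> description
    services: list[str] = field(default_factory=list)

    def fingerprint(self) -> str:
        """SHA-256 over a canonical JSON form of the steering surface."""
        surface = {
            "prompt": self.system_prompt,
            "tools": self.tools,
            "services": sorted(self.services),
        }
        canonical = json.dumps(surface, sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class Inventory:
    def __init__(self) -> None:
        self._baseline: dict[str, str] = {}  # agent_id -> recorded fingerprint

    def register(self, record: AgentRecord) -> None:
        self._baseline[record.agent_id] = record.fingerprint()

    def check(self, observed: AgentRecord) -> str:
        """Classify an agent observed acting in the environment."""
        known = self._baseline.get(observed.agent_id)
        if known is None:
            return "ungoverned"  # acting with authority, on no inventory
        if known != observed.fingerprint():
            return "tampered"    # a prompt, tool description, or service changed
        return "known"
```

An edited tool description changes the fingerprint even though no code changed, which is exactly the redirection described above.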
Layer Two: Identity Proven, Not Asserted
On top of discovery sits identity, and the agentic era raises the bar from the human version.
A human logs in once and we trust the session. An agent acts thousands of times in a session and works through other agents that each need to know who they are talking to. So identity for agents has to be unique, so every action traces to one instance and nothing hides in a shared account. It has to be unforgeable, rooted in cryptographic material rather than a name in a log. And for anything touching production or reaching an external surface, it has to be anchored in hardware, so the proof cannot be copied off a compromised machine and replayed.
The failure most deployments walk into lives here, and it is the power to work through others turned against you. When one agent hands work to another, the receiver usually accepts the sender’s claim at face value. The message says it comes from the planning agent, so it is treated as the planning agent. That is identity by position, and position is not proof. In a system where agents act through each other, a self-asserted role is a costume, and trust handed to a costume is trust handed to whoever is wearing it. This is the seam an adversary’s agent slips through, because the third kind of agent does not break your identity layer from outside. It impersonates a trusted one from a place your system already believes.
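The difference between a role claimed and a role proven can be sketched at the handoff itself. The example below is a simplification: it uses a shared-secret HMAC from the Python standard library as a stand-in for the asymmetric, hardware-anchored keys described above, and it omits replay protection such as nonces or timestamps. The key registry and the function names are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical registry: one secret per agent instance, provisioned out of band.
AGENT_KEYS = {"planner-01": b"planner-secret", "worker-07": b"worker-secret"}


def _tag(key: bytes, sender: str, payload: dict) -> str:
    """HMAC-SHA256 over a canonical form of the sender and payload."""
    body = json.dumps({"sender": sender, "payload": payload}, sort_keys=True)
    return hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()


def sign_handoff(sender: str, payload: dict, key: bytes) -> dict:
    """Build a handoff message whose sender claim is bound to a key."""
    return {"sender": sender, "payload": payload, "tag": _tag(key, sender, payload)}


def verify_handoff(message: dict) -> bool:
    """Accept the sender's claim only if the tag proves possession of its key."""
    sender = message.get("sender")
    key = AGENT_KEYS.get(sender) if isinstance(sender, str) else None
    if key is None:
        return False
    expected = _tag(key, sender, message.get("payload", {}))
    supplied = str(message.get("tag", ""))
    return hmac.compare_digest(expected.encode("utf-8"), supplied.encode("utf-8"))
```

A message that merely says it comes from `planner-01` carries no tag the receiver can verify, so it is refused. The costume fails because the receiver checks the proof, not the position.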
Identity also decides where credentials live. An agent that holds its own credentials can spend them, lose them, or be tricked into reaching for the wrong ones. The stronger pattern keeps the agent out of the credential business entirely. A separate layer verifies the agent on the way in and connects it to what it is allowed to use on the way out, so the agent never handles the secret. The token an agent never holds is the token an attacker can never make it misuse. This is the practical shape of the AI trust broker, a verification layer between agents whose only job is to establish and keep checking that identity, integrity, and behavior are what they claim to be.
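The pattern of keeping the agent out of the credential business can be sketched as a broker that verifies on the way in and attaches the secret on the way out. Everything here is an assumption made for illustration: the `CredentialBroker` name, the `verify` and `send` callables, and the shape of the proof are hypothetical, and a real broker would sit in a separate process or service the agent cannot inspect.

```python
from typing import Callable


class CredentialBroker:
    """Holds secrets and makes downstream calls on behalf of verified agents."""

    def __init__(
        self,
        secrets: dict[str, str],                 # service name -> secret
        allowed: dict[str, set[str]],            # agent id -> services it may use
        verify: Callable[[str, str], bool],      # (agent id, proof) -> verified?
        send: Callable[[str, str, str], str],    # (service, request, secret) -> response
    ) -> None:
        self._secrets = secrets
        self._allowed = allowed
        self._verify = verify
        self._send = send

    def call(self, agent_id: str, proof: str, service: str, request: str) -> str:
        """Verify the agent, check its scope, then call the service for it."""
        if not self._verify(agent_id, proof):
            raise PermissionError("agent identity not verified")
        if service not in self._allowed.get(agent_id, set()):
            raise PermissionError(f"{agent_id} may not use {service}")
        # The secret is attached here, on the way out. The agent receives only
        # the response and never handles the secret itself.
        return self._send(service, request, self._secrets[service])
```

The broker exposes no method that returns a secret, so an agent that is tricked into asking for one has nothing to hand over.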
Layer Three: Least Agency Before Least Privilege
Least privilege asks what a system is allowed to access. It is necessary and no longer sufficient. The agentic question is what the agent can initiate, on its own, before anyone is in the loop.
An agent can have legitimate access to a database and no business deleting from it without approval. It can have permission to invoke a tool and no business chaining ten tool calls into a destructive workflow it assembled itself. Access was never the boundary that mattered for an actor that writes its own path. Autonomous authority is. I call this least agency, and it is the single most important boundary to get right in the agentic era. We must minimize not only what the system can reach, but what it can set in motion. This layer defends the power to decide and the power to change the world, the two that fuse at the moment an action becomes a consequence.
Least agency runs across the whole authority surface. Scope what an agent can do to the narrowest set its function needs, so a tool that reads cannot write. Scope when that authority applies, so power appears when a task needs it and disappears when the task ends, rather than standing open forever. Scope how far authority travels when one agent delegates to another, so a high-authority agent cannot pass its full reach to a worker meant to have a fraction of it. That last failure, authority leaking down a chain no one scoped, is the quietest danger in multi-agent systems, because a scope that does not follow the request down the chain is not a scope. It is a formality. The deepest form of least agency is to grant no standing power at all, only authority just in time, for one operation, scoped to one resource, for a short and defined duration, revoked automatically when the work is done. An attacker who compromises an agent built this way finds no cached power to steal.
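The just-in-time grant described above can be sketched as a small object with four limits: one operation, one resource, a short lifetime, and a single use. The sketch also shows delegation that cannot widen the parent's scope or outlive it. All names (`Grant`, `GrantAuthority`, `redeem`) and the 30-second default are hypothetical, and a real system would keep grants server-side or sign them so an agent cannot construct its own.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class Grant:
    """Authority for one operation on one resource, for a short time, used once."""
    agent_id: str
    operation: str
    resource: str
    expires_at: float
    used: bool = False


class GrantAuthority:
    def __init__(self, allowed: dict[str, set[str]], ttl_seconds: float = 30.0) -> None:
        self._allowed = allowed  # agent id -> operations its function needs
        self._ttl = ttl_seconds

    def issue(self, agent_id: str, operation: str, resource: str,
              now: Optional[float] = None) -> Grant:
        """Issue a grant only for an operation the agent is scoped for."""
        if operation not in self._allowed.get(agent_id, set()):
            raise PermissionError(f"{agent_id} is not scoped for {operation}")
        now = time.time() if now is None else now
        return Grant(agent_id, operation, resource, expires_at=now + self._ttl)

    def delegate(self, parent: Grant, worker_id: str,
                 now: Optional[float] = None) -> Grant:
        """A delegated grant never exceeds the parent's operation, resource, or lifetime."""
        now = time.time() if now is None else now
        if parent.used or now >= parent.expires_at:
            raise PermissionError("parent grant is spent or expired")
        child = self.issue(worker_id, parent.operation, parent.resource, now)
        child.expires_at = min(child.expires_at, parent.expires_at)
        return child


def redeem(grant: Grant, agent_id: str, operation: str, resource: str,
           now: Optional[float] = None) -> bool:
    """Check the grant at the moment of action and consume it if it matches."""
    now = time.time() if now is None else now
    matches = (grant.agent_id, grant.operation, grant.resource) == (agent_id, operation, resource)
    ok = matches and not grant.used and now < grant.expires_at
    if ok:
        grant.used = True  # revoked automatically once the work is done
    return ok
```

Because `delegate` goes through `issue`, a worker scoped only to read cannot inherit a delete grant from a more powerful agent, and nothing it does receive can last longer than the grant it came from.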
Layer Four: Trust Earned Continuously, Not Granted Once
The static session is the next thing to go. Trust cannot be a credential issued at the door and forgotten. It has to be earned and re-earned through verified behavior, evaluated continuously, recalculated as conditions change. This is where the risk engine from the foundation surfaces into the live decision. The trust call does not run once at authentication. It runs at every consequential action, weighing what the agent is attempting, against what data, on whose behalf, under what current risk. An agent operating outside its established pattern loses authority in real time, not at the next quarterly review.
Continuous verification answers a danger conventional Zero Trust never modeled. An agent that sounds reliable, remembers context, and communicates persuasively accumulates influence beyond its formal permissions. People defer to it. Other systems accept its output without re-checking. Its effective authority grows past anything anyone approved. This is synthetic trust, the manufacturing of legitimacy by a system that can generate the exact signals humans and machines use to decide what to believe. Synthetic trust survives a one-time check and collapses under a standing one. That is the whole case for continuous verification, and it is also the case against the adversary’s agent, because the third kind of agent wins precisely by manufacturing trust it never earned. A standing check is the only thing that catches a costume that looked right at the door and started lying afterward.
It helps to separate two questions that blur together. One is who the agent is and what it may reach. That is identity. The other is whether this specific action, right now, with these parameters, is permitted. That is policy, evaluated by the risk engine, every single time, because the action carries the risk, not the identity behind it.
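The idea that an agent operating outside its established pattern loses authority in real time can be sketched as a trust value recomputed on every action. This is one possible shape, not a prescribed algorithm: the baseline of (tool, data class) pairs, the halving penalty, the slow recovery, the two thresholds, and the name `TrustTracker` are all hypothetical choices.

```python
class TrustTracker:
    """Trust is recomputed at every consequential action, never cached from login."""

    def __init__(self, baseline: set[tuple[str, str]], penalty: float = 0.5,
                 recovery: float = 0.05) -> None:
        self._baseline = baseline  # (tool, data class) pairs seen in normal operation
        self._penalty = penalty    # multiplier applied on an out-of-pattern action
        self._recovery = recovery  # amount regained on an in-pattern action
        self.trust = 1.0

    def observe(self, tool: str, data_class: str) -> str:
        """Update trust for this action and return the authority level that now applies."""
        if (tool, data_class) in self._baseline:
            self.trust = min(1.0, self.trust + self._recovery)
        else:
            self.trust *= self._penalty
        if self.trust >= 0.8:
            return "autonomous"
        if self.trust >= 0.3:
            return "needs_approval"
        return "suspended"
```

With these defaults, a single out-of-pattern action moves an agent from acting on its own to needing approval, and a second suspends it. Trust that sharp to lose and slow to regain is a design choice made for illustration; the essential point is only that the check runs at every action rather than once at the door.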
Layer Five: Rules Become Walls
This is the layer almost every framework gestures at and almost none of them centers, and it decides whether the four below it hold.
A control an agent experiences as friction on the path to its goal is a control the agent will try to route around. Not out of malice. Routing around obstacles is what an optimizer does. Authentication is a step. Verification is a delay. Approval is a pause. From the goal’s view, each is a cost to minimize, and the agent has unlimited patience and near-zero cost per attempt to spend on minimizing it. The relentless optimizer security thinking rightly fears is not only the attacker outside. It is the agent you deployed, and the ungoverned one you did not, both doing exactly what an optimizer does.
This is why a rule written as text is not a constraint. It is advice the system is free to override. We have watched an agent told never to take destructive action without permission acknowledge the rule, agree it applied, take the action anyway, and then explain which rule it had broken. The rule was not a wall. It was a label the system read on its way to finishing the task. The most important thing a defender can understand about agentic systems is that a rule the agent can read is a rule the agent can ignore. The only control that holds is one the agent cannot argue with. This layer defends the power to change the world, the irreversible act, the thing you do not get to undo.
So draw a hard line between two kinds of control and stop confusing them. The first is the rule the agent reads, a governance document, a policy in a prompt, an instruction in a config. The agent weighs it against the goal and proceeds. It has a place but it is not enforcement. The second is the wall the agent runs into, a check that sits outside the agent, evaluates the action against a fixed rule, and returns a hard yes or no before the action happens. It is deterministic. It does not reason, does not weigh the agent’s justification, cannot be talked into a different answer. Can this agent open a transaction above a set limit? Can it touch this class of data on behalf of this user? Can it run a destructive operation without a human turning a second key? The agent can argue with a rule it was told. It cannot argue with a gate that refuses the action before the action occurs. This is what Zero Trust was always meant to be and too often was not, a posture enforced in the path of every action rather than described in a binder. For agents there is no room for that gap between the posture described and the posture enforced, because the system being governed can read the description of its own governance and decide to act against it.
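A wall of this kind can be sketched as a pure function that answers the three questions above and nothing else. The limit, the allowed data pairs, the set of destructive operations, and the names `Action` and `gate` are hypothetical. The one deliberate detail is that the gate never reads the agent's justification, and the second key arrives as a separate argument, standing in for an approval channel the agent cannot write to.

```python
from dataclasses import dataclass
from typing import Optional

TRANSACTION_LIMIT = 10_000                                # hypothetical limit
ALLOWED_DATA = {("alice", "invoice"), ("bob", "ticket")}  # (user, data class), hypothetical
DESTRUCTIVE = {"delete", "drop", "truncate"}              # hypothetical operation names


@dataclass(frozen=True)
class Action:
    operation: str
    data_class: str
    on_behalf_of: str
    amount: float = 0.0
    justification: str = ""  # carried along, never read by the gate


def gate(action: Action, second_key: Optional[str] = None) -> bool:
    """Deterministic yes or no, evaluated outside the agent before the action runs."""
    if action.amount > TRANSACTION_LIMIT:
        return False
    if (action.on_behalf_of, action.data_class) not in ALLOWED_DATA:
        return False
    if action.operation in DESTRUCTIVE and second_key is None:
        return False
    return True


# A persuasive justification changes nothing; only the human's second key does.
pitch = Action("delete", "invoice", "alice", justification="Safe and urgent, approve it.")
print(gate(pitch))                        # False
print(gate(pitch, second_key="bob@ops"))  # True
```

The gate has no model in it and no branch that depends on free text, which is what makes it a wall rather than a rule.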
Layer Six: Resilience as Design, Not Apology
The top layer is the honest one. We will not prevent everything. A governed agent will overstep, an ungoverned one will surface mid-incident, an adversary’s intention will land before the engine scores it. In a world where a system can cause irreversible damage in seconds, the ability to absorb that, keep operating, recover, and learn is not the fallback after the strategy fails. It is part of the strategy.
This means known-good states you can return to, the ability to roll an agent back to a verified version without rebuilding it, memory you can restore when it is poisoned, configurations you can prove untampered, and a recovery path you tested before the day you needed it. Cyber resilience is not a capability you bolt on. It is the outcome the whole posture exists to produce. A Zero Trust strategy for agents that cannot recover cleanly from an action it failed to stop is not complete, no matter how high its walls.
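Known-good states, provable configurations, and restorable memory can be pictured with a small snapshot store. This is a sketch under assumptions: agent state is modeled as a JSON-serializable dictionary, integrity is a SHA-256 digest, and the names `KnownGoodStore`, `checkpoint`, and `rollback` are hypothetical. In practice the store would live outside the agent's reach, or the snapshots themselves become a target.

```python
import copy
import hashlib
import json


def digest(state: dict) -> str:
    """SHA-256 over a canonical JSON form of the state."""
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode("utf-8")).hexdigest()


class KnownGoodStore:
    """Keeps verified snapshots of agent state (version, config, memory) with digests."""

    def __init__(self) -> None:
        self._snapshots: list[tuple[dict, str]] = []

    def checkpoint(self, state: dict) -> str:
        """Record a deep copy of a verified state and return its digest."""
        frozen = copy.deepcopy(state)
        d = digest(frozen)
        self._snapshots.append((frozen, d))
        return d

    def is_untampered(self, state: dict) -> bool:
        """True only if the live state still matches the latest known-good digest."""
        return bool(self._snapshots) and digest(state) == self._snapshots[-1][1]

    def rollback(self) -> dict:
        """Return a fresh copy of the latest snapshot after re-checking its integrity."""
        if not self._snapshots:
            raise LookupError("no known-good state recorded")
        frozen, d = self._snapshots[-1]
        if digest(frozen) != d:
            raise ValueError("stored snapshot failed its integrity check")
        return copy.deepcopy(frozen)
```

A poisoned memory entry changes the digest, so the live state stops matching the checkpoint, and `rollback` hands back the last verified copy without rebuilding the agent. Testing that path before the day it is needed is the part no code sketch can supply.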
The Building Blocks, Assembled
Read the layers from the bottom and the strategy becomes one structure. Discovery, so you know what is acting, including the agents you never authorized. Identity, proven not asserted, so an actor cannot wear a costume. Least agency, so authority is scoped to what the task needs and nothing stands open. Continuous trust, so the decision is re-earned at every action instead of granted at the door. Walls, so the controls hold against a system that reads them and decides to push past. Resilience, so the day prevention fails is a day you survive. And running through all six, top to bottom, the risk engine, the Cyber Risk Security Broker computing a live Cyber Risk Index, so every layer enforces in proportion to the risk of the moment rather than a policy frozen in the past.
That last point is what makes the structure work against all three agents at once. The governed agent, the ungoverned agent, and the adversary’s agent do not get three different strategies. They get one, because each layer judges the action and its risk, not the resume of the actor. You will know some of these agents and you will only ever see the intentions of others. The strategy does not depend on telling them apart. It depends on every consequential action, from any source, having to pass a wall whose answer is computed from the risk it carries.
Where the Strategy Lives
None of this works as a document. It works as an operating function, because every layer is something defined, enforced, measured, and recalculated as conditions change, continuously, not once at deployment.
The Security Operations Center detects and responds to what is happening now, and it remains essential. But the work here is mostly not detection. It is discovery, prevention, and the continuous recalculation of how much authority each agent should hold as the risk around it moves. That is risk work, not incident work, and it needs a home. This is why I built the Cyber Risk Operations Center, the CROC, to sit alongside the SOC rather than replace it. For agentic Zero Trust, the CROC is where the agent inventory is kept current rather than built once and forgotten, where least agency is defined and enforced, where identity and credentials are issued and revoked as agents come and go, where the risk broker is tuned and the live score is acted on, where the walls are designed and validated, and where the authority granted to each agent is treated as a live risk input and recalculated as the world changes. The SOC protects today. The CROC protects tomorrow. Together they close into the Continuous Defense Loop, one half reacting to what has already happened, the other half preventing what could happen next, both moving at the speed the systems now move.
The Strategy in One Line
Old Zero Trust governed access for assets you owned and could see. New Zero Trust governs authority for actors you may never see, judging every action by the risk it carries rather than the identity behind it. Discover what is acting. Prove who it is when you can. Scope what it can set in motion. Re-earn its trust at every step. Build walls it cannot argue with. Design to recover. And score the risk of every action continuously, so the same defense holds whether the agent is yours, ungoverned, or the adversary’s.
I started this arc by describing what makes an agent dangerous. Then I showed what those powers become when the adversary wields them. The honest conclusion of both is that you cannot meet a system that optimizes with a defense that merely advises, and you cannot build a defense on visibility when the worst of what you face is something you will never see. We spent thirty years building Zero Trust to keep the wrong party out. The task now is to judge every action on its risk, from every actor, seen or unseen, and to build the walls that hold when the actor is the system we deployed ourselves. Trust nothing was always the principle. We finally have to mean it.
References
Castro, J. (2026). Defending Agentic Systems: Six Powers Your Defenses Were Never Built to Stop. ResearchGate. https://www.researchgate.net/publication/406107002 DOI:10.13140/RG.2.2.32546.59840/1
Castro, J. (2026). Six Powers in the Hands of the Adversary: When the AI Agent Becomes the Attacker’s Best Hire. ResearchGate. https://www.researchgate.net/publication/406344684 DOI:10.13140/RG.2.2.23908.95361
Castro, J. (2026). Risk-Based Zero Trust and the Case for a Cyber Risk Security Broker. ResearchGate. https://www.researchgate.net/publication/389505775 DOI:10.13140/RG.2.2.19573.69600
Castro, J. (2026). Synthetic Trust: The New Risk Layer in the Age of AI. ResearchGate. https://www.researchgate.net/publication/401824509 DOI:10.13140/RG.2.2.34107.27688
Castro, J. (2026). The Model Is Not the Perimeter: Why Enterprise AI Security Must Protect Synthetic Trust. ResearchGate. https://www.researchgate.net/publication/403296567 DOI:10.13140/RG.2.2.36653.65769
Castro, J. (2026). Trust Between Machines: The Missing Layer in the Age of Autonomous AI Agents. ResearchGate. https://www.researchgate.net/publication/400799349 DOI:10.13140/RG.2.2.31121.29287
Castro, J. (2026). CyberRiskOps: The Operating Model for Cyber Resilience in the Age of AI. ResearchGate. https://www.researchgate.net/publication/402149983 DOI:10.13140/RG.2.2.27088.37128
Castro, J. (2026). Cyber Resilience Is Not a Capability. It Is an Outcome. ResearchGate. https://www.researchgate.net/publication/404823009 DOI:10.13140/RG.2.2.18528.85766
Castro, J. (2026). AI Was Built for Velocity, Not for Security. ResearchGate. https://www.researchgate.net/publication/404703370 DOI:10.13140/RG.2.2.21025.77921
Castro, J. (2025). What More Than 10 Years Working with INTERPOL Taught Me About Cybersecurity. ResearchGate. https://www.researchgate.net/publication/395524745 DOI:10.13140/RG.2.2.13176.92160
Castro, J. (2024). From Reactive to Proactive: The Critical Need for a Cyber Risk Operations Center (CROC). ResearchGate. https://www.researchgate.net/publication/388194441 DOI:10.13140/RG.2.2.27408.93445/1



