AI Agents Aren’t Trustworthy (But We’re Deploying Them Anyway)
For the last couple of years, most enterprise AI has been advisory. Companies have been using language models to summarize, generate content, and help people move faster. A person still had to decide what to do with the output.
Agents are different. An AI agent can retrieve data, write code, trigger workflows, and operate across applications. Once AI starts doing things instead of just saying things, the security problem changes, because now you have to think about what the system can reach.
I think a lot of people are still focused on the wrong part of this: model accuracy, hallucinations, and output quality. Those things matter, but the more pressing question, one I do not think enough teams are asking, is what these systems can access and what happens when they get it wrong.
Key concepts
- AI agent security risk is driven by access, not model accuracy, as agents act inside enterprise systems with real permissions
- Over-permissioned AI agents increase risk by operating with broad, persistent access across multiple systems
- Effective AI agent security requires identity-based, task-scoped, and time-bound access controls
- Runtime access control and contextual identity enforcement are critical to prevent unauthorized actions and limit blast radius
AI agents vs generative AI: what changes when systems can act
Traditionally, machine learning systems classified, scored, or predicted. They could tell you whether a transaction looked suspicious, whether a customer might churn, or whether a lead should be prioritized. They were undoubtedly useful, but still contained.
Generative AI expanded that. Now the system could explain, summarize, translate, and write code. It could communicate in ways that felt much more flexible and much more “human.” But even then, at its core, the model was still predicting the next token and producing an output.
Asking a chatbot how to book a flight is passive. Asking an agent to actually book the flight means it has to review options, make choices, use personal information, interact with external systems, and potentially spend money. That crosses from generation into action, and once you give a system the ability to act, you are dealing with a participant in your enterprise systems, one that operates as an identity inside your environment.
AI agents create risk even when the model is accurate
Today's models can be very good at prediction. In many cases, they are shockingly accurate. What they are not very good at is understanding how confident they should be in that prediction.
A system can produce an answer that sounds plausible and still have no reliable sense of whether it is operating in a high or low-confidence situation. It does not consistently know when it is wrong. Most of the time, AI models work well, but sometimes they fail in unexpected ways, different from the ways that humans fail. With traditional question/answer LLMs, hallucinations are amusing; you think, "How could it make this kind of mistake?" When a hallucination happens in an agent-based system, your business can be at risk.
Agentic systems are powerful, and they are improving very quickly. But they are not dependable enough to be trusted on their own, especially when the task involves access, authority, and execution.
So when people ask whether AI agents are trustworthy, I think they are starting in the wrong place. Instead of asking “How do I make the agent trustworthy?”, the better question is “How do I constrain what the agent is allowed to do?”
Multi-agent systems expand the AI attack surface
This gets more complicated when you stop thinking about one agent and start thinking about systems of agents.
Agentic workflows are rarely linear. One agent plans, another executes, a third evaluates the results, etc. They may loop through that process several times until they get to an outcome that works. This can be incredibly powerful. It is also risky.
Why do multi-agent workflows create unintended access paths?
Once you have multiple agents working together, the system is no longer following a narrow, predetermined path. It is exploring options, iterating, and adjusting its behavior based on feedback.
The same pattern has already shown up very clearly in real-world red teaming scenarios.¹ A planning agent can map out how to approach a target, an attack agent can interact with the target system, and a judge agent can evaluate whether the attempt succeeded. Then the system can revise its approach and try again. It can keep doing that until it finds a path that works.
The important point is that the system is not reasoning about ethics or acceptable boundaries in any way. It is optimizing towards its objective. If the objective is to get information, it will keep exploring until it finds a successful path unless something outside the agent prevents it.
This is compounded by the fact that agents cannot self-govern in any meaningful way. If you tell an agent not to use certain data, that instruction exists only in the prompt. It does not become a system-level constraint. If the agent can use that data to complete the task, it may still do so, because it is optimizing for the outcome, not enforcing the absence of something.
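The difference between a prompt-level instruction and a system-level constraint can be made concrete. The sketch below is a hypothetical illustration (the `fetch_record` wrapper and `RESTRICTED_FIELDS` set are invented names, not a real product API): the restriction is enforced in the tool layer, outside the agent, so it holds even if the agent ignores or loses the instruction in its prompt.

```python
# Hypothetical sketch: a data restriction enforced outside the agent.
# The agent never sees restricted fields, regardless of what its prompt says.

RESTRICTED_FIELDS = {"ssn", "salary"}  # fields the agent must never receive

def fetch_record(record_id, store):
    """Tool wrapper: redacts restricted fields before returning data to the agent."""
    record = store[record_id]
    return {k: v for k, v in record.items() if k not in RESTRICTED_FIELDS}

store = {42: {"name": "Ada", "ssn": "000-00-0000", "salary": 120000}}
print(fetch_record(42, store))  # only {'name': 'Ada'} reaches the agent
```

The point of the design is that the constraint lives in code the agent cannot rewrite, rather than in text the agent is merely asked to respect.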
¹ https://arxiv.org/abs/2202.03286
Access control is the real security boundary for AI agents
For an agent to be useful, it needs access. I know that sounds obvious, but it’s where the security model changes. An agent cannot retrieve financial data, write an email, update a record, or take action in a workflow unless it can interact with the systems involved.
Often, this means operating with inherited permissions, and sometimes it means broad permissions across multiple systems.
In an enterprise environment, the action itself usually does not tell you whether something is wrong. Pulling financial data is normal. Sending an email is normal. Retrieving healthcare records is normal in certain situations. Neither the output nor the workflows are inherently suspicious. The difference between a valid action and a security problem is context:
- Who is the agent acting for?
- What was it supposed to do?
- What data should it have access to at that moment?
- What action was actually necessary to complete the task?
Without that information, a lot of bad behavior looks legitimate. The only way to know something has gone wrong is to connect the action to identity, permissions, and intent.
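Those four questions can be sketched as a runtime check that connects an action to identity, task, and scope. This is a minimal illustration under assumed names (`AgentAction`, `TASK_SCOPES`, and the task and resource labels are all hypothetical), not a real policy engine:

```python
from dataclasses import dataclass

@dataclass
class AgentAction:
    acting_for: str   # who is the agent acting for?
    task: str         # what was it supposed to do?
    resource: str     # what data is it touching?
    operation: str    # what action is it actually taking?

# Hypothetical policy: the resource/operation pairs each task legitimately needs.
TASK_SCOPES = {
    "summarize_financials": {("finance_db", "read"), ("email", "send")},
}

def is_in_scope(action: AgentAction) -> bool:
    """An action is valid only if it is necessary for the task it claims to serve."""
    allowed = TASK_SCOPES.get(action.task, set())
    return (action.resource, action.operation) in allowed

# Pulling financial data is normal for this task; pulling HR records is not.
print(is_in_scope(AgentAction("alice", "summarize_financials", "finance_db", "read")))
print(is_in_scope(AgentAction("alice", "summarize_financials", "hr_records", "read")))
```

Note that the same `read` operation is allowed or denied purely on context, which is the point: the action alone does not tell you whether something is wrong.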
Why is AI agent access control easier to describe than to implement?
Suppose I ask an agent to look at last quarter's financials and draft an email to my boss summarizing the results. That sounds simple, but even that task carries a number of identity questions. There is an easy version of this problem, and there is a hard version.
The easy version of AI access control
The easy version is intersection. What can the user access? What can the agent system access? Give the agent the overlap between those two things. That at least prevents some obvious failures. If I do not have access to my boss's email, the agent should not have access to my boss's email, and it should not be able to read them just because I asked.
That gets you part of the way.
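The intersection rule is simple enough to express in a few lines. This is a sketch with made-up permission labels, assuming permissions can be modeled as flat sets:

```python
def effective_permissions(user_perms: set, agent_perms: set) -> set:
    """Easy version: the agent may only do what both the user and the
    agent system are independently allowed to do."""
    return user_perms & agent_perms

user = {"read:finance", "send:own_email"}
agent = {"read:finance", "send:own_email", "read:boss_email"}

# 'read:boss_email' drops out: the user cannot read the boss's mail,
# so the agent acting for the user must not either.
print(sorted(effective_permissions(user, agent)))  # ['read:finance', 'send:own_email']
```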
The hard version of AI agent access control
This does not solve the harder problem, which is context. The agent needs enough access to retrieve the financials. It needs enough access to draft or send the email. It does not, however, need access to unrelated data on the same server or read access to my boss’s emails. It likely does not even need persistent access to the financials after that part of the task is complete.
What you actually want is correctly scoped, task-specific, temporary access.
Give the system only the permissions it needs for the task at hand, and only for as long as it needs them. Remove them when that step is done. Limit the blast radius if the system is compromised, manipulated, or simply wrong.
What this really requires is runtime access control for AI agents, which is very close to a zero trust way of thinking. You do not assume the system should be broadly trusted just because it is acting on behalf of a valid user. You constrain the action to the smallest useful scope.
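One way to picture task-scoped, time-bound access is a grant object that expires on its own and can be revoked the moment a task step completes. The `ScopedGrant` class below is a hypothetical sketch of the idea, not a real enforcement mechanism:

```python
import time

class ScopedGrant:
    """Hypothetical sketch: a task-scoped, time-bound permission grant.
    Access is issued per task step and expires on its own."""

    def __init__(self, permission: str, ttl_seconds: float):
        self.permission = permission
        self.expires_at = time.monotonic() + ttl_seconds

    def is_valid(self) -> bool:
        return time.monotonic() < self.expires_at

    def revoke(self) -> None:
        """Remove access as soon as the task step is done."""
        self.expires_at = 0.0

# Grant read access to the financials only for the duration of this step.
grant = ScopedGrant("read:q3_financials", ttl_seconds=300)
assert grant.is_valid()

grant.revoke()          # step complete: access is gone immediately
assert not grant.is_valid()
```

Even if the agent is later compromised or manipulated, an expired grant limits the blast radius to whatever was reachable inside that one window.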
AI agents: an unfinished control model, and what you can do about it
Most enterprise environments still rely on broad permissions and persistent access. They were designed around static roles, not dynamic execution. In many cases, the system enforcing identity does not even have visibility into the original request that kicked off the agent workflow.
So what happens? Over-permissioning. You give the agent access to everything it might need, because you do not have a reliable way to determine what it needs at a particular moment. This is understandable from an implementation standpoint, but it is also how risk expands.
Enterprises are not going to wait for perfect security models before deploying agents; the business pressure is real, and the capabilities are improving too quickly to ignore. That makes it imperative to put the right controls in place before these systems become even more deeply embedded in enterprise environments than they already are.
Constrain the identity. Constrain the permissions. Constrain the duration of access. Constrain the blast radius.
Identity connects action to actor. It tells you what the system is doing on whose behalf, helps distinguish valid requests from inappropriate access, and gives you the mechanism to set and enforce boundaries.
Without the right governance, agent behavior may look reasonable right up until the moment it becomes a breach.
Frequently asked questions about AI agent security
What is AI agent security?
AI agent security focuses on controlling what agents can access and do inside enterprise systems. It shifts the focus from model accuracy to identity, permissions, and limiting the impact of incorrect or unintended actions.
Why is access control critical for AI agents?
AI agents operate with real permissions across systems. Even when the model is correct, excessive access can lead to data exposure or unintended actions. Access control defines the boundary between useful automation and security risk.
What is the biggest risk with AI agents?
The biggest risk is over-permissioning. Agents are often given broad, persistent access because it simplifies implementation. That increases the blast radius if the agent behaves incorrectly or is manipulated.
Do AI agents need their own identities?
Yes. AI agents act inside systems, retrieve data, and take action on behalf of users. Treating them as distinct identities allows you to define what they can access, track what they do, and enforce boundaries based on context and intent.
Why is AI agent security different from traditional AI security?
Traditional AI security focuses on model behavior like accuracy, hallucinations, and output quality. AI agent security focuses on access and execution. Once systems can act, the risk shifts from what they say to what they can do.