Securing AI Agents: Solving the Confused Deputy Problem in LLM Workflows

AI agents are a confused deputy with the keys to your kingdom

Attackers took control of over twenty thousand Instagram accounts by manipulating Meta’s AI support assistant. The breach occurred without a single exploit or password guess, relying instead on a failure in identity verification.

Why This Matters

Many legacy authorization checks were never written as code; they were performed at the discretion of human staff. Replacing those humans with agents removes that judgment without adding programmatic guards. Developers may hope that smarter models will close the gap, but greater model capability only improves an attacker’s ability to phrase requests. Security must reside in an external policy layer that verifies the principal’s identity independently of the chat context.

Key Insights

The ‘Confused Deputy’ problem arises when a less-privileged party manipulates a privileged process into misusing its authority; the classic illustration is Norm Hardy’s 1988 account of a compiler tricked into overwriting a protected billing file.
LLM agents lack an inherent notion of authorization because natural language interfaces do not carry caller identity, unlike direct API requests.
Prompt-based controls are insufficient because agents cannot reliably separate instructions from data, leading to attacks where malicious commands are smuggled through ingested content.
Enterprise adoption of task-specific AI agents is projected by Gartner to reach 40% of applications by the end of 2026.
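
The points above about caller identity and instruction/data separation can be sketched as a tool dispatcher that takes the principal from the authenticated session and discards any identity the model output tries to supply. This is a minimal sketch under stated assumptions, not a description of any vendor's system: `Session`, `ACCOUNT_OWNERS`, `TOOLS`, `dispatch_tool_call`, and the shape of the `tool_call` dictionary are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Hypothetical authenticated session, established before the chat starts."""
    user_id: str


class Unauthorized(Exception):
    """Raised when the session's principal may not perform the action."""


# Hypothetical ownership table: account id -> owning user id.
ACCOUNT_OWNERS = {"acct-1": "alice", "acct-2": "bob"}


def add_recovery_email(account_id: str, new_email: str, *, principal: str) -> str:
    # The ownership check runs in code, regardless of what the chat said.
    if ACCOUNT_OWNERS.get(account_id) != principal:
        raise Unauthorized(f"{principal} does not own {account_id}")
    return f"recovery email for {account_id} set to {new_email}"


# Hypothetical registry of tools the agent is allowed to request.
TOOLS = {"add_recovery_email": add_recovery_email}


def dispatch_tool_call(session: Session, tool_call: dict) -> str:
    """Run a model-proposed tool call with identity taken from the session only."""
    tool = TOOLS[tool_call["name"]]
    # Drop any identity the model (or text it ingested) tried to supply.
    args = {k: v for k, v in tool_call["arguments"].items() if k != "principal"}
    return tool(**args, principal=session.user_id)
```

With a session authenticated as `alice`, a call targeting `acct-2` raises `Unauthorized` even if the model's arguments include `"principal": "bob"`, because that key never reaches the tool. The design choice is that the model can propose actions but cannot assert who is asking.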

Working Examples

Vulnerable implementation where authorization is based solely on the agent’s ability to call the function.

def add_recovery_email(account, new_email):
    account.recovery_email = new_email # nothing here ties to the caller
    send_reset_link(new_email)

Secure implementation that verifies the principal identity from the authenticated session outside of the LLM prompt.

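class Unauthorized(Exception):
    """Raised when the verified principal may not act on the account."""

def add_recovery_email(account, new_email, principal):
    # principal comes from the authenticated session, never from chat text
    if not principal.owns(account): # who is actually asking, verified
        raise Unauthorized("session not authenticated as the account owner")
    account.recovery_email = new_email
    send_reset_link(new_email)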

Practical Applications

Use case: Meta Business Agent performing payments and CRM edits. Pitfall: granting standing access instead of scoped, short-lived authority, which can lead to unauthorized refunds or record edits.
Use case: AI support bots managing account recovery. Pitfall: relying on model ‘judgment’ or prompts for confirmation rather than hard policy rules or human gates for irreversible actions.
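
The two pitfalls above suggest a policy check that runs in code before any tool executes. The sketch below assumes a hypothetical `Grant` object that carries a scope and an expiry, and a hypothetical `IRREVERSIBLE` set naming the actions that need human sign-off; the action names are invented for illustration, and which actions count as irreversible is a deployment decision.

```python
import time
from dataclasses import dataclass
from typing import Optional


class Denied(Exception):
    """Raised when policy refuses an action outright."""


class NeedsHumanApproval(Exception):
    """Raised when an irreversible action must wait for a person."""


@dataclass(frozen=True)
class Grant:
    """Hypothetical scoped, short-lived authority issued for one task."""
    principal: str
    actions: frozenset
    expires_at: float  # seconds since the epoch


# Assumption: these action names are illustrative, not from any real system.
IRREVERSIBLE = frozenset({"issue_refund", "change_recovery_email"})


def authorize(grant: Grant, action: str, *, approved_by_human: bool = False,
              now: Optional[float] = None) -> None:
    """Hard policy check that runs in code, outside the model's prompt."""
    current = time.time() if now is None else now
    if current >= grant.expires_at:
        raise Denied("grant expired")
    if action not in grant.actions:
        raise Denied(f"{action} is outside the grant's scope")
    if action in IRREVERSIBLE and not approved_by_human:
        raise NeedsHumanApproval(f"{action} requires human sign-off")
```

An agent holding a grant for `read_order` and `issue_refund` can read orders until the grant expires, cannot edit CRM records at all, and can issue a refund only once a person has approved it. Nothing the model says in conversation changes any of those three outcomes.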

References:

https://stackoverflow.blog/2026/06/17/ai-agents-expose-the-security-checks-you-never-actually-wrote/
