
Agent intent verification¶

Scope: confirming that a high-stakes action matches what the user actually asked for. The mechanism is an out-of-band signed-intent attestation checked at the policy gate. It has to be out of band because a confirmation requested inside the chat travels over the same channel an attacker may have poisoned, so it can be fabricated. Intent is the "does it match the request" facet of the control plane, alongside policy and identity.

Tokens and flows here are reference templates; pin versions and validate before relying on them.

flowchart LR
  REQ["First user request"] --> HASH["Chat layer: hash original intent"]
  REQ --> OOB["Out-of-band confirm (passkey / CIBA)"]
  OOB --> TOK["Signed-intent token: hash + scope + recency"]
  HASH --> TOK
  TOK --> GATE["Policy gate checks hash==first request, scope covers resource, recency < N"]
  GATE -->|"match"| RUN["Action runs"]
  GATE -->|"drift / stale / scope creep"| DENY["Deny"]

Overview¶

Detection and policy decide whether an action is hostile or forbidden; intent verification decides whether an allowed action is the one the user actually wanted. The naive approach, asking "are you sure?" in the conversation, is structurally broken: the same chat surface that may carry a prompt injection also phrases the confirmation, so an attacker who controls the channel can both request the action and answer the confirmation. Real attestation has to come from a different, authenticated channel.¹

Core knowledge¶

Why in-chat confirmation fails¶

An injected instruction can fabricate both the dangerous action and a convincing "yes, proceed." Worse, the confirmation prompt itself is rendered by the model from the same context, so it can be reworded or suppressed. Treating an in-conversation acknowledgement as authorization gives an attacker a free approval. The confirmation must be unforgeable by anything inside the chat.¹

Out-of-band signed intent¶

Issue an intent attestation over a separate, authenticated channel: a passkey (WebAuthn) challenge, or a decoupled flow such as CIBA where the user approves on a device the agent does not control. The result is a signed token whose fields are minted by different layers, so no single compromised component can forge the whole bundle. The chat layer stamps a hash of the original request and a short intent summary; the out-of-band layer stamps the confirmed scope and the time of confirmation.¹ Where a coarse scope string cannot express the detail, Rich Authorization Requests (RFC 9396) carry structured, fine-grained authorization data in the authorization_details parameter.
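The split between layers can be sketched as follows. This is a minimal illustration, not a reference implementation: the key names, field names, and the use of HMAC with shared secrets are all assumptions made for brevity, and a real deployment would more likely use asymmetric signatures such as a WebAuthn assertion. Binding the out-of-band stamp to the chat stamp's signature is also an assumed design choice, added so the two halves cannot be mixed across requests.

```python
import hashlib
import hmac
import json
import time

# Hypothetical per-layer keys (assumption): each layer holds only its own key.
CHAT_LAYER_KEY = b"chat-layer-demo-key"
OOB_LAYER_KEY = b"oob-layer-demo-key"


def _sign(key: bytes, fields: dict) -> str:
    """HMAC-SHA256 over a canonical JSON encoding of the fields."""
    payload = json.dumps(fields, sort_keys=True, separators=(",", ":")).encode()
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def chat_layer_stamp(original_request: str, summary: str) -> dict:
    """Chat layer: hash of the original request plus a short intent summary."""
    fields = {
        "request_hash": hashlib.sha256(original_request.encode()).hexdigest(),
        "intent_summary": summary,
    }
    return {"fields": fields, "sig": _sign(CHAT_LAYER_KEY, fields)}


def oob_layer_stamp(chat_part: dict, scope: list, confirmed_at: float) -> dict:
    """Out-of-band layer: confirmed scope and confirmation time."""
    fields = {
        "scope": sorted(scope),
        "confirmed_at": confirmed_at,
        "chat_sig": chat_part["sig"],  # ties this stamp to one specific request
    }
    return {"fields": fields, "sig": _sign(OOB_LAYER_KEY, fields)}


def verify_token(token: dict) -> bool:
    """Both signatures must verify and the OOB part must reference the chat part."""
    chat, oob = token["chat"], token["oob"]
    return (
        hmac.compare_digest(chat["sig"], _sign(CHAT_LAYER_KEY, chat["fields"]))
        and hmac.compare_digest(oob["sig"], _sign(OOB_LAYER_KEY, oob["fields"]))
        and hmac.compare_digest(oob["fields"]["chat_sig"], chat["sig"])
    )


chat_part = chat_layer_stamp("Pay 40 EUR to ACME Ltd", "pay 40 EUR to ACME")
token = {
    "chat": chat_part,
    "oob": oob_layer_stamp(chat_part, ["payments:acme"], time.time()),
}
```

In this sketch, a component that holds only the chat-layer key can rewrite the request hash but cannot produce a valid out-of-band stamp for it, and the reverse holds for the out-of-band key.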

Check intent at the gate¶

The policy gate verifies the attestation against the action: the action's request hash matches the original request hash, the resource is inside the attested scope, and the confirmation is recent enough. Actions that widen scope beyond what was attested, or that arrive after the attestation has gone stale, are denied. This keeps the check declarative and in one place rather than scattered through the agent.¹
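The three checks can be written as one small function. The sketch below assumes the attestation's signatures have already been verified, and the names `Attestation`, `gate`, and `max_age_s`, as well as the 300-second default, are illustrative placeholders rather than values from any specification.

```python
import hashlib
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Attestation:
    request_hash: str      # hash of the original user request
    scope: frozenset       # resources the user confirmed out of band
    confirmed_at: float    # Unix time of the out-of-band confirmation


def request_hash(text: str) -> str:
    return hashlib.sha256(text.encode()).hexdigest()


def gate(att, action_request, resource, max_age_s=300.0, now=None):
    """Return (allowed, reason); deny on drift, scope creep, or staleness."""
    now = time.time() if now is None else now
    if request_hash(action_request) != att.request_hash:
        return False, "drift"
    if resource not in att.scope:
        return False, "scope creep"
    age = now - att.confirmed_at
    if not (0 <= age < max_age_s):  # future-dated confirmations are rejected too
        return False, "stale"
    return True, "match"


att = Attestation(
    request_hash=request_hash("Delete project alpha"),
    scope=frozenset({"project:alpha"}),
    confirmed_at=1000.0,
)
print(gate(att, "Delete project alpha", "project:alpha", now=1100.0))
print(gate(att, "Delete project alpha", "project:beta", now=1100.0))
```

Returning a reason alongside the decision keeps denials auditable without moving any of the logic out of the gate.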

Banking already shipped this¶

The pattern is not new. PSD2 strong customer authentication requires dynamic linking, where the authentication code is cryptographically bound to the specific amount and payee. Secure Payment Confirmation, a W3C specification built on FIDO/WebAuthn credentials, likewise binds a passkey signature to transaction details. The agent mapping is direct: the tool and its arguments are the merchant and amount, the attested scope is the transaction detail, and the passkey or CIBA device is the authenticator.²
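A toy version of dynamic linking applied to a tool call might look like this. It is a sketch under stated assumptions: the key, the function names, and the `transfer` tool are hypothetical, and HMAC stands in for the public-key signature a real authenticator would produce.

```python
import hashlib
import hmac
import json

# Hypothetical key held by the authenticator (passkey or CIBA device stand-in).
AUTHENTICATOR_KEY = b"authenticator-demo-key"


def canonical_call(tool: str, args: dict) -> bytes:
    """Stable encoding so the same call always produces the same bytes."""
    call = {"tool": tool, "args": args}
    return json.dumps(call, sort_keys=True, separators=(",", ":")).encode()


def authentication_code(tool: str, args: dict) -> str:
    """Code bound to this exact tool and these exact arguments."""
    return hmac.new(
        AUTHENTICATOR_KEY, canonical_call(tool, args), hashlib.sha256
    ).hexdigest()


def code_matches(code: str, tool: str, args: dict) -> bool:
    return hmac.compare_digest(code, authentication_code(tool, args))


# The user approves one specific transfer on the authenticator.
approved = authentication_code(
    "transfer", {"payee": "ACME Ltd", "amount": "40.00", "currency": "EUR"}
)
```

Because the code covers the canonical encoding of the whole call, changing the amount or the payee after approval invalidates it, which is the property dynamic linking is meant to provide.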

When to require it¶

Out-of-band intent costs a round trip and user friction, so reserve it for high-stakes and irreversible actions (moving money, deleting data, changing permissions, sending external messages). Routine reads run under policy and identity alone.
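One way to encode that split is a small classification table consulted before the gate. The tool names and the mapping below are hypothetical, and treating an unclassified tool as high-stakes is an assumed fail-closed choice rather than something the pattern mandates.

```python
from enum import Enum


class ActionClass(Enum):
    READ = "read"
    MOVE_MONEY = "move_money"
    DELETE_DATA = "delete_data"
    CHANGE_PERMISSIONS = "change_permissions"
    SEND_EXTERNAL_MESSAGE = "send_external_message"


# Classes that need an out-of-band attestation before they run.
HIGH_STAKES = frozenset({
    ActionClass.MOVE_MONEY,
    ActionClass.DELETE_DATA,
    ActionClass.CHANGE_PERMISSIONS,
    ActionClass.SEND_EXTERNAL_MESSAGE,
})

# Hypothetical tool registry (assumption): maps tool names to action classes.
TOOL_CLASSES = {
    "search_docs": ActionClass.READ,
    "wire_transfer": ActionClass.MOVE_MONEY,
    "drop_table": ActionClass.DELETE_DATA,
    "grant_role": ActionClass.CHANGE_PERMISSIONS,
    "send_email": ActionClass.SEND_EXTERNAL_MESSAGE,
}


def requires_out_of_band(tool: str) -> bool:
    """Unknown tools fail closed: they are treated as high-stakes."""
    action_class = TOOL_CLASSES.get(tool)
    return action_class is None or action_class in HIGH_STAKES
```

Routine reads such as `search_docs` skip the round trip, while anything in the high-stakes set, or anything not yet classified, triggers the out-of-band confirmation.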

Don't-miss checklist¶

Never treat an in-chat "yes" as authorization for a high-stakes action.
Attest intent out of band (passkey, WebAuthn, or CIBA) on a channel the agent does not control.
Spread the token's fields across layers so no single compromise forges the bundle.
Check hash match, scope coverage, and recency at the policy gate; deny on drift, scope creep, or staleness.
Reserve out-of-band intent for irreversible or high-value actions; keep routine actions friction-free.

Failure modes¶

In-chat confirmation. The poisoned channel approves its own action.
Replayed attestation. A stale token is presented for a later, different action; the recency check bounds the replay window and the request-hash check rejects the different action.
Scope creep. The action touches more than was attested; the gate must compare against the attested scope.
Single-layer token. All fields minted by one component; compromising it forges the whole attestation.
Over-prompting. Out-of-band intent demanded for every action; users develop approval fatigue and rubber-stamp.

Open questions & validation¶

Production deployments of out-of-band agent intent are still rare; validate the end-to-end flow before relying on it.
Mapping a free-form chat request to a precise, hashable intent is imperfect; test drift detection adversarially.
Recency thresholds trade safety against friction; tune them per action class and measure approval fatigue.

References¶

RFC 9396, OAuth 2.0 Rich Authorization Requests: https://datatracker.ietf.org/doc/html/rfc9396
W3C Web Authentication (WebAuthn) Level 3: https://www.w3.org/TR/webauthn-3/
FIDO Alliance (Secure Payment Confirmation): https://fidoalliance.org/
OpenID Client-Initiated Backchannel Authentication (CIBA): https://openid.net/specs/openid-client-initiated-backchannel-authentication-core-1_0.html
OWASP Top 10 for LLM Applications (LLM06 Excessive Agency): https://genai.owasp.org/llm-top-10/

¹ In-chat confirmation fails when the chat is the attack channel; out-of-band signed intent is issued over a separate authenticated channel (passkey/WebAuthn or a decoupled flow such as CIBA), with fields minted across layers so no single compromise forges the bundle, and the policy gate checks original-request hash, attested scope, and confirmation recency.
² PSD2 strong customer authentication mandates dynamic linking (binding the auth to a specific amount and payee) and FIDO Secure Payment Confirmation binds a passkey signature to transaction details; the agent mapping is tool/args to merchant/amount, attested scope to transaction detail, passkey/CIBA device to authenticator.