Agent Authentication
The broken primitive nobody's fixed yet.
Agents are no longer just answering questions. They are calling APIs, sending emails, pushing commits, reading calendars, filing tickets, and making purchases. Every one of those actions requires credentials. And the tooling for managing those credentials — for agents specifically — barely exists.
What exists instead is a collection of workarounds assembled by teams who needed agents to work now and patched together whatever was available. The result is a security landscape that looks fine at demo time and falls apart at scale.
The wild west
Look at how credential handling actually works in tools built for agents today. These are not obscure examples — they are representative of the entire ecosystem.
A typical agent tool for calling an API looks like ahrefs.js. The authentication logic is two lines: read
AHREFS_API_KEY from the environment, exit with an error if
it's missing, pass it as a Bearer token in every request. That is
the entire auth story. There is no notion of which agent is calling,
why, or on whose behalf. The key is a static secret that lives in
whatever environment the agent runs in — a shell, a CI runner, a
container. It never expires, it never rotates unless someone remembers
to rotate it, and if it leaks, every system that accepted it is
compromised.
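The pattern is small enough to show in full. The following is a minimal Python sketch of it, not the actual ahrefs.js source: the function name build_auth_headers is hypothetical, and only the AHREFS_API_KEY variable name and the Bearer scheme come from the description above.

```python
import os
import sys


def build_auth_headers(env=None):
    """Return request headers carrying the static key, or exit if it is missing."""
    env = os.environ if env is None else env
    api_key = env.get("AHREFS_API_KEY")
    if not api_key:
        # The only failure the pattern knows about: no key at all.
        sys.exit("AHREFS_API_KEY is not set")
    # The same header goes on every request: no agent id, no task id, no expiry.
    return {"Authorization": f"Bearer {api_key}"}
```

Nothing in the returned headers distinguishes one caller from another, which is the point the paragraph above makes.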
For services that require OAuth, xurl (a CLI for the Twitter/X API) shows how teams adapt. OAuth 2.0 PKCE
is the mechanism: a browser redirect, a consent screen, a callback. A
human must complete this flow once. The resulting tokens are stored in
a YAML file at ~/.xurl. When an agent needs to authenticate,
it reads from that file. Token rotation is manual. There is no concept
of the agent as a distinct actor — the token is just the token, and
whoever can read that file can use it.
The Resend CLI offers a common three-tier fallback: the --api-key flag,
then the RESEND_API_KEY environment variable, then a
credentials file at ~/.config/resend/credentials.json.
The key is static and long-lived. There is no rotation, no scoping,
no record of which invocation used it. In CI environments,
the key is typically injected as a secret, shared across every
pipeline run, every branch, every developer who can trigger a build.
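The three-tier lookup can be sketched as follows. This is an illustration in Python rather than the Resend CLI's own code: the function name resolve_api_key and the JSON field name "api_key" are assumptions, while the flag, the environment variable, and the file path come from the description above.

```python
import json
import os
from pathlib import Path

# Path taken from the text; the "api_key" field name inside the file is assumed.
DEFAULT_CREDENTIALS = Path.home() / ".config" / "resend" / "credentials.json"


def resolve_api_key(flag_value=None, env=None, credentials_path=DEFAULT_CREDENTIALS):
    """Return (key, source), trying the flag, then the environment, then the file."""
    env = os.environ if env is None else env
    if flag_value:
        return flag_value, "flag"
    if env.get("RESEND_API_KEY"):
        return env["RESEND_API_KEY"], "env"
    path = Path(credentials_path)
    if path.is_file():
        stored = json.loads(path.read_text())
        if stored.get("api_key"):
            return stored["api_key"], "file"
    return None, None
```

Whichever tier wins, the result is the same kind of object: a long-lived string with no caller attached to it.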
The Google Workspace CLI makes the problem explicit. Its documentation describes the agent auth flow as: the agent opens the authorization URL, selects the account, handles consent prompts, and returns control once the localhost callback succeeds. This is browser automation standing in for authentication. The agent is not authenticating; it is simulating a human completing a flow designed for humans. The credentials that result are still stored locally, tied to a human account, with no agent-specific scope or revocation path.
These are not edge cases. This is the state of the art.
The security failure of plain secret access
When teams ship agents with API keys in environment variables, they are making a set of implicit bets, none of which hold at any real scale.
Static secrets don't stay secret. A key that
never rotates accumulates exposure surface over time. It gets copied
into .env files that get committed. It gets logged by a
debugging tool. It shows up in a screenshot in a Slack thread. It gets
included in a container image layer. It gets exported from a CI system
that has too many admins. Every day a static secret exists, the
probability that it has left its intended environment increases. The
blast radius of that leak is bounded only by what the key had access to.
Shared credentials collapse accountability. When an agent and a human developer share the same API key — which is the default in most setups — there is no way to separate what the agent did from what the developer did. If the API logs a suspicious request, you cannot tell whether it came from an autonomous agent acting unexpectedly, a developer testing something, a compromised key, or a misconfigured pipeline. The security event is real; the ability to investigate it is not.
Environment variables are not a secrets management system.
They leak through process listings, core dumps, error reports, and child
processes. They propagate into subshells and get inherited by every tool
the agent invokes. They are not encrypted at rest. In containerized
environments, they can be read by anyone able to inspect the container's
configuration, for example with docker inspect. The ergonomic convenience of process.env.API_KEY
papers over a surface that security teams have been trying to move away
from for years.
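The inheritance behaviour is easy to observe. The sketch below, in Python, sets a variable in the current process and asks a freshly spawned child what it sees; the helper name child_sees_secret and the variable name used in the usage note are made up for the demonstration.

```python
import os
import subprocess
import sys


def child_sees_secret(name, value):
    """Set `name` in this process, spawn a child, and return what the child reads."""
    previous = os.environ.get(name)
    os.environ[name] = value  # roughly what "injecting" a key into a runtime does
    try:
        # No env= argument: the child inherits a copy of the parent's environment.
        result = subprocess.run(
            [sys.executable, "-c", "import os, sys; print(os.environ.get(sys.argv[1]))", name],
            capture_output=True,
            text=True,
            check=True,
        )
    finally:
        # Restore the parent's environment to its earlier state.
        if previous is None:
            os.environ.pop(name, None)
        else:
            os.environ[name] = previous
    return result.stdout.strip()
```

Calling child_sees_secret("DEMO_API_KEY", "s3cret") returns the secret as read by the child process. Every tool an agent shells out to receives the key the same way, whether or not it needs it.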
No scope means no containment. A key with full account access — which most API keys have by default — means an agent with that key has full account access. If the agent is compromised, misbehaves, or is manipulated by a prompt injection attack, the damage is bounded only by what the service allows. Scoped, short-lived credentials would limit the blast radius; static broad-access keys maximize it.
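To make the contrast concrete, here is a toy sketch of a scoped, expiring credential in Python. The token format (base64 JSON plus an HMAC) and the names issue_token and authorize are invented for illustration and are not a standard; a real deployment would more likely use signed JWTs from an existing issuer.

```python
import base64
import hashlib
import hmac
import json
import time


def issue_token(signing_key, agent_id, scopes, ttl_seconds, now=None):
    """Mint a token naming the agent, its scopes, and an expiry time."""
    now = time.time() if now is None else now
    claims = {"agent": agent_id, "scopes": sorted(scopes), "exp": now + ttl_seconds}
    body = base64.urlsafe_b64encode(json.dumps(claims).encode()).decode()
    signature = hmac.new(signing_key, body.encode(), hashlib.sha256).hexdigest()
    return f"{body}.{signature}"


def authorize(signing_key, token, required_scope, now=None):
    """Accept only an untampered, unexpired token that carries the required scope."""
    now = time.time() if now is None else now
    body, sep, signature = token.rpartition(".")
    if not sep:
        return False
    expected = hmac.new(signing_key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return False
    claims = json.loads(base64.urlsafe_b64decode(body))
    return now < claims["exp"] and required_scope in claims["scopes"]
```

A token issued for "tickets:read" with a five-minute lifetime is useless for "tickets:write" and useless after five minutes, so a leak or a prompt-injected agent can do only what that one grant allowed.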
Compliance frameworks demand more than this. SOC 2, ISO 27001, GDPR, and HIPAA all expect you to demonstrate who accessed what, when, and under what authority. An API key log entry that says “key ending in 4a2f accessed resource X” does not meet that bar. Which agent? Which task? Which user delegated the access? On whose behalf was the data retrieved? With static API keys and no identity context, these questions are unanswerable. For regulated industries, that is not a gap; it is a blocker.
Why API keys aren't enough
An API key answers one question: does the caller know this secret? That is the entirety of what it encodes. It says nothing about who ran the agent, what task it was executing, whose data it was operating on, whether that task was still in scope, or how the authority to perform it was granted.
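Seen from the server side, that single question looks roughly like the Python sketch below. The function name is hypothetical; hmac.compare_digest is the standard-library constant-time comparison.

```python
import hmac


def api_key_is_valid(presented, stored):
    """Constant-time check that the presented key equals the stored one."""
    return hmac.compare_digest(presented.encode(), stored.encode())
```

The return value is a single boolean. Whether the caller was the agent, a developer at a terminal, or someone holding a copied key, the function returns the same True.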
This gap is invisible when agents are small and controlled. It becomes visible fast when agents are longer-running, more autonomous, and acting across multiple services. When a billing anomaly appears, the key tells you a secret was used. It does not tell you which agent invocation triggered it, what prompt drove that invocation, or whether the user whose account was charged ever intended to authorize that action. The audit trail ends at the token.
The implicit assumption in the API key model is that the secret and the agent are the same thing — that authenticating the key authenticates the actor. But agents are not the same as the keys they carry. An agent can be manipulated, misconfigured, or compromised while the key remains valid. A key can be stolen and used by something that is not the agent at all. Conflating credential possession with agent identity is a category error that grows more costly as agents gain capability.
Why OAuth and passkeys don't fit
The industry has two mature primitives for delegating access and authenticating users: OAuth and passkeys. Neither was designed for agents, and retrofitting them produces the workarounds already described.
OAuth's authorization code flow assumes a user is present in a browser. The consent screen, the redirect URI, the session cookie: every element of the design encodes a human making an active approval decision. OAuth 2.0 also covers cases where the client has no browser: the client credentials grant for machine-to-machine access, and the device authorization grant (RFC 8628) for input-constrained devices, which still relies on a user approving on a second device. But these flows model services and devices, not agents. They say “this service has permission”, not “this agent execution, delegated by this user, for this specific task, has permission.”
Passkeys go further in the wrong direction. They are built around user presence: each authentication requires the user to unlock an authenticator, typically with a biometric or a device PIN. There is no path from passkey to agent delegation; they are orthogonal primitives.
The problem is not that OAuth or passkeys are bad. They solve the problems they were designed for. The problem is that the agent use case is structurally different: an autonomous process, running without a user present, potentially executing over hours or days, acting on behalf of a user who granted access at some earlier point, across multiple services simultaneously. Squeezing this into OAuth flows produces the browser-automation workaround seen in the Google Workspace CLI: technically functional, semantically wrong.
The two questions that matter
Agent authentication is fundamentally about answering two questions that current tooling cannot answer:
Who ran the agent? Not the team that deployed it, not the user whose account it runs under — the specific agent instance, the specific task invocation, the specific execution context that produced this action. When an agent makes a hundred API calls in parallel, each call needs to be traceable to its originating invocation.
Who did the agent act on behalf of? The full delegation chain — from the user who initiated the task, through whatever orchestration layer dispatched the agent, to the specific action being taken against the specific resource. Not just that permission exists somewhere, but that permission was correctly delegated all the way down.
Without both, you cannot audit. You cannot scope permissions to tasks. You cannot revoke access to a specific agent execution without revoking access to everything. You cannot reason about blast radius. You cannot answer a compliance questionnaire, respond to a security incident, or give a user a meaningful account of what an agent did with their data.
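One way to picture the gap is as missing fields on an audit record. The Python sketch below is purely illustrative: the AuditRecord type and all of its field names are hypothetical, not taken from any existing logging schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class AuditRecord:
    action: str
    resource: str
    credential_id: str                     # all that a static API key can supply
    agent_instance: Optional[str] = None   # which agent instance made the call
    invocation_id: Optional[str] = None    # which task execution it belonged to
    delegation_chain: List[str] = field(default_factory=list)  # user -> ... -> agent

    def answers_who_ran(self) -> bool:
        return bool(self.agent_instance and self.invocation_id)

    def answers_on_behalf_of(self) -> bool:
        return len(self.delegation_chain) > 0
```

A record built from a key alone, such as AuditRecord("GET", "resource X", "key ending in 4a2f"), answers neither question; the extra fields have to come from an identity layer that the key does not provide.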
Credentials are not identity
The core confusion running through the entire current ecosystem is treating credentials as a substitute for identity. They are not the same thing.
A credential proves that access was granted. An OAuth token says: a user once approved this scope. An API key says: whoever has this secret was provisioned access. Neither encodes who the acting agent is, what authority it was given, what task it is executing, or whether that task is still within the bounds of what was intended.
In human-facing systems, identity and credentials are loosely coupled because humans are present to provide context. A user logs in, their session carries their identity, their actions are naturally attributed. Agents have no persistent presence. Each execution is stateless with respect to the systems it calls. Without an explicit identity layer, those calls are attributed to a token — and a token is not an agent.
The industry is shipping credentials where it needs identity. The gap is small enough to ignore when agents are narrow tools running under direct supervision. It becomes structural when agents are autonomous, long-running, and acting across systems with real consequences. That is exactly where AI development is heading.
Emerging answers
The problems above are not yet solved, but some pieces are taking shape.
Passport (agentr.dev) is a concrete attempt at agent-native authentication:
each agent receives a W3C did:key identifier that encodes its
Ed25519 public key, and every protected resource request requires both an
access token and a fresh proof of private key possession via DPoP (RFC 9449).
The agent proves it controls its signing key on each request — not merely
that it once received a shared secret.
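For orientation, the Python sketch below assembles the header and claims that a DPoP proof carries under RFC 9449, using the JOSE parameters for an Ed25519 key. It is not Passport's implementation: the function name is hypothetical, and the signing step is deliberately omitted because Ed25519 signing needs a third-party library.

```python
import base64
import hashlib
import time
import uuid
from urllib.parse import urlsplit, urlunsplit


def b64url(data):
    """Base64url without padding, as used throughout JOSE."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()


def dpop_proof_parts(method, url, access_token, public_key, now=None):
    """Build the unsigned header and payload of a DPoP proof JWT."""
    parts = urlsplit(url)
    # htu is the request URI without query string or fragment.
    htu = urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
    header = {
        "typ": "dpop+jwt",
        "alg": "EdDSA",
        "jwk": {"kty": "OKP", "crv": "Ed25519", "x": b64url(public_key)},
    }
    payload = {
        "jti": str(uuid.uuid4()),  # unique per proof, lets the server reject replays
        "htm": method.upper(),
        "htu": htu,
        "iat": int(time.time() if now is None else now),
        # ath binds the proof to the access token it accompanies.
        "ath": b64url(hashlib.sha256(access_token.encode("ascii")).digest()),
    }
    return header, payload
```

Because jti, htm, htu, and iat change with every request, a captured proof cannot simply be replayed against a different endpoint or at a later time, which is what separates this from presenting a bearer secret.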
Delegation is modeled through RFC 8693 act claims: the token
carries the principal (the user or system the agent acts for), the acting
agent's DID, and the scopes authorized for that delegation. Access tokens
are bound to specific resource servers via the aud claim and
carry no expiry — revocation is enforced through registry status (UNCLAIMED,
CLAIMED, REVOKED) rather than time-based invalidation. Agent handles provide
stable human-readable labels that survive key rotation without disrupting
downstream references.
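The delegation structure can be sketched with a plain dictionary. The claim names sub, act, aud, and scope follow RFC 8693; every value below, including the did:key string, is a placeholder, and the helper delegation_chain is hypothetical rather than part of Passport.

```python
# Placeholder token claims: a user delegating calendar access to an agent,
# which was itself dispatched by an orchestrator.
EXAMPLE_CLAIMS = {
    "aud": "https://calendar.example.com",
    "sub": "user:alice",                    # the principal the agent acts for
    "scope": "calendar:read",
    "act": {
        "sub": "did:key:z6MkEXAMPLE",       # the acting agent (placeholder DID)
        "act": {"sub": "orchestrator"},     # an earlier actor in the chain
    },
}


def delegation_chain(claims):
    """Return [principal, current actor, earlier actors...] from nested act claims."""
    chain = [claims["sub"]]
    actor = claims.get("act")
    while isinstance(actor, dict):
        chain.append(actor.get("sub"))
        actor = actor.get("act")
    return chain
```

Here delegation_chain(EXAMPLE_CLAIMS) yields the principal first, then the current actor, then any earlier actors, which is the information a bare API key has no place to carry.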
This approaches but does not complete the answer to the two questions above.
It establishes who the agent is (a cryptographic identifier) and who it acts
for (the act chain). What it does not yet provide is
task-scoped dynamic credentials — the agent's identity is stable across
invocations rather than issued per-task execution. The delegation chain is
expressed but not infrastructure-enforced as a narrowing constraint. These
remain open gaps. Passport demonstrates the shape of an answer; the full
answer is still being built.
A different piece is addressed by Anthropic's Workload Identity
Federation (WIF). Where Passport handles agent-to-service authentication,
WIF handles workload-to-Claude API authentication: workloads exchange
short-lived OIDC tokens from identity providers they already operate
(AWS IAM, GCP, Azure, GitHub Actions, Kubernetes service accounts, SPIFFE
JWT-SVIDs) for short-lived Anthropic access tokens via the RFC 7523
jwt-bearer grant. Service accounts (svac_...) are named
non-human identities inside an organization; a federation rule maps
specific workload claims to a service account and scope, so access is
constrained by what the rule permits, not by who holds a shared secret.
Static sk-ant-... API keys are replaced by tokens that
expire in minutes.
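The exchange step has a generic shape defined by RFC 7523, sketched below in Python. Only the grant type URN and the assertion parameter come from that RFC; the function name is hypothetical, and the token endpoint and any provider-specific parameters are not described here, so none are shown.

```python
from urllib.parse import parse_qs, urlencode

# Grant type identifier defined by RFC 7523.
JWT_BEARER_GRANT = "urn:ietf:params:oauth:grant-type:jwt-bearer"


def jwt_bearer_request_body(workload_jwt, scope=""):
    """Form-encode a token request that presents a workload's identity JWT."""
    params = {"grant_type": JWT_BEARER_GRANT, "assertion": workload_jwt}
    if scope:
        params["scope"] = scope
    return urlencode(params)
```

The body is sent as application/x-www-form-urlencoded to the provider's token endpoint. The workload never holds a long-lived secret: the assertion is a token its own platform minted moments earlier, and what comes back expires soon after.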
WIF answers the credential lifecycle problem (rotation, expiry, no shared secrets) but only at the Claude API layer. It does not answer who the agent is when calling third-party services, nor does it carry delegation context through the agent's tool calls. Managed Agents' security architecture addresses a third piece: because the stateless harness runs outside the sandboxes, credentials never appear in the model's context window, regardless of what a prompt injection attempts. Each layer solves a different slice of the same underlying problem. None of them, individually or together, yet delivers the full task-scoped delegation chain the problem requires.
Reading
Passport Whitepaper (agentr.dev). Agent authentication using W3C did:key identifiers, Ed25519 key pairs, and DPoP (RFC 9449) per-request proof of possession. Delegation modeled with RFC 8693 act claims. A concrete implementation of agent-native authentication that moves beyond shared secrets toward cryptographically verifiable agent identity.
Workload Identity Federation (Anthropic). OIDC-based token exchange (RFC 7523) replacing static API keys: workloads on AWS, GCP, Azure, GitHub Actions, Kubernetes, SPIFFE, or Okta exchange their platform identity token for a short-lived Anthropic access token scoped to a service account. Eliminates static secrets from the Claude API surface.
Scaling Managed Agents: Decoupling the Brain from the Hands (Anthropic Engineering, April 2026). The security architecture that keeps credentials out of the model's context: harness outside sandboxes, resource-bundled initialization auth, and vault-stored credentials via proxy. Security by structure, not by prompt.
xurl (Twitter/X API CLI). OAuth 2.0 PKCE with tokens stored in ~/.xurl. A concrete example of OAuth adapted for a non-interactive tool.
resend-cli (Resend email service CLI). Three-tier API key resolution with no rotation or agent-awareness. The standard pattern.
Google Workspace CLI. Explicitly documents browser automation as the agent auth path. Makes the gap visible.
ahrefs.js. Minimal agent tool auth: environment variable, Bearer token, done. The simplest version of the pattern in production.
RFC 6749: The OAuth 2.0 Authorization Framework. The original spec. Read the abstract and Section 1 to see how explicitly human-centric the design assumptions are.
SPIFFE / SPIRE. Workload identity for infrastructure. Not agent-specific, but demonstrates that the problem of non-human identity at runtime is understood and solvable in adjacent domains.