TL;DR: Your agent lives for 2 minutes. Its credential lives for 60. That mismatch is your attack surface. A broker that issues task-scoped, short-lived credentials closes the gap before the sprawl starts.
AI agents are still new. Most teams are just now deploying their first agents at scale. 2026 is year one. And a lot of the identity conversation already assumes the mess exists: registries, inventories, entitlement reviews, cleanup workflows.
But the mess is not inevitable. It's a choice you make at the beginning.
If you start with a broker where every agent gets a short-lived, task-scoped credential at spawn time, the individual agent credential doesn't have to become another long-lived thing you track forever.
This is the prevention argument: govern the things that persist, but issue ephemeral credentials to the things that don't.
The Problem Nobody Talks About
Right now, most teams are credentialing their agents one of three ways:
- Shared service account with a static API key. Every agent uses the same key. When one gets compromised, you rotate the key and everything breaks.
- OAuth token with a 15-60 minute TTL. The agent runs for a short task, but the credential stays valid much longer.
- Broad IAM role assigned "just in case." Scoped wide enough to handle every possible task. When an agent gets compromised, it has access to everything.
The common thread: credentials outlive the work. The agent is ephemeral. The credential is not. That mismatch is your attack surface.
The Math on Credential Exposure
Let's make it concrete.
| Approach | Agent Lifetime | Credential Lifetime | Exposure Window |
|---|---|---|---|
| Static API key | 2 minutes | Forever | Forever |
| OAuth token | 2 minutes | 60 minutes | 30x agent lifetime |
| Broker (task-scoped) | 2 minutes | Short TTL + release/revocation | Close to task lifetime |
At scale, the difference is not academic. The exact numbers depend on your workload, TTLs, and renewal policy, but the shape of the risk is the same.
Every 2-minute agent task backed by a 60-minute token leaves 58 extra minutes where a stolen credential is still useful. Multiply that across thousands of agent runs and you're generating thousands of hours of valid-but-idle credential time every single day.
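A quick back-of-envelope makes the scale visible. The runs-per-day figure below is an assumption for illustration; substitute your own workload:

```python
# Back-of-envelope exposure math; RUNS_PER_DAY is an illustrative assumption.
TASK_MINUTES = 2
TOKEN_TTL_MINUTES = 60
RUNS_PER_DAY = 5_000

excess_per_run = TOKEN_TTL_MINUTES - TASK_MINUTES  # 58 minutes per run
excess_per_day = excess_per_run * RUNS_PER_DAY     # 290,000 minutes per day
print(f"{excess_per_day / 60:,.0f} hours of valid-but-idle credential per day")
# ~4,833 hours: each one a window where a stolen token still works.
```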
When a credential gets stolen, the attacker doesn't get access to what the agent was doing. They get access to everything that credential could do, for as long as it stays valid.
Broker vs. Registry: Two Philosophies
Registry model: Persistent systems, applications, owners, policies, and audit trails get registered and governed. That's useful. But if every short-lived agent instance also becomes a persistent identity record, you accumulate thousands of identities, entitlements, and cleanup tasks.
At that point, the registry's value proposition becomes "we'll help you manage the sprawl."
Broker model: Every agent gets a credential at spawn. The credential is scoped to exactly what that task needs. It has a short TTL and can be released or revoked when the work is done. The persistent governance layer still exists above the agent, but the per-agent credential doesn't become a standing entitlement.
The broker assumes at least some sprawl is preventable. Its value proposition is "don't create long-lived agent credentials in the first place."
Prevention is usually cheaper than cleanup. Fewer stale identities. Fewer periodic access reviews. Fewer "why did this old agent still have access?" incidents.
What It Looks Like in Code
Same prompt, same LLM, same decision. Three different ways to credential the agent that executes it.
All three examples start here:
```python
import os

import httpx
import requests
from openai import OpenAI

llm = OpenAI()

# The system prompt defines what this agent is and what tools it can call.
system_prompt = """You are a customer support agent. You have these tools:
- lookup_billing: Fetch billing history for a customer
- edit_account: Edit a customer's account info
- lookup_billing_all: Fetch billing history across all customers
Use the appropriate tool based on the customer's request."""

# Minimal tool schemas so this snippet runs end to end.
tools = [
    {"type": "function",
     "function": {"name": name, "description": name,
                  "parameters": {"type": "object", "properties": {}}}}
    for name in ("lookup_billing", "edit_account", "lookup_billing_all")
]

# A request arrives. The LLM decides what tool to call.
response = llm.chat.completions.create(
    model="gpt-4.1",
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": "What's the billing history for customer 12345?"},
    ],
    tools=tools,
)

# The LLM chose lookup_billing for customer 12345.
# But it COULD have chosen lookup_billing_all and pulled every customer.
# Now: how does the agent get access?
```
Static API key (what most teams do today):
```python
# The LLM only needs customer 12345.
# But the key can read ALL customers. And it never expires.
api_key = os.environ["SHARED_API_KEY"]
headers = {"Authorization": f"Bearer {api_key}"}
result = requests.get("https://api.example.com/customers", headers=headers)

# What if the LLM hallucinates a different tool call next time?
# Doesn't matter. Same key. Same access. To everything.
# No validation. No expiry. No way to know which agent used it.
```
OAuth token (better, but still mismatched):
```python
# The LLM only needs customer 12345.
# But the token is scoped to read:customers:* and lives for 60 minutes.
# get_oauth_token stands in for your OAuth client-credentials flow.
token = get_oauth_token(
    client_id=os.environ["CLIENT_ID"],
    client_secret=os.environ["CLIENT_SECRET"],
    scope="read:customers:*",
)
headers = {"Authorization": f"Bearer {token}"}
result = requests.get("https://api.example.com/customers", headers=headers)

# Agent is done in 2 minutes. Token is valid for 58 more.
# No way to revoke it early. No way to scope it to one customer.
# We don't know what the LLM might decide next. The token doesn't care.
```
Broker (task-scoped, ephemeral):
```python
from agentwrit import AgentWritApp, validate
from agentwrit.errors import AuthorizationError

app = AgentWritApp(
    broker_url=os.environ["AGENTWRIT_BROKER_URL"],
    client_id=os.environ["AGENTWRIT_CLIENT_ID"],
    client_secret=os.environ["AGENTWRIT_CLIENT_SECRET"],
)

# The LLM chose customer 12345. Create an agent scoped to exactly that.
try:
    agent = app.create_agent(
        orch_id="billing-agent",
        task_id="billing-12345",
        requested_scope=["read:data:customer-12345"],
    )
except AuthorizationError as e:
    # Broker says no. Scope exceeds what this app is allowed to issue.
    print(e.problem.detail)      # "scope exceeds app ceiling"
    print(e.problem.error_code)  # "scope_violation"
    raise

# Any service can verify the token independently.
result = validate(app.broker_url, agent.access_token)
print(result.claims.scope)  # ['read:data:customer-12345'], nothing else

# Use it.
resp = httpx.get(
    "https://api.example.com/customers/12345/billing",
    headers=agent.bearer_header,
)

# Done. Kill the token at the broker. Right now. Not in 58 minutes.
agent.release()
```
The LLM decided what to do. The broker scoped the credential to exactly that. If the scope was too broad, the broker denied it before the agent ever touched data. Any downstream service can call validate() to verify the token independently. And when the task ends, release() kills the credential immediately.
If the same agent gets a different prompt tomorrow, it gets a different scope. That's the point. Without the broker, you'd have to pre-assign permissions broad enough to cover every possible LLM decision.
Multi-Agent Delegation: The Attack Vector Nobody Is Talking About
This is where it gets interesting.
Most serious agent deployments use multiple agents working together. Agent A researches. Agent B drafts. Agent C reviews. Agent D publishes. The output of one agent becomes the input of the next.
The problem: How does Agent A give Agent B permission to act on its behalf?
The naive approach: Agent A shares its credential with Agent B. Now Agent B has Agent A's permissions. If Agent A could read all customers, so can Agent B. Permissions expanded. This is credential escalation, and it's trivially easy in most agent architectures.
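In code, the anti-pattern is a one-liner. This is a generic sketch, with `agent_a_token` standing in for whatever credential Agent A happens to hold:

```python
# Credential escalation in one line: Agent B reuses Agent A's token.
# agent_a_token is illustrative; it is whatever bearer token A was issued.
agent_b_headers = {"Authorization": f"Bearer {agent_a_token}"}
result = requests.get("https://api.example.com/customers", headers=agent_b_headers)
# Nothing narrowed, nothing verified, no record that a delegation happened.
```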
The registry-only approach: Agent B gets its own standing identity and permissions, and you rely on governance later to prove that those permissions are still correct.
The broker approach: delegation chain verification.
When Agent A delegates to Agent B, it passes a token that says:
"Agent A authorized Agent B to act on its behalf, with scope exactly equal to Agent A's current scope. Agent B cannot escalate. The delegation is cryptographically signed and time-bounded."
If Agent A had read:data:customer-12345, Agent B gets read:data:customer-12345. Not read:data:*. Not write:data:customer-12345. Exactly what Agent A had, nothing more.
The delegation chain is a series of signed tokens. Each link is bound to the previous one, so resource layers can verify the lineage instead of treating each delegated token as unrelated.
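The invariant every hop has to satisfy is simple to state. The sketch below is the scope check in isolation, a conceptual illustration rather than the AgentWrit verification code; the signature and time-bound checks are elided:

```python
# Conceptual sketch of the delegation invariant, independent of any SDK:
# at every hop, the child's scope must be a subset of the parent's.
# Real scope grammars need wildcard-aware matching; literal string
# comparison is the simplest case and enough to show the property.
def delegation_is_valid(parent_scope: set[str], child_scope: set[str]) -> bool:
    return child_scope <= parent_scope

parent = {"read:data:customer-12345"}

print(delegation_is_valid(parent, {"read:data:customer-12345"}))   # True: same scope
print(delegation_is_valid(parent, {"read:data:*"}))                # False: broader
print(delegation_is_valid(parent, {"write:data:customer-12345"}))  # False: new verb
```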
This isn't just a feature. It's the security property I care about most: delegation should preserve or reduce authority, never expand it.
Why Now
2026 is year one for agent deployment at scale. Most teams are figuring this out right now. The architectural decisions made in the next 12 months will persist for years.
If you bake in long-lived agent credentials today, you'll spend the next two years cleaning them up. Access reviews. Entitlement audits. "Who had access to what when" forensics. Or it just doesn't get cleaned up at all. The enterprise vendors will sell you tools to manage the mess, because the mess will be real.
If you start with a broker, you still need governance for the persistent systems around the agent. But the short-lived agent instance doesn't need to leave behind a standing credential.
The registry vendors aren't wrong that sprawl is a problem. I think they're too quick to assume all of it is inevitable.
The Argument, Not the Pitch
I'm not here to sell you AgentWrit. I'm here to argue that the credential model you choose today determines your security posture for the next five years.
If you start with long-lived credentials and registry-managed sprawl, you're choosing a future of cleanup, audits, and accumulated risk.
If you start with ephemeral, task-scoped credentials, you're choosing a future where credentials don't outlive the work, where delegation doesn't escalate, and where the individual agent instance doesn't become a permanent entitlement.
The broker model isn't new. It's how cloud-native systems have handled short-lived compute for years. VMs get credentials at boot. Containers get credentials at start. Serverless functions get credentials per invocation. The credential dies with the compute.
Agents are just compute that happens to be intelligent. The same principle applies.
What I Built
I built AgentWrit because I needed this for my own agent deployments. It's a self-hosted credential broker for AI agents, source-available under PolyForm Internal Use for internal deployments.
- Ephemeral identity: Every agent spawns with a unique cryptographic identity
- Task-scoped tokens: Scoped to exactly what the task needs, not broad IAM roles
- Short-lived credentials: Tokens expire in minutes and can be released or revoked early
- Four-level revocation: Token, agent, task, or full delegation chain
- Delegation chain verification: Permissions cannot expand at each hop, cryptographically enforced
It's written in Go. Runs with Docker. The broker is source-available under PolyForm Internal Use 1.0.0; the Python SDK is MIT-licensed and live on PyPI.
GitHub: https://github.com/devonartis/agentwrit
Python SDK: https://github.com/devonartis/agentwrit-python
Security pattern (CC BY-SA 4.0): https://github.com/devonartis/AI-Security-Blueprints
The pattern is aligned with OWASP Agentic Top 10 (2026), NIST IR 8596, and IETF WIMSE. It's published separately because the architecture matters more than any single implementation.
The full security architecture is also published as a preprint on Zenodo.
Try It in 10 Minutes
Pull the broker Docker image, install the Python SDK, and run one of the two demos end to end.
MedAssist is a FastAPI clinical assistant. You ask a plain-language question about a patient; an LLM picks tools (records, labs, billing, prescriptions); the app spawns broker agents on demand, each scoped to one patient and one category. Cross-patient questions are denied. Prescription writes flow through a delegation chain.
demo/README.md has run instructions, a scenario playbook, and a code map.
Support Tickets is a three-agent pipeline built with Flask + HTMX + SSE. Three LLM-driven agents (triage, knowledge, response) process customer tickets. Anonymous tickets halt at triage. Dangerous tools like delete_account and send_external_email are in the LLM's tool list but not in the agent's scope, so they never execute. One scenario deliberately skips release() to watch a 5-second TTL die on its own.
demo2/README.md has run instructions, five scenarios, and a code map.
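The scope-gating the Support Tickets demo relies on is a pattern worth copying: the LLM may pick any tool in its list, but execution is gated on the agent's scope. A minimal sketch with hypothetical tool and scope names, not the demo's actual code:

```python
# The LLM can choose any tool it sees, but execution checks scope first.
# Tool and scope names here are hypothetical, not the demo's actual code.
TOOL_SCOPES = {
    "delete_account": "write:account:delete",
    "send_external_email": "send:email:external",
    "lookup_ticket": "read:tickets",
}

def execute_tool(tool_name: str, agent_scope: set[str]) -> None:
    required = TOOL_SCOPES[tool_name]
    if required not in agent_scope:
        # The tool was in the LLM's list, but this agent can't run it.
        raise PermissionError(f"{tool_name} requires scope '{required}'")
    print(f"running {tool_name}")

execute_tool("lookup_ticket", {"read:tickets"})   # runs
execute_tool("delete_account", {"read:tickets"})  # raises PermissionError
```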
The Question
How are you credentialing your agents today?
If the answer involves shared API keys, long-lived OAuth tokens, or broad IAM roles, you might be building the mess that registry vendors will later sell you tools to manage.
Start with prevention. It's cheaper to avoid standing agent credentials than to clean them up later.
Devon Artis. Principal Security Engineer. CSA AI Controls Matrix contributor. Published the Ephemeral Agent Credentialing pattern as a preprint on Zenodo. Building AgentWrit. One person, no VC.