🇺🇸 United States • June 20, 2026

Agent frameworks create workflows. Production needs run receipts.

Originally published by Dev.to

Everyone is comparing agent frameworks: LangGraph, CrewAI, AutoGen, OpenAI Agents SDK, Claude Code, Codex, MCP routers, custom harnesses.

That comparison matters, but it misses the layer that starts hurting once the demo works.

The framework creates the workflow. It does not automatically answer:

what is installed and running locally?
which tools, MCP servers, skills, and providers are mounted?
what repo, files, or workspace state were in scope?
what did the agent change?
which actions created side effects?
which actions required approval, warning, redaction, block, or review?
what evidence came from tests, evals, traces, or browser checks?
what can be retried, resumed, rolled back, or cleaned up safely?

That is the layer we are building Armorer for: a local control plane around agents.

The split we are converging on:

Armorer: sessions, jobs, tool inventory, config, approvals, run records, and recovery
Armorer Guard: fast runtime decisions on proposed tool calls and model/tool-output transitions
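To make "runtime decisions on proposed tool calls" concrete, here is a minimal sketch of what such a check could look like. It is not Armorer Guard's actual API: the names `ToolCall`, `Decision`, `decide`, `BLOCKED_TOOLS`, and `SECRET_KEYS` are hypothetical, and the outcome names loosely mirror the approval, warning, redaction, and block cases listed earlier.

```python
import posixpath
from dataclasses import dataclass
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    WARN = "warn"
    REDACT = "redact"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


@dataclass
class ToolCall:
    tool: str               # hypothetical tool name, e.g. "write_file"
    args: dict              # arguments the agent proposes to pass
    has_side_effects: bool  # True if the call can change state outside the model


# Hypothetical policy inputs; a real setup would presumably load these from config.
BLOCKED_TOOLS = {"drop_database"}
SECRET_KEYS = {"api_key", "token", "password"}


def is_within(path: str, workspace: str) -> bool:
    """Lexical check only: does `path` resolve inside `workspace`? Ignores symlinks."""
    root = posixpath.normpath(workspace)
    target = posixpath.normpath(posixpath.join(root, path))
    return target == root or target.startswith(root + "/")


def decide(call: ToolCall, workspace: str) -> Decision:
    """Return one decision for a proposed tool call, checking the strictest rules first."""
    if call.tool in BLOCKED_TOOLS:
        return Decision.BLOCK
    if SECRET_KEYS.intersection(call.args):
        return Decision.REDACT
    path = call.args.get("path")
    if isinstance(path, str) and not is_within(path, workspace):
        return Decision.REQUIRE_APPROVAL
    if call.has_side_effects:
        return Decision.WARN
    return Decision.ALLOW
```

The ordering is the design choice worth arguing about: a blocked tool wins over everything else, and a side-effecting call inside the workspace only produces a warning. Whether that is the right default depends on the harness.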

The goal is not to replace agent frameworks. It is to make agents operable once they exist.

The artifact I keep coming back to is a run receipt.

A useful agent run receipt should capture:

the agent/app, version, and config
the mounted tools, MCP servers, skills, and providers
the workspace/repo/files in scope
checkpoints before and after the run
tool calls and side effects
approval and review decisions
test/eval/check evidence
retry, resume, rollback, and cleanup state
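The fields above map fairly directly onto a small record type. The following is a sketch under assumptions, not Armorer's actual schema: `RunReceipt`, `ToolCallRecord`, every field name, and all example values (agent name, checkpoint IDs, provider) are hypothetical.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Optional


@dataclass
class ToolCallRecord:
    tool: str
    side_effect: bool  # did this call change anything outside the model?
    decision: str      # e.g. "allow", "approved", "blocked"


@dataclass
class RunReceipt:
    agent: str
    agent_version: str
    config: dict
    mounted: dict                  # tools, MCP servers, skills, providers
    workspace: str                 # repo or directory in scope
    checkpoint_before: str
    checkpoint_after: Optional[str] = None
    tool_calls: list = field(default_factory=list)  # ToolCallRecord items
    approvals: list = field(default_factory=list)   # approval and review decisions
    evidence: list = field(default_factory=list)    # test, eval, and check results
    recovery: dict = field(default_factory=dict)    # retry/resume/rollback/cleanup state

    def side_effects(self) -> list:
        """Return only the tool calls that were recorded as having side effects."""
        return [c for c in self.tool_calls if c.side_effect]

    def to_json(self) -> str:
        """Serialize the whole receipt, nested records included, to stable JSON."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Example usage with made-up values.
receipt = RunReceipt(
    agent="example-agent",
    agent_version="0.1.0",
    config={"model": "example-model"},
    mounted={"tools": ["shell", "read_file"], "mcp_servers": [],
             "skills": [], "providers": ["example-provider"]},
    workspace="/repo",
    checkpoint_before="abc123",
)
receipt.tool_calls.append(ToolCallRecord("read_file", False, "allow"))
receipt.tool_calls.append(ToolCallRecord("shell", True, "approved"))
receipt.evidence.append({"kind": "tests", "passed": True})
receipt.checkpoint_after = "def456"
receipt.recovery = {"retry_safe": False, "rollback_to": receipt.checkpoint_before}
print(receipt.to_json())
```

Keeping the receipt as plain serializable data means it can be diffed, archived, and queried without replaying the transcript.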

Without this, debugging agent runs turns into transcript archaeology.

With it, operating agents starts to feel more like operating software again.

Repos:

Armorer: https://github.com/ArmorerLabs/Armorer
Armorer Guard: https://github.com/ArmorerLabs/Armorer-Guard

Questions I would love feedback on:

What is the minimum useful run receipt for an agent session?
Which approval events should become first-class history?
Where should MCP/tool metadata stop and runtime policy begin?
What recovery action do you wish your agent harness exposed after a bad run?

Disclosure: I am building Armorer and Armorer Guard.
