TL;DR: I built a 3-LLM code reviewer (Claude + GPT-5 + Gemini, deliberating with each other). My synthetic-bug
benchmark shows about 3× the depth (10.93 vs 3.80 blockers flagged per PR, roughly 2.9×) at the same catch rate as
Claude alone. But 15 synthetic PRs is not enough. I need YOUR PRs to validate or kill the hypothesis.
Background:
Six months ago, solo Claude review kept rating things I considered blockers as "minor". I tried adding more
models in parallel, plus a deliberation step. Results on my private corpus:
- Claude alone: 3.80 blockers/PR
- 3-agent council: 10.93 blockers/PR
- Both: 100% catch rate on the synthetic bugs
The pattern I found after debugging the gap: one model overlooks a missing test that another catches, and an issue
Claude rates "minor" gets rated a blocker by Gemini. A single agent has no second perspective.
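To make that escalation effect concrete, here is a minimal sketch, not Conclave's actual deliberation logic. It assumes each reviewer emits findings keyed by a shared issue id and simply keeps the harshest severity any reviewer assigned. Every name in it (Finding, mergeFindings, the issue ids, the severity scale) is hypothetical and made up for illustration.

```typescript
// All names below are hypothetical; none are taken from the Conclave codebase.
type Severity = "nit" | "minor" | "major" | "blocker";

interface Finding {
  reviewer: string; // e.g. "claude", "gpt", "gemini"
  issueId: string;  // assumed stable key identifying "the same issue"
  severity: Severity;
}

interface MergedFinding {
  issueId: string;
  severity: Severity; // highest severity any reviewer assigned
  raisedBy: string[]; // every reviewer that reported the issue
}

const RANK: Record<Severity, number> = { nit: 0, minor: 1, major: 2, blocker: 3 };

function mergeFindings(findings: Finding[]): MergedFinding[] {
  const byIssue: Record<string, MergedFinding> = {};
  for (const f of findings) {
    const existing = byIssue[f.issueId];
    if (!existing) {
      byIssue[f.issueId] = { issueId: f.issueId, severity: f.severity, raisedBy: [f.reviewer] };
      continue;
    }
    // Escalate: keep the harshest rating seen so far.
    if (RANK[f.severity] > RANK[existing.severity]) existing.severity = f.severity;
    if (existing.raisedBy.indexOf(f.reviewer) === -1) existing.raisedBy.push(f.reviewer);
  }
  return Object.keys(byIssue).map((id) => byIssue[id]);
}

// Made-up sample data mirroring the two cases described above.
const sampleFindings: Finding[] = [
  { reviewer: "claude", issueId: "unvalidated-input", severity: "minor" },
  { reviewer: "gemini", issueId: "unvalidated-input", severity: "blocker" },
  { reviewer: "gpt", issueId: "missing-test", severity: "blocker" },
];

const claudeOnlyBlockers = sampleFindings.filter(
  (f) => f.reviewer === "claude" && f.severity === "blocker"
).length;
const councilBlockers = mergeFindings(sampleFindings).filter(
  (m) => m.severity === "blocker"
).length;

console.log({ claudeOnlyBlockers, councilBlockers }); // { claudeOnlyBlockers: 0, councilBlockers: 2 }
```

A max-severity merge like this can only raise the blocker count, which is exactly why a higher blockers/PR number needs checking against false positives on real PRs.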
The bigger feature is PRD-aware review: with a spec at .conclave/prd.md, the agents flag deviations from it as
first-class blockers, e.g. scope creep, route mismatches, forgotten acceptance criteria.
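As an illustration of what "route mismatch" and "scope creep" mean here, the sketch below does a plain set comparison between routes a spec calls for and routes a PR touches. This is an assumption-heavy toy: the post does not describe the format of .conclave/prd.md or how the agents read it, and the function name, types, and routes are all invented for the example.

```typescript
// Hypothetical shapes; the real .conclave/prd.md format is not described in this post.
interface SpecDiff {
  missing: string[]; // in the spec, absent from the PR: possibly forgotten criteria
  extra: string[];   // in the PR, absent from the spec: possible scope creep
}

function diffAgainstSpec(specRoutes: string[], prRoutes: string[]): SpecDiff {
  const missing = specRoutes.filter((r) => prRoutes.indexOf(r) === -1);
  const extra = prRoutes.filter((r) => specRoutes.indexOf(r) === -1);
  return { missing, extra };
}

// Made-up example: the spec asks for three routes, the PR implements a different third one.
const specDiff = diffAgainstSpec(
  ["GET /orders", "POST /orders", "GET /orders/:id"],
  ["GET /orders", "POST /orders", "DELETE /orders/:id"]
);

console.log(specDiff.missing); // [ 'GET /orders/:id' ]
console.log(specDiff.extra);   // [ 'DELETE /orders/:id' ]
```

An LLM reviewer would be judging this from prose and diffs rather than exact string matches, so treat the sketch as a definition of the categories, not as the mechanism.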
What I need:
- Run it on a real PR and tell me where it's wrong
- Compare it against your usual reviewer (Claude / Cursor / human)
- Send me false positives; I'll incorporate them into the federated failure catalog
How:
- Demo (3 free/day): https://conclave-ai.dev/#try
- GitHub App + BYO key = unlimited free
- CLI: npm i -g @conclave-ai/cli
Source-available (FSL-1.1-Apache-2.0): https://github.com/seunghunbae-3svs/conclave-ai
Stack: TS / Node 20 / Cloudflare Workers + Containers + D1 / Mastra. 26 packages, 2691 tests.
Limitations I know:
- Beta, things break
- Cost scaling on large diffs untested
- Spec-mismatch only useful if you maintain a PRD
- I'm one person + Claude pair-programming — bus factor 1
If the numbers don't survive contact with real codebases, I want to know. Poke holes.