Benchmarks

By Parsa Khazaeepoul, co-founder of Pane. Tested every agent manager in this comparison set in production. Last reviewed May 2026.

We publish reproducible measurements of agent managers each quarter. Methodology is pre-registered before any numbers are taken; raw logs ship as JSON in the public kit repo; corrections route through GitHub issues, not email. Pane is one of the tools measured — we publish the rows it loses on the same page as the ones it wins.

Methodology

The rules of measurement — pre-registered before any data is captured. Metric definitions, the process-set rule, statistical handling, the pinned agent and model.

Q2 2026 results scheduled

Methodology is pre-registered. Measurements in progress — raw data lands soon. We're not linking the page from the hub until real numbers replace the placeholders.

Reproduce

Step-by-step on how to clone the kit, run the scripts, and submit your results back as a PR. Anyone with a modern laptop and an Anthropic API key can re-run.

The kit (github.com/dcouple/agent-manager-benchmark)

Public 5-package TypeScript monorepo, OS-agnostic measurement scripts, per-manager workflow walkthroughs, raw logs per run. MIT licensed so anyone can fork.

- - - - - - - - - - - - - - - -

Context on the category lives on the compare pages; longer-form perspective on why this exists is on the author page.

Download*

Windows SmartScreen warningDirect downloads can show a SmartScreen warning while Pane is unsigned. Pane is fully open source, so you can audit the code and build from source yourself.1. Click More info2. Click Run anyway3. Continue the installerThe PowerShell install downloads the official release directly and avoids most browser download friction.npm global install

Paste that in PowerShell.MacLinux