DRAFT — measurements in progress. Numbers below are placeholders. The methodology is final; raw data lands when the operator completes the Q2 sweep. This page is currently noindex'd.

Q2 2026 benchmark run

By Parsa Khazaeepoul, co-founder of Pane, who has tested every agent manager in this comparison set in production.

Seven agent managers — Pane, Conductor, Superset, Emdash, Crystal, Claude Squad, and cmux — measured on five pre-registered metrics across macOS, Windows, and Linux. The methodology was published before any measurements were taken; raw logs are in the public kit repository. Disagreements get a correction issue, not an email.

See the methodology page for metric definitions, the process-set rule, statistical handling, and the pinned agent/model.

macOS

Memory at N=4 parallel agents

Total RSS of the launcher PID and every descendant process after four panes are active, each running a Claude agent. Measured via the process-set rule (see methodology).

Memory at N=4 parallel agents (MB) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials

If you stack an IDE and a browser on a 16GB laptop, this number is the difference between idle headroom and swap. Pending real measurements — see DRAFT banner above.
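The process-set rule itself is defined on the methodology page. As a rough illustration of what "the launcher PID and every descendant process" means, the sketch below walks a child-to-parent map taken from a single process-table snapshot and sums RSS over the resulting set. The function names and the sample snapshot are hypothetical, and this is not the kit's capture code. Summing per-process RSS also counts shared pages more than once, so treat the total as an upper-bound approximation.

```python
from collections import defaultdict

MIB = 1024 * 1024


def process_set(launcher_pid, parent_of):
    """Return the launcher PID plus every descendant PID.

    parent_of maps pid -> parent pid, captured from one snapshot
    of the process table (hypothetical input format).
    """
    children = defaultdict(list)
    for pid, ppid in parent_of.items():
        children[ppid].append(pid)

    seen = set()
    stack = [launcher_pid]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # guards against self-parented or cyclic entries
        seen.add(pid)
        stack.extend(children[pid])
    return seen


def total_rss_mb(launcher_pid, parent_of, rss_bytes):
    """Sum RSS (in MB) over the launcher and all of its descendants."""
    pids = process_set(launcher_pid, parent_of)
    return sum(rss_bytes.get(pid, 0) for pid in pids) / MIB


# Made-up snapshot: PID 100 is the launcher, PID 200 is unrelated.
parent_of = {100: 1, 101: 100, 102: 100, 103: 101, 200: 1}
rss_bytes = {100: 200 * MIB, 101: 150 * MIB, 102: 150 * MIB,
             103: 100 * MIB, 200: 999 * MIB}

print(total_rss_mb(100, parent_of, rss_bytes))
```

The unrelated process (PID 200) is excluded because it is not reachable from the launcher, which is the point of measuring a process set rather than a single PID.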

Disk overhead per worktree

Ratio of the manager's worktree directory size to the source repo size. Conductor's copy-checkout approach is expected near 3x; git-worktree-based managers near 1x.

Disk overhead per worktree (x) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials

Higher means each parallel session eats more disk. Matters on small SSDs or when running 8+ panes against a large monorepo.
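The ratio described above can be approximated with a directory walk. This is a minimal sketch, assuming "size" means the sum of apparent file sizes; the methodology may count disk blocks or treat hard links differently, and the function names here are hypothetical.

```python
import os


def dir_size_bytes(root):
    """Sum apparent file sizes under root, skipping symlinks."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def disk_overhead(worktree_dir, source_repo):
    """Ratio of the manager's worktree directory size to the source repo size."""
    source_size = dir_size_bytes(source_repo)
    if source_size == 0:
        raise ValueError("source repo is empty; ratio is undefined")
    return dir_size_bytes(worktree_dir) / source_size
```

A result near 1.0 would match the expectation stated above for git-worktree-based managers, and a result near 3.0 the expectation for a copy-checkout approach.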

Cold-start time

Wall-clock time from launching the app to a state ready to accept the first input. Operator-visual timing where the app exposes no programmatic ready event.

Cold-start time (ms) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials

Felt every single time you open the manager. Anything under ~1500ms reads as instant; above that, the wait is visible.
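Each table cell on this page reports a median with the min and max over five trials. A minimal sketch of that summary follows, assuming plain millisecond samples. The sample values are made up for illustration and are not measurements of any manager, and the 1500 ms cutoff is the rule of thumb from the paragraph above, not a constant taken from the methodology.

```python
from statistics import median


def summarize_trials(samples_ms):
    """Reduce per-trial timings to the median / min / max shown in each cell."""
    if not samples_ms:
        raise ValueError("at least one trial is required")
    return {
        "median": median(samples_ms),
        "min": min(samples_ms),
        "max": max(samples_ms),
        "trials": len(samples_ms),
    }


def reads_as_instant(median_ms, threshold_ms=1500):
    """Apply the rough 'under ~1500 ms reads as instant' rule of thumb."""
    return median_ms < threshold_ms


# Hypothetical cold-start samples (ms), NOT real benchmark data.
samples = [1320, 1280, 1410, 1295, 1350]
summary = summarize_trials(samples)
print(summary, reads_as_instant(summary["median"]))
```

Reporting the median alongside min and max keeps a single slow trial (for example, a cold disk cache) visible without letting it dominate the headline number.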

Steps from task to PR opened

Keystrokes plus clicks counted from the workflow/<manager>.md script in the kit repo. Single, fixed task: replace console.log with logger.info across five packages.

Steps from task to PR opened (actions) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials

Lower means less interface friction per shipped agent task. The cost compounds: a 30-step workflow repeated for 10 tasks a day adds up to 300 actions.
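The compounding arithmetic above is simple enough to check directly. The sketch below reproduces the 30-step, 10-tasks-a-day figure and compares it with a second workflow length; the 12-step figure is a hypothetical value chosen only for illustration, not a result for any manager.

```python
def daily_actions(steps_per_task, tasks_per_day):
    """Total keystrokes plus clicks spent per day on the task-to-PR workflow."""
    return steps_per_task * tasks_per_day


baseline = daily_actions(30, 10)   # the example from the text
shorter = daily_actions(12, 10)    # hypothetical shorter workflow

print(baseline, shorter, baseline - shorter)
```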

Time-to-status-awareness

Time from the moment the agent pauses for input or finishes to a user-visible signal (notification, color change, sound). Manual stopwatch where no event hook exists.

Time-to-status-awareness (ms) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials

How quickly you know an agent needs you. Matters in parallel — every second waiting on a finished agent is wall time you can't recover.
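Where an event hook does exist, the interval described above is just the gap between two timestamps on a monotonic clock. The class below is a hypothetical sketch of such a timer, not part of the kit: one call marks the moment the agent pauses or finishes, and the next returns the elapsed milliseconds when the user-visible signal appears. The clock is injectable so the logic can be checked without waiting in real time.

```python
import time


class StatusStopwatch:
    """Measure agent-paused -> signal-visible latency in milliseconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # monotonic, so wall-clock adjustments don't skew it
        self._started = None

    def agent_paused(self):
        """Call when the agent pauses for input or finishes."""
        self._started = self._clock()

    def signal_seen(self):
        """Call when the notification, color change, or sound is observed."""
        if self._started is None:
            raise RuntimeError("agent_paused() was not called first")
        elapsed_ms = (self._clock() - self._started) * 1000.0
        self._started = None
        return elapsed_ms


# Usage with a fake clock: two readings 0.25 s apart.
ticks = iter([10.0, 10.25])
stopwatch = StatusStopwatch(clock=lambda: next(ticks))
stopwatch.agent_paused()
latency_ms = stopwatch.signal_seen()
print(latency_ms)
```

With a manual stopwatch the same two moments are marked by the operator, so human reaction time is folded into the result.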

Windows

Memory at N=4 parallel agents

Total RSS of the launcher PID and every descendant process after four panes are active, each running a Claude agent. Measured via the process-set rule (see methodology).

Memory at N=4 parallel agents (MB) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | N/A | min · max · 5 trials | N/A | N/A | N/A

If you stack an IDE and a browser on a 16GB laptop, this number is the difference between idle headroom and swap. Pending real measurements — see DRAFT banner above.

Disk overhead per worktree

Ratio of the manager's worktree directory size to the source repo size. Conductor's copy-checkout approach is expected near 3x; git-worktree-based managers near 1x.

Disk overhead per worktree (x) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | N/A | min · max · 5 trials | N/A | N/A | N/A

Higher means each parallel session eats more disk. Matters on small SSDs or when running 8+ panes against a large monorepo.

Cold-start time

Wall-clock time from launching the app to a state ready to accept the first input. Operator-visual timing where the app exposes no programmatic ready event.

Cold-start time (ms) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | N/A | min · max · 5 trials | N/A | N/A | N/A

Felt every single time you open the manager. Anything under ~1500ms reads as instant; above that, the wait is visible.

Steps from task to PR opened

Keystrokes plus clicks counted from the workflow/<manager>.md script in the kit repo. Single, fixed task: replace console.log with logger.info across five packages.

Steps from task to PR opened (actions) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | N/A | min · max · 5 trials | N/A | N/A | N/A

Lower means less interface friction per shipped agent task. The cost compounds: a 30-step workflow repeated for 10 tasks a day adds up to 300 actions.

Time-to-status-awareness

Time from the moment the agent pauses for input or finishes to a user-visible signal (notification, color change, sound). Manual stopwatch where no event hook exists.

Time-to-status-awareness (ms) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | N/A | min · max · 5 trials | N/A | N/A | N/A

How quickly you know an agent needs you. Matters in parallel — every second waiting on a finished agent is wall time you can't recover.

Linux

Memory at N=4 parallel agents

Total RSS of the launcher PID and every descendant process after four panes are active, each running a Claude agent. Measured via the process-set rule (see methodology).

Memory at N=4 parallel agents (MB) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | N/A

If you stack an IDE and a browser on a 16GB laptop, this number is the difference between idle headroom and swap. Pending real measurements — see DRAFT banner above.

Disk overhead per worktree

Ratio of the manager's worktree directory size to the source repo size. Conductor's copy-checkout approach is expected near 3x; git-worktree-based managers near 1x.

Disk overhead per worktree (x) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | N/A

Higher means each parallel session eats more disk. Matters on small SSDs or when running 8+ panes against a large monorepo.

Cold-start time

Wall-clock time from launching the app to a state ready to accept the first input. Operator-visual timing where the app exposes no programmatic ready event.

Cold-start time (ms) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | N/A

Felt every single time you open the manager. Anything under ~1500ms reads as instant; above that, the wait is visible.

Steps from task to PR opened

Keystrokes plus clicks counted from the workflow/<manager>.md script in the kit repo. Single, fixed task: replace console.log with logger.info across five packages.

Steps from task to PR opened (actions) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | N/A

Lower means less interface friction per shipped agent task. The cost compounds: a 30-step workflow repeated for 10 tasks a day adds up to 300 actions.

Time-to-status-awareness

Time from the moment the agent pauses for input or finishes to a user-visible signal (notification, color change, sound). Manual stopwatch where no event hook exists.

Time-to-status-awareness (ms) | Pane | Conductor | Superset | Emdash | Crystal | Claude Squad | cmux
median | min · max · 5 trials | N/A | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | min · max · 5 trials | N/A

How quickly you know an agent needs you. Matters in parallel — every second waiting on a finished agent is wall time you can't recover.

hardware tested

cpu | memory | os
Apple M2 | 16GB | macOS 14.x
AMD Ryzen 5 7600 | 32GB | Windows 11
Intel i7-1260P | 32GB | Ubuntu 24.04

One bench per platform. Hardware specs are recorded with every run so that comparisons across future quarters can tell hardware changes apart from changes in the managers.

raw logs

Every measurement above is reproducible from the data in runs/2026-q2/ on GitHub. One JSON file per manager, raw process-set captures and per-trial timings inside.

disputed numbers?

Open a correction issue on the kit repo. Cite the row, the manager, the platform, your expected value, and your evidence. If the methodology itself is the problem, use the methodology question template instead — those route to a different review.

vendor responses

No vendor responses yet. Submit one as a PR to the kit repo and it gets linked here next to the row it addresses.

Want to re-run this on your own hardware? The reproduce page walks through it end-to-end.