← back to blog

Before Your First /discussion: The Repository Setup That Makes Everything Else Work

Parsa · 2026-04-16

Someone in the community asked me two questions last week. They called them "dumb questions." They were the smartest questions anyone has asked about the workflow.

The first: "I want to try your whole methodology - /discussion, /plan, /implement. To do that in Pane, I need to copy your .claude contents into each repo I'm working in, right? Pane doesn't port them automatically?"

The second: "A workspace is a product, a pane is a feature, tabs are activities within a pane. All of this has to live in a monorepo, even across products?"

I've written three posts about commands, voice-first development, software factories, and a Turing Award winner's validation of the approach. I never wrote about the foundation that makes all of it work. That's the part everyone skips. It's also the part that matters most.

- - - - - - - - - - - - - - - -

the repo is the system of record

OpenAI published a blog post in February called "Harness engineering: leveraging Codex in an agent-first world". Ryan Lopopolo's team shipped roughly a million lines of production code in five months with zero manually written source code. The entire post is about one thing: the infrastructure that made that possible.

Their first principle is one I want to start with because it changes how you think about everything else: the repository is the system of record. Agents can't access your Google Docs. They can't read your Slack messages. They can't look inside your head. If context isn't in the repo, it doesn't exist to the agent.

This sounds obvious. It isn't. Most teams have critical decisions scattered across Notion pages, Slack threads, meeting notes, and whiteboard photos. Humans context-switch between all of those effortlessly. Agents can't. Every piece of context your workflow depends on — architecture decisions, naming conventions, why you chose library X over library Y, the import pattern you settled on six months ago — has to live in the repo or the agent will make it up.

- - - - - - - - - - - - - - - -

progressive disclosure: the map and the manual

OpenAI's approach to agent documentation uses what they call "progressive disclosure." Instead of one massive instruction file, you give the agent a short map that points to deeper documentation. They use an AGENTS.md file — roughly 100 lines — as a table of contents injected into every agent's context.

We arrived at the same pattern independently. Two files at the root: AGENTS.md, the short map every agent reads at the start of every session, and CLAUDE.md, the deeper manual of conventions that agents pull from when a task requires it.

The split is load-bearing. If you dump everything into one file, agents either skip it (too long, attention degrades), waste context window on sections irrelevant to the current task, or read selectively and miss the one convention that matters. AGENTS.md is what every agent reads every time. CLAUDE.md is what they pull from when the task requires it.
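To make the split concrete, here is a sketch of what a ~100-line AGENTS.md map might look like. The section names and entries are illustrative, not Pane's actual file; the package names and commands are pulled from elsewhere in this post:

```markdown
# AGENTS.md (the map: short, read by every agent, every session)

## Layout
- frontend/  React + Vite renderer process
- main/      Electron main process (TypeScript)
- shared/    types shared across packages

## Commands
- `pnpm run -r typecheck` and `pnpm run -r lint` must pass before finishing any task

## Conventions
- Import via the `@/` path alias, never relative `../` paths
- Deeper rules, rationale, and history: see CLAUDE.md and docs/
```

The map stays short enough that reading it costs almost nothing; everything heavy lives behind the pointers.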

OpenAI takes this further with structured docs/ directories: design-docs/, exec-plans/active/ and exec-plans/completed/, product-specs/, references/. Plus top-level files like ARCHITECTURE.md, DESIGN.md, QUALITY_SCORE.md. We do the same with our docs/ directory and the plan lifecycle folders inside .claude/skills/plan/tmp/ — active plans in ready-plans/, completed plans in done-plans/. The plans are first-class versioned artifacts, not throwaway notes.

- - - - - - - - - - - - - - - -

the .claude directory

Every repo needs a .claude directory at its root. This is the brain your agents boot from. Pane doesn't port it automatically (honestly, it probably should — that's going on the roadmap as an opt-out onboarding step since ours are pretty refined at this point). But once you commit it to your repo root, every worktree inherits it. Every agent session in every pane sees it. You set it up once per repo and never think about it again.

The Pane repo's .claude directory is 53 files. 5 agent definitions (codebase-explorer, implementer, implementation-reviewer, plan-reviewer, researcher). 8 skills with their own SKILL.md files (discussion, plan, implement, commit, prepare-pr, investigate, simple-plan, research-web). 34 slash commands organized by category — cl/ for the core pipeline, linter/ for codebase-wide fixes, refactor/ at three granularity levels (simple, medium, deep), review/ with a master orchestrator that spawns 11 principle-specific review agents in parallel. Plus a plan template (plan_base.md) with built-in validation gates and lifecycle directories that track plans from creation through completion.
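As a rough picture, the layout looks something like this. Exact paths are a sketch except where the post names them (the plan lifecycle folders, plan_base.md); the placement of individual files is illustrative:

```text
.claude/
├── agents/                  # codebase-explorer, implementer,
│                            # implementation-reviewer, plan-reviewer, researcher
├── skills/                  # 8 skills, each with its own SKILL.md
│   └── plan/
│       ├── plan_base.md     # template with validation gates (location illustrative)
│       └── tmp/
│           ├── ready-plans/ # active plans
│           └── done-plans/  # completed plans
└── commands/
    ├── cl/                  # core pipeline
    ├── linter/              # codebase-wide fixes
    ├── refactor/            # simple / medium / deep
    └── review/              # orchestrator + 11 principle-specific review agents
```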

This is what I mean by harness engineering. You're not writing product code. You're building the operating system your agents run on.

- - - - - - - - - - - - - - - -

the monorepo

Every repository should be structured as a monorepo. Not "if it makes sense for your use case." Every one.

This isn't a preference. It's a constraint imposed by how agents work. An agent can only reason about code it can see. If your API lives in one repo and your frontend lives in another, the agent building a feature can't trace the data flow from database to UI. It can't verify that the types match. It can't check that the API endpoint it just created has a corresponding client. You become the integration layer, manually copying context between repos that should be one.

I mentioned in the second post that every founder I know who ships with AI agents has a monorepo now. The companies that didn't consolidate are struggling because their agents can't see across service boundaries. That's still true.

Our Doozy codebase is a 300,000-line Next.js monorepo managed with Nx: @doozy/webapp, @doozy/api, @doozy/shared. Two production apps, shared packages, infrastructure. One repo. Every agent sees the full system.
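A toy sketch of why that visibility matters. In the real repo these would be three packages; they're collapsed into one file here so the sketch is self-contained, and the handler and function names are hypothetical:

```typescript
// @doozy/shared: the single source of truth for the wire format.
interface User {
  id: string;
  email: string;
}

// @doozy/api: the endpoint produces the shared type.
function getUserHandler(): User {
  return { id: "u_1", email: "a@example.com" };
}

// @doozy/webapp: the client consumes the same type, so the compiler
// (and any agent running typecheck) can verify both sides agree.
function renderUser(user: User): string {
  return `${user.email} (${user.id})`;
}

console.log(renderUser(getUserHandler()));
```

Change the `User` type in one place and every mismatched producer or consumer fails typecheck immediately. Split those three pieces across repos and that guarantee disappears.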

Pane itself is the same pattern with pnpm workspaces:

# pnpm-workspace.yaml
packages:
  - 'frontend'    # React + Vite renderer process
  - 'main'        # Electron main process (TypeScript)
  - 'shared'      # shared types across packages

Different monorepo tool, same principle. The tool doesn't matter. The structure does.

- - - - - - - - - - - - - - - -

mechanical invariants

This is where most people's agent setups fall apart. They have agents writing code but no automated way for agents to check their own work. OpenAI calls these "mechanical invariants" — automated checks that enforce architectural boundaries without human oversight. Custom linters and structural tests that codify your taste into executable rules.

Their key insight: write your lints so that the error messages themselves inject remediation instructions into the agent's context. The agent doesn't just see "error on line 47." It sees "error: relative import detected. Use @/ path alias instead. See ARCHITECTURE.md section 3 for import conventions." The lint error becomes a teaching moment. The agent self-corrects without you.
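A minimal sketch of what such a rule can look like, stripped down from the real ESLint rule API. The rule name, types, and message text here are illustrative, not Pane's actual rule:

```typescript
// Minimal shape of a custom lint rule whose error message carries its own
// remediation instructions. Simplified AST/context types for the sketch:
type ImportNode = { source: { value: string } };
type Context = { report: (diag: { node: ImportNode; message: string }) => void };

export const noRelativeImports = {
  meta: { type: "problem" as const },
  create(context: Context) {
    return {
      // Fires on every `import ... from "..."` declaration.
      ImportDeclaration(node: ImportNode) {
        if (node.source.value.startsWith("..")) {
          context.report({
            node,
            // The message itself teaches the agent how to fix the problem.
            message:
              `Relative import "${node.source.value}" detected. ` +
              `Use the @/ path alias instead. ` +
              `See ARCHITECTURE.md section 3 for import conventions.`,
          });
        }
      },
    };
  },
};
```

When the agent reads this message in its context window, the fix is already spelled out; no human has to translate the error.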

Every package in your monorepo needs, at minimum, its own typecheck and lint scripts, plus a formatter, each runnable on its own so an agent can self-check after every change.

Root scripts delegate to packages:

{
  "scripts": {
    "lint": "pnpm run -r lint",
    "typecheck": "pnpm run -r typecheck"
  }
}

Or with Nx: npx nx build @doozy/webapp, npx nx lint @doozy/webapp --max-warnings=0.
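Whichever tool you pick, each package carries its own scripts so the checks can run in isolation. A per-package sketch, with tool choices illustrative:

```json
{
  "scripts": {
    "typecheck": "tsc --noEmit",
    "lint": "eslint . --max-warnings=0",
    "format": "prettier --write ."
  }
}
```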

But the real value is in the specific rules: no-explicit-any set to error, the React hooks rules that ban conditional useEffect calls, restricted relative imports that force the @/ path alias, and zero tolerated warnings.
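A sketch of how those rules look in a flat ESLint config. Plugin imports and registration are elided; the rule names are real typescript-eslint and eslint-plugin-react-hooks rules, but the selection and severities here are illustrative:

```javascript
// eslint.config.js (fragment; plugin registration elided)
export default [
  {
    rules: {
      // Learned after an agent shipped code full of `any` types.
      "@typescript-eslint/no-explicit-any": "error",
      // Learned after an agent put a useEffect inside an if-statement.
      "react-hooks/rules-of-hooks": "error",
      // Force @/ path aliases instead of relative imports.
      "no-restricted-imports": ["error", { "patterns": ["../*"] }],
    },
  },
];
```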

These aren't rules you set once and forget. They're rules you arrive at because an agent shipped code with 14 any types and you spent an hour manually fixing them. Or because an agent wrote a useEffect inside an if-statement and the app crashed in production. You encode the lesson into a lint rule, and the agent never makes that mistake again. The lint rule outlives the conversation. It becomes institutional memory.

Our implementer agent runs typecheck → lint → format after every major section of work. The implementation reviewer runs it again after all work is done. The prepare-pr skill runs it before creating the pull request. Three separate checkpoints, same commands. If the agent introduces a type error in step 3, it catches and fixes it before step 4. The error never compounds.
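The checkpoint loop can be sketched as a small runner. The script names match the root scripts shown earlier; the injectable `Runner` type is an assumption made here for testability, not how the agents are actually wired:

```typescript
import { spawnSync } from "node:child_process";

type Runner = (script: string) => number; // exit code of the script

// Default runner shells out through pnpm; injectable for testing.
const pnpmRun: Runner = (script) =>
  spawnSync("pnpm", ["run", script], { stdio: "inherit" }).status ?? 1;

// Returns the first failing script name, or null if all three pass.
// The agent fixes the reported failure before moving to the next step,
// so errors never compound across steps.
function checkpoint(run: Runner = pnpmRun): string | null {
  for (const script of ["typecheck", "lint", "format"]) {
    if (run(script) !== 0) return script;
  }
  return null;
}
```

Running the same three commands at three checkpoints is cheap; letting a type error survive into the review stage is not.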

- - - - - - - - - - - - - - - -

the little things that compound

Beyond TypeScript and ESLint, there are dozens of small infrastructure decisions: pinned node versions, path aliases, a formatter wired into every package. Each feels trivial in isolation but compounds dramatically when agents are writing thousands of lines a day.

Every one of these rules exists because agents broke things when we didn't have them. You learn the rules the hard way. Then you encode them in AGENTS.md, CLAUDE.md, and ESLint configs so the agents never make the same mistake twice.

- - - - - - - - - - - - - - - -

the mental model

Now the second question. This is the Pane-specific piece.

The workspace/pane/tab hierarchy maps directly onto the monorepo/worktree/tool hierarchy. The product is the repo. The feature is the branch (via worktree). The tools are the tabs. There's no abstraction gap. What you see in Pane is exactly what's happening in git.

This connects to something OpenAI emphasizes: isolation via worktrees. Each agent task in their system runs in a fully isolated git worktree with its own sandbox to enable parallel work without conflicts. Pane makes this invisible. You don't manage worktrees. You open a pane. The worktree is created, configured, and connected to the agent automatically. Three panes, three worktrees, three agents, all running simultaneously on the same repo. Ctrl+Up/Down to cycle between them.

Worktrees over branches matters because branches require context-switching. Checkout branch A, run the agent, stop, checkout branch B, run another agent. Sequential. Worktrees are parallel by design. And because each worktree inherits the repo root's .claude directory, AGENTS.md, CLAUDE.md, and all your lint configs, every new pane boots with the full harness. No per-session setup. No copying configs. The infrastructure is just there.
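What a new pane does under the hood looks roughly like this; the scratch repo, file contents, and branch names are illustrative:

```shell
# Set up a scratch repo so the sketch is self-contained.
cd "$(mktemp -d)" && git init --quiet repo && cd repo
git config user.email "agent@example.com" && git config user.name "Agent"
echo "# map" > AGENTS.md && git add AGENTS.md && git commit --quiet -m "add AGENTS.md"

# One worktree per feature, one pane per worktree, all parallel:
git worktree add -b feature-a ../feature-a
git worktree add -b feature-b ../feature-b

# Each worktree is a full checkout of the repo root, so AGENTS.md,
# CLAUDE.md, .claude/ and lint configs are inherited automatically:
ls ../feature-a/AGENTS.md
git worktree list
```

Deleting a worktree when the pane closes (`git worktree remove`) leaves the other panes untouched, which is what makes the parallelism safe.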

- - - - - - - - - - - - - - - -

the compounding effect

None of this is sexy. Setting up a monorepo, configuring lint rules, writing AGENTS.md, organizing a .claude directory, pinning node versions, enforcing no-explicit-any — this is not the content that goes viral. "Two founders, 300k lines, zero engineers" goes viral. "I configured ESLint with zero warnings allowed" does not.

But here's the thing. Every post I've written about the workflow, every command that collapsed, every voice note that turned into a PR — all of it sits on top of this foundation. The /discussion skill works because the codebase-explorer agent can navigate a well-structured monorepo. The /plan skill works because there's an AGENTS.md telling the agent where things live and a plan template with validation commands that actually run. The /implement skill works because typecheck and lint are configured, so the agent can self-correct in a loop that catches errors before they compound.

Remove the foundation and the commands are just markdown files that produce broken code.

OpenAI's blog says it clearly: "The primary job of our engineering team became enabling the agents to do useful work. For every workflow or domain, the question was: what capability is missing, and how do we make it both legible and enforceable?" Legible and enforceable. Not aspirational. Not documented in a wiki no one reads. Legible in the codebase. Enforceable by a lint rule or a typecheck. That's the bar.

Tyler and I did this intentionally from day one with Doozy. Monorepo structure, Nx configuration, shared packages, strict TypeScript, per-package lint rules, path aliases, CLAUDE.md conventions, structured plan lifecycle. It felt slow at the time. Every hour spent on setup was an hour not spent on features. But the compounding is real.

Every agent session runs faster because the foundation is solid. Every new feature slots into a structure the agents already understand. Every mistake gets encoded into a rule that prevents it from happening again. The cleaner the codebase, the better the agent output, which keeps the codebase clean, which improves the next output. OpenAI runs background Codex tasks on a cadence to scan for deviations, update quality grades, and open refactoring PRs to pay down technical debt continuously. We do the same with our linter commands — spawning parallel agents to fix type errors, clean unused imports, optimize React hooks, all in one pass.

Friction removal compounds. That's the whole insight. Any friction you don't remove is friction every agent hits every session, forever. A missing path alias is one confused import per session times hundreds of sessions. A missing lint rule is one pattern violation per session that silently degrades the codebase until agents can't reason about it anymore.

This is why I called it a software factory. Factories don't run on commands. They run on infrastructure.

- - - - - - - - - - - - - - - -

the setup checklist

If you're starting from scratch, here's the order:

1. Consolidate into a monorepo so every agent can see the full system.
2. Give every package its own typecheck and lint scripts, with zero warnings allowed, and root scripts that delegate to them.
3. Write AGENTS.md (the short map) and CLAUDE.md (the deeper manual) at the root.
4. Commit a .claude directory with your agents, skills, and commands so every worktree inherits it.
5. Every time an agent breaks something, encode the lesson as a lint rule or a line in AGENTS.md so it never happens twice.

- - - - - - - - - - - - - - - -

the questions aren't dumb

The community member's questions were actually the most practical questions you can ask about this workflow. "How do I set up the infrastructure?" and "What's the mental model?" These are the questions that, if answered well, make everything else in the previous posts actually reproducible.

I should have written this post first.

- - - - - - - - - - - - - - - -

All of our Claude Code commands, skills, agent definitions, AGENTS.md, and CLAUDE.md are open source: github.com/Dcouple-Inc/Pane/.claude

OpenAI's harness engineering blog: openai.com/index/harness-engineering

Previous posts: