Claude Code's source code leaked this morning. A .map file left in their npm package. Someone pulled the thread and the entire codebase unraveled onto GitHub. By midday, thousands of developers had combed through every file, every comment, every function name.
Here's what they found: the company that builds the most popular AI coding agent in the world ships code that violates almost every principle that makes codebases agent-readable.
I just wrote an entire post about repository setup - monorepo structure, mechanical invariants, zero eslint warnings, no-explicit-any: 'error', progressive disclosure, one way to do everything. The practices that make agents capable of reasoning about your code. Anthropic's own codebase is a case study in what happens when you skip all of them.
This isn't a dunking exercise. It's a mirror. If the company building the agent can't keep its own house clean, it tells you something important about what actually matters.
460
Four hundred and sixty eslint-disable comments.
In the last post, I wrote: "Zero warnings policy. Not 'try to keep warnings low.' Zero." Our ESLint config uses --max-warnings=0. The agent can't introduce a warning. The build fails. The loop catches it. The warning never compounds.
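For concreteness, here is a minimal sketch of that gate. This is illustrative, not our exact config or Anthropic's; the file layout and rule set are assumptions, but the two load-bearing pieces are real: the `no-explicit-any` rule and the `--max-warnings=0` flag.

```typescript
// eslint.config.ts — illustrative flat-config sketch, not any project's real config
import tseslint from "typescript-eslint";

export default tseslint.config(
  ...tseslint.configs.recommended,
  {
    rules: {
      // The agent cannot introduce an `any`; the build fails instead.
      "@typescript-eslint/no-explicit-any": "error",
    },
  },
);

// package.json, so CI and the agent loop run the exact same gate:
//   "lint": "eslint . --max-warnings=0"
```

The flag is the whole point: warnings are promoted to build failures, so "try to keep warnings low" stops being a social norm and becomes a mechanical one.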
Anthropic went the other direction. 460 times, an engineer (or an agent) hit a lint error and instead of fixing it, added // eslint-disable and moved on. Once. Fine. Everybody's done it. Ten times. You're shipping fast. A hundred times. You've lost the plot. 460 times. You don't have a linter anymore. You have a comment generator.
Here's why this matters beyond code aesthetics. Every eslint-disable is a hole in the mechanical invariant. The rule exists because someone decided this pattern causes bugs. The disable says "I know, but not right now." The agent doesn't understand "not right now." The agent sees a file full of eslint-disables and learns that disabling rules is an acceptable pattern. So it starts doing it too. And because the linter is already disabled in 460 places, nobody notices 461.
Our implementer agent runs typecheck → lint → format after every major section of work. The implementation reviewer runs it again. The prepare-pr skill runs it before creating the pull request. Three checkpoints. Same commands. Zero warnings allowed at each one. That's not perfectionism. It's infrastructure. The lint rule outlives the conversation. It becomes institutional memory.
Anthropic's institutional memory is 460 exceptions to its own rules.
4,683 lines in a single file
main.tsx. 803,924 bytes. 4,683 lines. Six files over 4,000 lines each. Their print utility alone is 5,594 lines.
I wrote about this in post 6: "If two hooks exist for audio recording, the LLM will create a third for the next feature. If one unified hook exists in multiple places, the LLM reuses it." The "single way to do things" review agent is the one I flagged as the most critical in our entire review system.
A 4,683-line file is the architectural opposite of that principle. It's not one thing. It's dozens of things crammed into a namespace because nobody split it. An agent reading that file has to hold nearly 5,000 lines of context to understand any single function. Its attention degrades. It hallucinates relationships between code that has no business being in the same file. It adds new code to the bottom because that's where the cursor lands, and the file grows to 5,000, then 6,000.
Our architecture review agent specifically checks that page files are JSX composition only with all logic in orchestration hooks. Thin pages. Clean separation. Not because we're precious about file length, but because agents reason about code at the file level. A 200-line file with one responsibility is a unit an agent can hold in its head. A 4,683-line file is a maze.
There's a comment in the r/ClaudeAI thread that gets it exactly right: "Code quality is mostly important so that agents can reason about the project accurately and efficiently. Directives like staying DRY, maintaining consistency, refactoring large files, using strong types actually reduce token usage and bugs."
This person understood what Anthropic apparently didn't: code quality in the AI age isn't about human readability. It's about agent legibility. A 4,683-line file is expensive. Not in disk space. In tokens. In attention. In every agent session that has to parse it and loses signal in the noise.
_DEPRECATED is just a vibe
They have a function called writeFileSyncAndFlush_DEPRECATED(). It handles saving your auth credentials to disk. It's called in production. There are 50+ functions with _DEPRECATED in the name that are still actively called.
A senior dev in the Reddit thread explained this charitably: "The reason the functions are named deprecated is probably to communicate that they should not be used in new code while they are slowly migrating the old functions to the new way of doing things."
That's fair. Deprecation as documentation. Reasonable in a human-maintained codebase where engineers read the function name and understand the social contract.
Agents don't understand social contracts.
An agent sees writeFileSyncAndFlush_DEPRECATED() and has two options. Use it, because it exists and works. Or don't use it, because the name says deprecated. If it uses the deprecated function, you've just propagated a pattern you're trying to kill. If it avoids the function but can't find the replacement (because the replacement isn't documented, or doesn't exist yet, or is in a file the agent hasn't loaded), it writes a third version. Now you have the deprecated function, the replacement, and the agent's improvisation. Three ways to do the same thing.
This is exactly the proliferation problem. One way to do everything. Not two. Not three. Not "one, plus one that says deprecated, plus whatever the agent invents when it's confused." One.
Every _DEPRECATED function that still exists in a codebase is a TODO that metastasized. Someone intended to replace it. They marked it. Life happened. The deprecated function stayed. The replacement either exists in parallel (two ways to do one thing) or was never finished (the deprecated way is the only way, despite the name saying otherwise). Both outcomes are agent-hostile.
When we find ourselves in this situation, the fix goes through /discussion first. What's the correct pattern? What should replace the deprecated version? Then /plan. Scoped migration plan. Then /implement. The old function disappears. The new function is the only function. The agent can't be confused because there's nothing to be confused about. Not "deprecate." Delete. The function either exists or it doesn't. No social contracts. No vibes. No names that mean one thing to humans and nothing to machines.
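As an illustration of the end state (this is a hypothetical replacement, not Anthropic's actual code; the function name and signature are assumptions): after the migration lands, there is exactly one write path, and the `_DEPRECATED` twin no longer exists to confuse anyone.

```typescript
import { openSync, writeSync, fsyncSync, closeSync } from "node:fs";

// Hypothetical: the single surviving write path after the migration.
// No writeFileSyncAndFlush_DEPRECATED() twin left for an agent to pick.
export function writeFileAndFlush(path: string, data: string): void {
  const fd = openSync(path, "w");
  try {
    writeSync(fd, data);
    fsyncSync(fd); // flush to disk so a crash can't drop the credentials
  } finally {
    closeSync(fd);
  }
}
```

One exported function. An agent searching for "how do we write files here" gets exactly one answer.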
50+ deprecated functions still in production is 50+ opportunities for an agent to make the wrong call. Every single one.
9 empty catch blocks
config.ts. The file that manages your authentication. Nine empty catch blocks.
try {
  // do something important
} catch (e) {
  //
}

Nine times. In the authentication handler. They literally had a bug (GitHub issue #3117) where config saves wiped your auth state. They had to add a guard called wouldLoseAuthState() after the fact.
In the last post, I described mechanical invariants as "automated checks that enforce architectural boundaries without human oversight." The invariant isn't the catch block. The invariant is the lint rule that prevents empty catch blocks from existing. no-empty: 'error' in ESLint. Or the TypeScript-specific @typescript-eslint/no-empty-function. One line in your config. Every empty catch block becomes a build failure. The agent can't ship it. The human can't ship it. The bug that wipes your auth state never gets the chance to exist.
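The alternative to an empty catch isn't ceremony. It's a catch that names the one failure it expects and rethrows everything else. A hedged sketch (`readConfig` and its shape are hypothetical, not Claude Code's actual code):

```typescript
import { readFileSync } from "node:fs";

// Hypothetical config reader: a missing file is anticipated and handled;
// every other failure surfaces instead of silently wiping state.
export function readConfig(path: string): Record<string, unknown> | null {
  try {
    return JSON.parse(readFileSync(path, "utf8"));
  } catch (e) {
    const code = (e as { code?: string }).code;
    if (code === "ENOENT") {
      return null; // no config yet: a real, expected case
    }
    throw e; // corrupt JSON, permission errors: never swallow these
  }
}
```

The difference between this and `catch (e) {}` is the difference between "the file isn't there yet" and "we have no idea what just failed, and neither will anyone downstream."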
Anthropic didn't have the invariant. So they got the bug. Then they patched it with a guard function. And then they still didn't add the invariant, because nine empty catch blocks remain.
This is the whole thesis of post 6 in one example. Write your lints so that the error messages inject remediation instructions into the agent's context. The agent doesn't just see "error on line 47." It sees why it's wrong and how to fix it. The lint error becomes a teaching moment. Without the invariant, the same mistake happens in every session, forever.
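One cheap way to get remediation into the error message is stock ESLint's `no-restricted-syntax`, which lets you attach your own message to a banned pattern. A sketch (the message text is ours, not any real project's; the selector matches catch blocks with an empty body, including comment-only ones):

```typescript
// eslint.config.ts fragment — ban empty catch blocks with a message that
// tells the agent what to do, not just where it failed.
export default [
  {
    rules: {
      "no-restricted-syntax": [
        "error",
        {
          selector: "CatchClause > BlockStatement[body.length=0]",
          message:
            "Empty catch blocks swallow errors. Handle the specific error code you expect (e.g. ENOENT) and rethrow everything else.",
        },
      ],
    },
  },
];
```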
TODO: figure out why
My favorite finding from the leak. This comment is in their error handler:
// TODO: figure out why

The function that handles your errors doesn't understand its own errors.
There were more.
// Not sure how this became a string
// TODO: Fix upstream

The upstream is their own code. A value that should be a number is a string. Nobody knows where the type changed. Nobody traced it. They cast it and moved on and left a note for a future version of themselves that never showed up.
// This fails an e2e test if the ?. is not present.
// This is likely a bug in the e2e test.

Read that twice. An engineer added optional chaining to fix a test failure, decided the test was probably wrong, and shipped the workaround anyway. The optional chaining is still there. The test is still there. Nobody confirmed whether the test is actually wrong. Both the fix and the suspected bug coexist in production, indefinitely.
And my personal favorite, from an engineer named Ollie:
TODO (ollie): The memoization here increases complexity by a lot,
and im not sure it really improves performance

Ollie shipped code they openly admit might be pointless. The memoization is still running. Thousands of developers are paying for cycles on an optimization that its own author doesn't believe in.
These aren't bugs. They're something worse. They're the residue of problems that were never understood. Each comment is a moment where someone hit a wall, couldn't break through, patched around it, and left a breadcrumb hoping someone would revisit. Nobody revisits. The breadcrumbs accumulate. The codebase grows around them like a tree growing around a fence.
I've written about the death loop in three different posts now. Every one of these comments is a death loop that ended early. Someone stopped looping - either because they ran out of time, patience, or tokens - and shipped the patch instead of the fix. The death loop didn't resolve. It just got a comment and a merge.
// Not sure how this became a string is a symptom. The root cause is a type coercion happening somewhere upstream. In their own code. Finding it would require tracing the value through multiple files, understanding the transformation pipeline, and fixing the actual source of the type mismatch. That's a /discussion. That's probing. "If the agent gives you a weirdly specific answer that doesn't feel right, it's probably fixing a symptom. Push it to dig deeper." Nobody pushed deeper. The string cast is the weirdly specific answer, and it became permanent.
Pat Hanrahan said it in three words: spec, read, verify. These comments exist because nobody completed the spec phase. Nobody fully understood the problem before writing the fix. The fix is a patch on a mystery. That's what // TODO: figure out why actually means. "I fixed the what. I never understood the why."
the discussion that never happened
Our /discussion command exists specifically to prevent this. Before any code gets written - before a plan exists, before an agent touches a file - you have a conversation. Unstructured, open-ended, digging. The agent spawns a codebase-explorer and a researcher. You go back and forth until you have genuine clarity.
The output of /discussion is written to .context/context.md. Decisions. Rationale. Tradeoffs considered. If you can't articulate the why, you're not done discussing. You keep going. It's "the rabbit hole loop" from the first post. You loop until the rabbit hole bottoms out.
// Not sure how this became a string is what happens when you skip the rabbit hole. You see the string. You need a number. You cast it. You move on. The rabbit hole - why is this a string in the first place? what upstream transformation changed it? is this a design flaw or a bug? - stays unexplored. The TODO is the log entry for an expedition that never launched.
I do almost all of my discussion input through voice now. Talking instead of typing. When you type, you naturally try to structure your thoughts. When you talk, you ramble, you go on tangents, you think of edge cases mid-sentence. The agents handle the rambling fine, and the tangents often contain exactly the context they need that you would have forgotten to type.
I can imagine the Anthropic engineer staring at that string value, thinking "this doesn't make sense," and wanting to dig deeper. But they were two days into a sprint. They had three other PRs waiting for review. The ship date was Friday. So they cast the string, wrote the TODO, and context-switched.
Voice eliminates the activation energy. You don't have to sit down and type a structured investigation. You just talk. "Hey, there's a value in config.ts that's a string when it should be a number. I need to trace where the type changes. Let's look at every place this value gets passed or transformed." Thirty seconds of voice. The agent does the tracing. The rabbit hole gets explored. The TODO never gets written because the question gets answered.
Every TODO in a codebase is a debt instrument. It's a loan against future understanding. "I don't know why this works, but I'll figure it out later." The interest rate on that loan is every downstream decision that gets made without the understanding the TODO promises. Voice-first development doesn't just make the workflow faster. It makes the rabbit holes cheap enough to actually explore. The thirty seconds of talking that replaces the TODO is thirty seconds that saves hours of compounding confusion downstream.
Anthropic's codebase has hundreds of these loans outstanding. The interest is compounding. The understanding never arrives. And the agents writing code against this codebase inherit all of that confusion. The context window fills with workarounds. The agent produces more workarounds. The TODOs multiply.
the cobbler's children
There's an old saying. The cobbler's children have no shoes. The plumber's house has leaky pipes. The painter's walls are bare.
Anthropic builds the most popular AI coding agent in the world. Claude Code is the tool people use to write code with AI assistance. And their own codebase - the codebase of the agent itself - is hostile to the thing they're telling everyone to do.
- 460 eslint-disables in a codebase that tells your agents to follow lint rules
- 4,683-line files in a world where agent attention degrades past a few hundred lines
- 50+ deprecated functions that agents can't distinguish from active ones
- 9 empty catch blocks in the authentication handler
- TODO comments that admit they don't understand their own errors
This isn't unusual. The Reddit threads are full of experienced developers saying "this is normal for a large codebase." They're right. It is normal. That's the point. Normal is what gets you 460 eslint-disables. Normal is what gets you // TODO: figure out why in production for months.
Normal was fine when humans maintained code. Humans can read a function called _DEPRECATED and understand the social context. Humans can skim a 4,683-line file and hold the relevant parts in working memory. Humans can look at an empty catch block and think "someone was in a hurry, I should probably handle this."
Agents can't.
Every practice I wrote about in the last post - monorepo structure, AGENTS.md, strict TypeScript, zero warnings, no-explicit-any, one way to do everything, path aliases, thin pages, node version pinning - exists because agents broke things when we didn't have them. We learned the rules the hard way and encoded them into the codebase so agents never make the same mistake twice.
Anthropic hasn't done that for their own codebase. The irony is the lesson: the company building the coding agent is the strongest evidence that the harness around the agent matters more than the agent itself.
The cobbler's children have no shoes. And until you build the harness, neither do yours.
All of our Claude Code commands, skills, agent definitions, AGENTS.md, and CLAUDE.md are open source: github.com/Dcouple-Inc/Pane/.claude
Previous posts:
- Two Founders, 300k Lines, Zero Engineers: Our AI-Native Development Workflow
- Building a Software Factory: 3 Commands, Custom Agents, and the Harness That Runs It
- A Turing Award Winner Just Described Our Exact Workflow
- Same Chef, Six Hats: What a Viral Agent Post Gets Right and Wrong
- Before Your First /discussion: The Repository Setup That Makes Everything Else Work