THINK · Jun 14, 2026

Your agent is 1.6% model. The rest is the harness.

UCL researchers reverse-engineered Claude Code. Only 1.6% is AI logic. The rest is permission gates, context compaction, tool routing, and recovery. Here's what a production harness looks like.


TL;DR: UCL and MBZUAI researchers reverse-engineered Claude Code from its leaked TypeScript source (v2.1.88). They found 512,000 lines of code across 1,902 files. Only 1.6% is AI decision logic. The rest is the operational harness: 7 permission layers, a 5-stage context compaction pipeline, 4 extension mechanisms at different cost tiers, a custom bash parser with 4,437 lines across 23 files, a subagent system with worktree isolation, and an append-only session store. The model reasons inside a deterministic harness. The harness does everything else.

Key takeaways:

  • 1.6% of Claude Code is AI logic. 98.4% is operational infrastructure. The model is the smallest part of the system.
  • A 7-layer deny-first permission pipeline with ML classifier. Entering auto mode strips dangerous allow-rules from your allowlist.
  • Context management uses a graduated 5-stage pipeline. Each layer runs only when cheaper ones fail.
  • Four extension mechanisms ordered by context cost: hooks (zero), skills (low), plugins (medium), MCP (highest).
  • Subagents return only summary text. Full transcripts in sidechain files. Token cost is 7x a standard session by design.
  • The quality of the harness, not the model, is the differentiator as frontier models converge.

Day 1 of reverse-engineering Claude Code. I expected to find AI logic. I found permission gates instead.

Day 3. Still finding permission gates. Somewhere under them is a model. Probably.

Day 14. Found it. 1.6% of the codebase. The other 98.4% is a harness designed to let that 1.6% work without breaking everything.

This isn’t my discovery. It’s from Dive into Claude Code, a paper by researchers at University College London and MBZUAI. They reverse-engineered Claude Code v2.1.88 after Anthropic accidentally shipped source maps in an npm package on March 31, 2026. They analyzed 512,000 lines of TypeScript across 1,902 files. They traced five human values through thirteen design principles into concrete implementation choices.

The result is the most detailed picture of a production-grade agent harness ever published. This is what it looks like.

The 1.6% that everyone talks about

The core loop is a simple while-true:

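stopped = False
history = []
while not stopped:
    # Rebuild the prompt from scratch on every turn.
    context = assemble(system_prompt, tool_schemas, history, hook_additions)
    action = model(context, tools)
    if action.is_text_only():
        # No tool call: the stop hooks decide whether the loop ends.
        stopped = run_stop_hooks(action)
        continue
    if not permitted(action):
        # Denied: skip execution and let the model propose something else.
        continue
    action = run_pre_tool_hooks(action)
    result = execute(action)
    result = run_post_tool_hooks(result)
    history.append((action, result))  # one (action, result) pair per turn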

Call the model, run tools, repeat. The ReAct pattern, barely 50 lines of pseudocode. This is the AI decision logic. The model decides which tool to call, formats the arguments, and declares when it’s done. That’s it.

Everything else in those 512,000 lines exists to make that loop reliable.

The paper identifies five human values that drive the architecture: human decision authority, safety and security, reliable execution, capability amplification, and contextual adaptability. These trace through thirteen design principles into concrete implementation choices. The principles include deny-first with human escalation, graduated trust spectrum, defense in depth, context as a scarce resource, append-only durable state, minimal scaffolding with maximal harness, and reversibility-weighted risk assessment.

The principle that does the heavy lifting: minimal scaffolding, maximal harness. Instead of building explicit planning layers or state graphs, Claude Code invests in operational infrastructure. The model reasons however it wants. The harness ensures those decisions stay safe, recoverable, and within budget.

The 98.4% that matters

1. The permission system

Users approve 93% of permission prompts. This number comes from the paper. Habituation has made interactive confirmation almost useless. People click yes without reading.

Claude Code’s answer is deny-first architecture with seven independent layers:

Layer 1: Tool pre-filtering. Forbidden tools never reach the model’s view. The model never wastes a turn proposing an operation it can’t execute.

Layer 2: Rule hierarchy. Deny rules always beat ask rules. Ask rules beat allow rules. The system evaluates explicit rules before the model sees any tool.
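A minimal sketch of that precedence, assuming rules are glob patterns over strings such as Bash(git status). The Rule type, the pattern syntax, and the "ask" default for unmatched calls are illustrative assumptions, not Claude Code’s actual rule format.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Precedence from the text: deny beats ask, ask beats allow.
PRECEDENCE = ("deny", "ask", "allow")


@dataclass(frozen=True)
class Rule:
    effect: str   # "deny", "ask", or "allow"
    pattern: str  # hypothetical glob over "Tool(argument)" strings


def resolve(rules: list[Rule], call: str, default: str = "ask") -> str:
    """Return the strongest effect among all rules that match `call`."""
    matched = {r.effect for r in rules if fnmatchcase(call, r.pattern)}
    for effect in PRECEDENCE:
        if effect in matched:
            return effect
    return default  # assumption: calls with no matching rule fall back to asking


rules = [
    Rule("allow", "Bash(git *)"),
    Rule("ask", "Bash(git push*)"),
    Rule("deny", "Bash(git push --force*)"),
]
```

With these rules, a force push matches all three patterns and still resolves to deny, because rule order never matters, only the effect.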

Layer 3: Permission modes. Seven modes control the entire permission posture. plan restricts everything to read-only. default requires approval for destructive operations. acceptEdits auto-approves file edits. auto delegates approval to an ML classifier. dontAsk bypasses the user for known-safe operations. bypassPermissions is the YOLO mode. bubble isolates subagents.

Layer 4: The ML classifier. A two-stage pipeline. Stage one is a fast 64-token binary decision: allow or escalate. Temperature zero. Stop sequences force the output into <block>yes</block> or <block>no</block>. A <block>no</block> runs the tool immediately. A <block>yes</block> triggers stage two: 4,096 tokens of full chain-of-thought reasoning, also at temperature zero. Results cache for 1 hour with a 60-80% hit rate.
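The control flow of a two-stage check with a one-hour cache can be sketched as follows. The fast and slow callables stand in for the two model calls and are hypothetical, as is keying the cache on the raw action string. The sketch models stage one as a plain allow-or-escalate boolean and makes no claim about the real prompt or output format.

```python
import time
from typing import Callable

CACHE_TTL_SECONDS = 3600  # the text reports a 1-hour cache


class TwoStageClassifier:
    def __init__(
        self,
        fast: Callable[[str], bool],  # hypothetical stage one: True means "allow now"
        slow: Callable[[str], bool],  # hypothetical stage two: True means "allow"
        clock: Callable[[], float] = time.monotonic,
    ):
        self.fast = fast
        self.slow = slow
        self.clock = clock
        self._cache: dict[str, tuple[float, bool]] = {}

    def allows(self, action: str) -> bool:
        now = self.clock()
        hit = self._cache.get(action)
        if hit is not None and now - hit[0] < CACHE_TTL_SECONDS:
            return hit[1]  # fresh cached verdict: no model call at all
        # Stage one is cheap; only an "escalate" pays for stage two.
        verdict = True if self.fast(action) else self.slow(action)
        self._cache[action] = (now, verdict)
        return verdict
```

Injecting the clock keeps the expiry logic testable without waiting an hour.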

Layer 5: Shell sandbox. Bottom-layer isolation for dangerous operations. Docker or Bubblewrap depending on the platform.

Layer 6: The bash parser. A hand-rolled recursive descent parser across 23 files. It runs full AST analysis on every bash command before execution. It detects 15 categories of dangerous AST nodes including command substitution, process substitution, subshells, loops, conditionals, and function definitions. It blocks 35+ dangerous shell builtins: eval, exec, source, trap, and all 18 of zsh’s dangerous builtins (zmodload, ztcp, etc.).

The pre-parse checks go deep. Control characters. Unicode whitespace attacks. Zsh dynamic tilde expansion and equals expansion. Brace expansion with embedded quotes. Array subscript arithmetic RCE attacks where bash evaluates arbitrary code inside array subscripts. The parser has a 50ms timeout and a 50K AST node budget.

The philosophy: fail-closed, allowlist-based. An unrecognized AST node type flags as too complex and requires explicit user approval.

Layer 7: Circuit breaker. Three consecutive denials or 20 total in a session triggers fallback to manual prompts (interactive) or AbortError (headless). The Karan Prasad analysis provides a walkthrough of all 7 layers with 16 architectural diagrams.
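A small sketch of the circuit breaker, using the two thresholds quoted above. The class name, the latching behavior (once tripped, it stays tripped for the session), and the string labels for the fallbacks are assumptions made for illustration.

```python
class DenialCircuitBreaker:
    """Trips after 3 consecutive denials or 20 total, per the text."""

    def __init__(self, max_consecutive: int = 3, max_total: int = 20):
        self.max_consecutive = max_consecutive
        self.max_total = max_total
        self.consecutive = 0
        self.total = 0
        self.tripped = False

    def record(self, allowed: bool) -> bool:
        """Record one permission decision; return True once tripped."""
        if allowed:
            self.consecutive = 0  # an approval resets only the streak
        else:
            self.consecutive += 1
            self.total += 1
        if self.consecutive >= self.max_consecutive or self.total >= self.max_total:
            self.tripped = True  # assumption: the breaker latches
        return self.tripped

    def fallback(self, headless: bool) -> str:
        """Name the fallback the text describes for each run mode."""
        return "abort" if headless else "manual_prompt"
```

The session-wide total matters because a model that alternates denied and approved calls never builds a streak of three.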

The detail that matters: entering auto mode strips dangerous allow-rules from your allowlist. Python, node, bash, ssh, sudo, eval, exec, and every process spawner disappear. The classifier doesn’t trust your past approvals. This is reversibility-weighted risk assessment: light oversight for reads, heavy gates for writes.
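As an illustration of that stripping step, assuming allow-rules are plain command-prefix strings (a hypothetical format). The set below lists only the commands named above; the real list is described as covering every process spawner.

```python
# Only the commands named in the text; the real list is longer.
DANGEROUS_COMMANDS = {"python", "node", "bash", "ssh", "sudo", "eval", "exec"}


def strip_for_auto_mode(allow_rules: list[str]) -> list[str]:
    """Drop allow-rules whose leading command can spawn arbitrary processes."""
    kept = []
    for rule in allow_rules:
        parts = rule.split()
        command = parts[0] if parts else ""
        if command not in DANGEROUS_COMMANDS:
            kept.append(rule)
    return kept
```

The function returns a new list instead of mutating the input, so the original allowlist would still be available if the user leaves auto mode.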

2. The context compaction pipeline

Long-context models don’t eliminate the context problem. They make bad management more expensive because you can fit more noise before the model breaks.

Claude Code runs five distinct compactors in sequence before every model call:

Stage 1: Budget reduction. Trim oversized tool outputs. Replace the full output with a content reference or summary. This operation costs the least and catches most cases.

Stage 2: Snip. Lightweight history trimming. Drops low-value turns from the conversation. The paper calls this HISTORY_SNIP. It runs a fast heuristic to identify turns that contributed nothing to the final outcome.

Stage 3: Microcompact. Fine-grained compression. The cache-aware path defers boundary decisions until after the API response. This lets it make smarter choices about what to keep and what to drop.

Stage 4: Context collapse. Read-time projection over the conversation. This is the heavyweight option. It presents the model with a compressed view of the entire conversation history, built at read time. Full history stays intact on disk for session resume.

Stage 5: Auto-compact. Model-generated summary. This fires only when everything else fails. It asks the model to summarize the conversation so far. It costs the most and delivers the least reliability.

The design principle: never pay for a heavy compaction if a cheap one works. Each stage degrades gracefully into the next. If budget reduction clears enough space, snip never runs. If snip is sufficient, microcompact never fires. And so on. The aihumanlove.com breakdown summarizes the pipeline as: if your harness has one auto-summary mechanism, it has the wrong number.
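The escalation logic is small enough to sketch. The word-count tokenizer and the two toy stages below are stand-ins invented for the example. Only the control flow, cheapest first and stop as soon as the budget is met, reflects the design described above.

```python
from typing import Callable

Messages = list[str]
Compactor = Callable[[Messages], Messages]


def count_tokens(messages: Messages) -> int:
    """Crude stand-in for a tokenizer: whitespace-separated words."""
    return sum(len(m.split()) for m in messages)


def compact(messages: Messages, budget: int,
            stages: list[tuple[str, Compactor]]) -> tuple[Messages, list[str]]:
    """Run stages in order, stopping as soon as the budget is met."""
    ran = []
    for name, stage in stages:
        if count_tokens(messages) <= budget:
            break  # a cheaper stage already freed enough space
        messages = stage(messages)
        ran.append(name)
    return messages, ran


def trim_outputs(messages: Messages, limit: int = 5) -> Messages:
    """Toy budget reduction: truncate every message to `limit` words."""
    return [" ".join(m.split()[:limit]) for m in messages]


def snip(messages: Messages, keep: int = 2) -> Messages:
    """Toy snip: keep only the most recent `keep` messages."""
    return messages[-keep:]


STAGES = [("budget_reduction", trim_outputs), ("snip", snip)]
```

The returned list of stage names makes it easy to log how often the expensive stages actually run.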

The paper frames context as the binding resource of the entire architecture. All subsystems serve context management. The compactors exist because the model’s transcript uses append-only JSONL files. Full history always stays on disk. Compaction only changes what the model sees, not what the system remembers.

3. The four extension mechanisms

Claude Code has four ways to add capabilities, ordered by ascending context cost:

Hooks. Twenty-seven documented lifecycle events: PreToolUse, PostCompact, PreExecute, PostExecute, and more. Hooks can run shell commands, send HTTP requests, call the model, or spawn subagents. Their context cost is zero because they load nothing into the model’s window until they fire.

Skills. SKILL.md files with YAML frontmatter. Skills load into the model’s context only when explicitly invoked. The model must decide to use a skill before it loads. This means a skill’s full body does not occupy context-window space on every turn.

Plugins. Multi-component bundles that register commands, agents, skills, hooks, and LSP support. Higher cost than skills because plugins register tools and schemas the model sees on every turn.

MCP servers. External tools over stdio, SSE, HTTP, or WebSocket. The highest context cost because MCP tools appear to the model through dynamically generated schemas that consume prompt space.

The ordering is deliberate. Each integration pattern gets its own surface at the right cost tier. If you can solve a problem with a hook, you should not use a plugin. If a skill works, you should not use MCP.

4. Subagents built for isolation

Subagents create new, isolated context windows. They don’t share the parent’s permission state or transcript. They return only summary text to the parent. Their full transcripts live in sidechain files.

The isolation is physical. Each subagent gets its own git worktree, created by a literal git worktree add command. This stays separate from the user’s working tree and from every other subagent’s tree. The blast radius of a delegated task stays contained by design.
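For reference, a sketch of how a harness might build that command. The .worktrees directory, the agent/ branch prefix, and the agent_id parameter are hypothetical; only git -C and git worktree add -b are real git CLI.

```python
import subprocess


def worktree_add_argv(repo_root: str, agent_id: str) -> list[str]:
    """Build a `git worktree add` command for one subagent."""
    path = f"{repo_root}/.worktrees/{agent_id}"  # hypothetical layout
    branch = f"agent/{agent_id}"                 # hypothetical branch naming
    return ["git", "-C", repo_root, "worktree", "add", "-b", branch, path]


def create_worktree(repo_root: str, agent_id: str) -> None:
    """Create the worktree; raises CalledProcessError if git fails."""
    subprocess.run(worktree_add_argv(repo_root, agent_id), check=True)
```

Passing an argument list instead of a shell string means an odd agent_id cannot inject extra shell syntax.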

The paper reports that agent teams cost roughly 7x the tokens of a standard session. The overhead isn’t a bug. It’s the price of isolation. Subagents are for tasks that genuinely need independence, not for parallelizing everything.

5. Session storage

Session transcripts use append-only JSONL files. Everything stays. Compaction only changes what the model sees on the next read, not the stored history.
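A minimal sketch of the pattern, with compaction modeled as a read-time view over the stored events. The function names and the event shape are assumptions made for the example.

```python
import json
from pathlib import Path


def append_event(path: Path, event: dict) -> None:
    """Append one event as a JSON line; existing lines are never rewritten."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps(event, ensure_ascii=False) + "\n")


def load_events(path: Path) -> list[dict]:
    """Read the full stored history back, one event per line."""
    with path.open(encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]


def model_view(events: list[dict], keep_last: int) -> list[dict]:
    """Toy compaction: a projection over the history, not an edit to the file."""
    return events[-keep_last:]
```

Because model_view only slices what was read, the file on disk still holds every event after compaction.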

Permissions deliberately don’t restore across session resumes. Trust re-establishes every session. The paper calls this friction by design. Every resume is a fresh trust negotiation.

6. The boot sequence

The paper documents the startup sequence. Claude Code boots in 10 steps taking 1.2 to 1.8 seconds:

  1. Side-effect imports fire before the rest of the code loads. MDM reader takes about 135ms. macOS Keychain reads take about 65ms each.
  2. Debug mode detection. Silently exits for non-internal builds.
  3. Deep link URI handling. Feature-gated behind a flag.
  4. Custom URL scheme parsing: cc:// and cc+unix:// handlers.
  5. SSH remote flag extraction.
  6. Interactive versus non-interactive detection.
  7. Entrypoint determination across 4 modes: CLI, SDK, MCP Server, Sandbox.
  8. Early settings loading and validation.
  9. Commander.js setup with preAction hooks.
  10. Startup mode routing across 10 possible configurations.

Steps 1 through 3 overlap. Auth tokens already sit in memory by the time Commander.js finishes.

7. The terminal rendering engine

Claude Code ships a complete React Fiber reconciler for the terminal:

React components flow through a custom reconciler into Yoga Flex Layout into DOM-to-screen rendering into an Int32Array screen buffer into frame diffing.

Each screen cell is a 32-bit integer. Twenty-one bits for the Unicode codepoint, 4 for foreground color, 4 for background color, 3 for style flags. Flat typed arrays mean zero object allocation per frame.
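The bit budget adds up: 21 + 4 + 4 + 3 = 32, and 21 bits cover every Unicode codepoint up to U+10FFFF. A sketch of the packing follows, assuming the codepoint sits in the low bits. The field order is a guess, since only the widths are stated above.

```python
CODEPOINT_BITS, FG_BITS, BG_BITS, STYLE_BITS = 21, 4, 4, 3

FG_SHIFT = CODEPOINT_BITS          # 21
BG_SHIFT = FG_SHIFT + FG_BITS      # 25
STYLE_SHIFT = BG_SHIFT + BG_BITS   # 29


def pack_cell(ch: str, fg: int, bg: int, style: int) -> int:
    """Pack one screen cell into a 32-bit value (field order is assumed)."""
    cp = ord(ch)
    if cp >= 1 << CODEPOINT_BITS:
        raise ValueError("codepoint does not fit in 21 bits")
    if not (0 <= fg < 16 and 0 <= bg < 16 and 0 <= style < 8):
        raise ValueError("color or style out of range")
    return cp | (fg << FG_SHIFT) | (bg << BG_SHIFT) | (style << STYLE_SHIFT)


def unpack_cell(cell: int) -> tuple[str, int, int, int]:
    """Inverse of pack_cell."""
    ch = chr(cell & ((1 << CODEPOINT_BITS) - 1))
    fg = (cell >> FG_SHIFT) & 0xF
    bg = (cell >> BG_SHIFT) & 0xF
    style = (cell >> STYLE_SHIFT) & 0x7
    return ch, fg, bg, style
```

Python integers are unbounded, so the sign bit never comes into play here. In a JavaScript Int32Array, the same value reads back as negative whenever the top style bit is set.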

Frame diffing compares the buffer cell by cell and emits ANSI escape codes only for changed cells. On an idle scroll, output drops from roughly 10KB to about 50 bytes.
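A toy version of that diff, with one character per cell to keep the example readable (the buffer described above holds packed integers instead). The only escape sequence used is the standard ANSI cursor-position sequence, ESC [ row ; col H, which is 1-indexed.

```python
def diff_frames(prev: list[str], curr: list[str], width: int) -> str:
    """Return ANSI output that repaints only the cells that changed."""
    if len(prev) != len(curr):
        raise ValueError("frames must have the same number of cells")
    out = []
    for i, (old, new) in enumerate(zip(prev, curr)):
        if old != new:
            row, col = divmod(i, width)
            # Move the cursor to the changed cell, then write the new character.
            out.append(f"\x1b[{row + 1};{col + 1}H{new}")
    return "".join(out)
```

Two identical frames produce an empty string, which is why an idle screen costs almost nothing to redraw.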

The frame rate stays at 10 FPS. Virtual scrolling handles 2,800-plus messages. The team rewrote the Yoga layout engine in about 3,600 lines of TypeScript to eliminate native module dependencies.

This is infrastructure that many agent frameworks skip because they offload rendering to the browser or a terminal emulator. Claude Code owns the full stack.

What the comparison with OpenClaw reveals

The paper compares Claude Code with OpenClaw, an independent open-source multi-channel personal assistant gateway. Both systems answer the same design questions but arrive at different answers because their deployment contexts differ.

Design Question | Claude Code (Coding CLI) | OpenClaw (Gateway)
Safety model | Per-action classification through a 7-layer pipeline | Perimeter-level access control at the gateway boundary
Runtime | Single CLI query loop | Embedded runtime within a gateway control plane
Capability registration | Context-window extensions (tools in the prompt) | Gateway-wide capability registration
Session model | Ephemeral per-task sessions, explicit resume | Persistent sessions, continuous presence

The comparison matters because it shows that harness architecture isn’t one-size-fits-all. The same design principles produce different implementations when the deployment context changes. A CLI tool that terminates after every task and a gateway that never sleeps have different safety profiles, session models, and extension surfaces.

Six open design directions

The paper concludes with six open directions for future agent systems. These are the problems no production system has fully solved:

  1. Converging harness design. As more agents ship, will harness architectures converge or stay fragmented by deployment context?

  2. Safety at the frontier. Current permission systems assume known tool sets. Autonomous agents that generate their own tools create a fundamentally harder safety problem.

  3. Context management beyond length. When models have effectively unlimited context windows, the problem shifts from capacity management to relevance management.

  4. Multi-agent coordination standards. Subagents with opaque sidechain files work for two agents. They don’t scale to teams of 50.

  5. Evaluation at the harness level. Benchmarks evaluate models, not harnesses. The paper argues that harness quality should be independently measurable.

  6. Economic models for agent compute. When 98.4% of the system is infrastructure, the cost model of running an agent shifts from per-token to per-harness-operation.

These are the problems that will define the next generation of agent systems.

What this means for how you build

The paper is a research artifact from a specific codebase. But the architectural principles generalize to any production agent system.

Permission systems need layers. A single approval dialog isn’t a safety system. Users habituate. Build deny-first pipelines with multiple independent gates where each layer catches what the previous one missed. The overhead is worth it.

Context management is a graduated problem. One compaction strategy doesn’t work for all cases. Start cheap, escalate only when necessary. Every compaction layer should degrade gracefully into the next.

Extension mechanisms should match context cost. If every integration loads into the model’s context window on every turn, you are paying for something the model rarely uses. Design extension surfaces at different cost tiers. Hooks for zero-cost interception. Configuration files for cheap injection. External services for expensive capabilities.

Subagents need isolation by default. Shared context windows are the default in popular frameworks. Isolation should be the default. The token overhead is the price of bounded blast radius.

The harness is the product. The model reasons. The harness routes, protects, recovers, persists, and extends. Every line of permission gate, compaction logic, and recovery path is an investment in making the model’s decisions safe and composable. As frontier models converge on raw capability, the harness is the only differentiator left.

The split isn’t 50-50. It’s 1.6% to 98.4%. Invest accordingly.

FAQ

Why is 98.4% of Claude Code not AI? The model’s job is narrow: pick a tool, format arguments, decide when done. Everything else is infrastructure that makes those decisions reliable. Seven permission layers, five compaction stages, a custom bash parser with 4,437 LOC, a terminal rendering engine with a complete React reconciler. The model reasons. The harness does everything else.

Should I build my agent like Claude Code? Not exactly. Claude Code is a general-purpose coding CLI with 512,000 lines serving that one use case. Your agent probably serves a narrower purpose. But the principles apply: deny-first permissions, graduated context compaction, extension mechanisms at different cost tiers, isolated subagents, append-only state.

Does the Claude Code leak affect users? The leak exposed source code, not user data. Anthropic shipped source maps in an npm package by accident. No user sessions, API keys, or private data escaped. The paper analyzes the architecture, not vulnerabilities.

What is the most surprising finding? The bash parser. A hand-rolled recursive descent parser, 4,437 LOC, 23 files, full AST analysis, 15 categories of dangerous nodes, 35+ blocked builtins, array subscript RCE attack detection, 50ms timeout. Claude Code treats every bash command as a security boundary.

How much of the code is the terminal UI? A significant portion. Claude Code ships a custom React Fiber reconciler for terminal rendering, a Yoga layout engine rewritten in TypeScript, an Int32Array screen buffer with cell-by-cell frame diffing, and virtual scrolling for 2,800-plus messages. It doesn’t lean on a generic terminal UI layer. It ships its own.


This article was published on Agentic Up (https://agenticup.dev): practical guides for developers and founders building with AI agents. Reach me at [email protected]
