February 2026
Introducing Onboard: structured repository context for humans and AI agents.
Rich Sutton argued that researchers who build hand-crafted knowledge into AI systems always lose to researchers who build systems that scale with computation. Chess engines with grandmaster heuristics lost to engines that searched deeper. NLP systems built on linguistic rules lost to transformers trained on raw text.
The lesson applies to software infrastructure too.
Teams that hand-craft onboarding documents, maintain wikis, and write architecture decision records are doing the right thing. But they are doing it in a way that does not scale. The documents go stale. The wiki diverges from reality. The ADRs stop getting written after the third sprint.
The approach that scales is extraction, not authoring. Let the repository be the source of truth. Let computation do the work of surfacing what matters.
Every repository is two things at once.
The first is a collection of files. Functions, classes, configs, tests. This is what tools see. This is what AI reads.
The second is a record of decisions. Why the auth system was rewritten in Q3 2023. Why the scheduler lives in its own crate. Why nobody touches the migration directory without checking with the platform team first. Why that one file has 47 edge cases and a comment that says "do not refactor."
The first is executable. The second is invisible. And the second is what actually determines whether you ship or spend three weeks debugging something that was solved in a PR two years ago.
We used to lose context slowly. An engineer leaves. Their mental model goes with them. The team adjusts. Tribal knowledge decays over months.
Now we lose it instantly.
An AI agent opens a pull request against a codebase it has never seen. It reads the files. It does not read the history. It proposes a change to a module that was deliberately frozen after three failed rewrites. The CI passes. The tests pass. The architecture breaks.
This is not an edge case. This is the default behavior of every AI coding tool on the market.
Onboard ingests git history, PR discussions, and code structure, then synthesizes structured context with Opus 4.6. Here is what that looks like.
$ onboard --repo facebook/react --persona "new frontend engineer"
Ingesting facebook/react...
├─ git history: 18,247 commits across 412 contributors
├─ github: 127 open PRs, 24 recent releases
├─ code: 2,847 files, 14 languages detected
└─ done in 4.2s
Synthesizing with Opus 4.6 (effort: high, budget: 16k)
── Thinking ─────────────────────────────────
The React codebase has evolved through three major architectural phases. The key insight for a new frontend engineer is understanding the reconciler abstraction that decouples the component model from rendering targets...
─────────────────────────────────────────────
Context pack: 12,847 tokens
Sections: architecture, key-decisions, risk-zones, first-pr
Receipt: ctx_react_a7f2e3b4
Tool: onboard.get_context_pack
Repo: anthropics/anthropic-sdk-typescript
Persona: "typescript tooling engineer"
Depth: brief
Evidence grades:
  A (commit-cited): 14 claims
  B (pr-cited): 8 claims
  C (inferred): 3 claims
Unknowns: 2 flagged
Receipt: ctx_sdk_ts_b3c4d5e6
Opus 4.6 thinking (12.4s elapsed)...
The express codebase represents a fascinating case study in minimalist API design. At 64,521 lines across 156 files, it powers 65% of Node.js web applications while maintaining a remarkably small surface area...
Key architectural decision: the middleware pipeline pattern was chosen over a monolithic router because [PR #2237] demonstrated that composability reduced downstream breaking changes by 73%... ▊
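For programmatic consumers, a context pack like the one in the second example might deserialize into a shape along these lines. This is a minimal sketch in TypeScript; the type and field names are illustrative assumptions, not Onboard's published schema.

// Hypothetical shape of a context pack with evidence-graded claims.
// Names here are assumptions for illustration, not Onboard's actual API.
type EvidenceGrade = "A" | "B" | "C"; // A: commit-cited, B: pr-cited, C: inferred

interface Claim {
  text: string;          // the synthesized statement
  grade: EvidenceGrade;  // how well the claim is backed by repository evidence
  sources: string[];     // e.g. commit SHAs or PR URLs cited for the claim
}

interface ContextPack {
  receipt: string;       // e.g. "ctx_sdk_ts_b3c4d5e6"
  persona: string;       // who the pack was synthesized for
  sections: string[];    // e.g. ["architecture", "key-decisions", "risk-zones", "first-pr"]
  claims: Claim[];
  unknowns: string[];    // gaps the synthesis flagged rather than guessed at
  tokenCount: number;
}

// A consumer might keep only commit-cited claims before handing them to an agent.
function highConfidenceClaims(pack: ContextPack): Claim[] {
  return pack.claims.filter((claim) => claim.grade === "A");
}

The point of the grades is that a downstream tool can decide how much to trust each claim instead of treating the whole pack as equally authoritative.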
We think about this the way the industry thought about observability ten years ago.
Before Datadog, teams logged to files and grepped. Before Sentry, teams read stack traces in production logs. The infrastructure did not exist to make runtime behavior legible at scale.
Repository context is in that same pre-infrastructure era. Teams grep through git logs. They search Slack for that one thread where someone explained the deployment process. They ask the person who has been on the team the longest. If that person left, they guess.
Observability made runtime behavior queryable. Onboard makes institutional memory queryable.
Depth maps directly to the Opus API: brief uses low effort with an 8K token budget, standard uses high effort with 16K, and deep uses max effort with 48K output tokens. You control the cost-quality tradeoff per request.
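A sketch of that mapping, in TypeScript. The effort levels and budgets come from the sentence above; the identifiers are hypothetical, not Onboard's real client code.

// Hypothetical mapping from Onboard's depth setting to Opus API parameters.
// Values mirror the prose above; the names are illustrative only.
type Depth = "brief" | "standard" | "deep";

interface SynthesisSettings {
  effort: "low" | "high" | "max"; // Opus effort level
  budgetTokens: number;           // output token budget for synthesis
}

const DEPTH_SETTINGS: Record<Depth, SynthesisSettings> = {
  brief:    { effort: "low",  budgetTokens: 8_000 },
  standard: { effort: "high", budgetTokens: 16_000 },
  deep:     { effort: "max",  budgetTokens: 48_000 },
};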
We are betting that the next decade of software will be defined not by who writes the most code, but by who has the best context.
Code is becoming commoditized. Context is not.
The teams that win will be the ones where every contributor, human or machine, understands the repository they operate in. Not just the files. The history. The intent. The architecture. The risk.
That requires infrastructure. We are building it.
Streaming synthesis with extended thinking, persona-aware prompts, evidence grading.
Same context engine, five interfaces. Use it however your workflow demands.
Analyze multiple repositories in a single pass and surface cross-repo architectural patterns.
Public registry of context packs. Browse, search, and reuse context across teams.
Context packs that update on each push, not regenerated from scratch.
Self-hosted deployment, SSO, SOC 2 compliance, audit log export, dedicated SLA.