Anthropic Just Dropped a Masterclass on Building Agent Harnesses (for Large Codebases)

By Cole Medin

Key Concepts

  • AI Layer: The ecosystem of context, tools, and configurations (rules, skills, MCP servers) surrounding an LLM that enables effective coding in large codebases.
  • Claude Code: An AI coding agent that uses agentic search (CLI-based navigation) rather than traditional RAG or indexing.
  • Global Rules (claude.md): Foundational instructions that dictate agent behavior; these should be lean and layered.
  • Progressive Disclosure: The strategy of loading specific context (rules/skills) only when operating in relevant subdirectories.
  • Hooks: Scripts that run at the start or end of a session to provide dynamic context or perform self-improvement (e.g., updating rules).
  • Skills: Reusable, scoped workflows or prompts for specific task types.
  • LSP (Language Server Protocol): A standard that allows the agent to perform symbol-level navigation (definitions/references) rather than simple string-based grep searches.
  • MCP (Model Context Protocol): An open protocol for connecting AI agents to external tools and data sources.
  • Sub-agents: Specialized agents used to offload exploration tasks, preventing context window bloat in the primary session.

1. Navigating Large Codebases

Claude Code operates by mimicking an engineer’s workflow, using command-line tools like grep and folder structure analysis rather than relying on a pre-indexed database.

  • The Challenge: As codebases grow to hundreds of thousands of lines, unscoped agentic search breaks down: grepping from the repository root is slow and fills the context window with irrelevant matches.
  • The Solution: Curate context upfront. By initializing Claude Code in specific subdirectories, you restrict its working directory, forcing it to focus on relevant files while still inheriting root-level claude.md rules.

2. The AI Layer Framework

The "harness" (AI Layer) is as critical as the model itself. It consists of seven components, with the following being most vital:

Global Rules (claude.md)

  • Strategy: Keep them lean. Avoid thousands of lines of text, which can overwhelm the LLM.
  • Layering: Use a root claude.md for general conventions and subdirectory-specific claude.md files for domain-specific logic. The agent automatically walks up the directory tree to aggregate these rules.
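
The layering described above can be illustrated with a minimal sketch. It assumes the rules file is named claude.md as in this summary, and the helper names (collect_rule_files, aggregate_rules) are hypothetical. This is not how Claude Code is implemented, only an illustration of the walk-up idea.

```python
from pathlib import Path

RULES_FILENAME = "claude.md"  # filename as written in this summary (assumption)


def collect_rule_files(start_dir: Path, repo_root: Path) -> list[Path]:
    """Walk from start_dir up to repo_root and return rule files, most general first."""
    start_dir = start_dir.resolve()
    repo_root = repo_root.resolve()
    if start_dir != repo_root and repo_root not in start_dir.parents:
        raise ValueError("start_dir must be inside repo_root")

    found = []
    current = start_dir
    while True:
        candidate = current / RULES_FILENAME
        if candidate.is_file():
            found.append(candidate)
        if current == repo_root:
            break
        current = current.parent

    # The walk found the most specific file first; reverse so root rules come first.
    return list(reversed(found))


def aggregate_rules(start_dir: Path, repo_root: Path) -> str:
    """Concatenate all applicable rule files into one block of instructions."""
    files = collect_rule_files(start_dir, repo_root)
    return "\n\n".join(p.read_text(encoding="utf-8") for p in files)
```

Placing root rules first and subdirectory rules last means the domain-specific guidance is read after the general conventions it refines.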

Hooks for Self-Improvement

  • Start Hooks: Dynamically load team-specific context (e.g., pulling documentation from Confluence or Git history) based on the developer or the task.
  • Stop Hooks: Enable continuous improvement. After a session, a hook can run in headless mode to reflect on changes made and propose updates to the claude.md file, ensuring documentation doesn't go stale.
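
A stop hook of this kind might start from something like the sketch below. It only gathers the changed files and builds a reflection prompt. How the prompt is handed to the agent in headless mode is left out, because the summary does not specify it. The function names and prompt wording are hypothetical.

```python
import subprocess


def changed_files_since_head(repo_dir: str) -> list[str]:
    """List tracked files that differ from HEAD, using plain git."""
    result = subprocess.run(
        ["git", "diff", "--name-only", "HEAD"],
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


def build_reflection_prompt(changed_files: list[str], rules_path: str = "claude.md") -> str:
    """Build the prompt a stop hook could send to a headless agent run."""
    if not changed_files:
        return ""  # nothing changed, so there is nothing to reflect on
    file_list = "\n".join(f"- {name}" for name in sorted(changed_files))
    return (
        f"The session just ended. These files changed:\n{file_list}\n\n"
        f"Review the changes and propose updates to {rules_path} "
        "only where a convention changed or is undocumented. "
        "Output a proposed diff; do not apply it."
    )
```

Asking for a proposed diff rather than a direct write keeps a human in the loop before the rules file changes.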

Skills and Progressive Disclosure

  • Skills are reusable workflows (e.g., "Add API Route").
  • Scoping: Use the path parameter to ensure a skill only appears as an option when the agent is working in a relevant directory. This prevents "context pollution" by hiding irrelevant workflows.
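
The scoping idea can be sketched as a simple filter. The Skill class, its path_pattern field, and the glob syntax here are assumptions for illustration; the actual skill configuration format is not shown in this summary.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass(frozen=True)
class Skill:
    name: str
    path_pattern: str  # glob describing the directories where the skill applies


def visible_skills(skills: list[Skill], working_dir: str) -> list[str]:
    """Return the names of skills whose pattern matches the current directory."""
    normalized = working_dir.strip("/")
    return [s.name for s in skills if fnmatchcase(normalized, s.path_pattern)]


skills = [
    Skill("add-api-route", "services/api/*"),
    Skill("add-react-component", "web/*"),
    Skill("write-changelog", "*"),  # unscoped: offered everywhere
]
```

With these definitions, an agent working in services/api/routes would be offered add-api-route and write-changelog, but never add-react-component.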

LSP and MCP Integration

  • The Problem: grep is token-inefficient and slow for massive codebases.
  • The Solution: Use an MCP server to expose LSP functionality. This allows the agent to perform "symbol-level" searches (finding class definitions and references) rather than simple string matching.
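
To make the difference between string matching and symbol-level search concrete, here is a small sketch that uses Python's built-in ast module as a stand-in for a language server. It is not an LSP client or an MCP server, and the sample source and function names are invented for illustration.

```python
import ast

SOURCE = '''
class Invoice:
    """An Invoice record."""

def send(invoice):
    # Invoice objects are validated elsewhere
    return Invoice()

note = "Invoice overdue"
'''


def grep_lines(source: str, needle: str) -> list[int]:
    """String search: every line containing the text, including comments and strings."""
    return [i for i, line in enumerate(source.splitlines(), start=1) if needle in line]


def symbol_lines(source: str, name: str) -> dict[str, list[int]]:
    """Symbol search: only real class definitions and name references."""
    definitions, references = [], []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ClassDef) and node.name == name:
            definitions.append(node.lineno)
        elif isinstance(node, ast.Name) and node.id == name:
            references.append(node.lineno)
    return {"definitions": sorted(definitions), "references": sorted(references)}
```

On this sample, the string search reports five lines (the docstring, the comment, and the string literal among them), while the symbol search reports one definition and one reference. Fewer, more precise results is what makes symbol-level navigation cheaper in tokens.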

Sub-agents

  • Use sub-agents to handle "exploration" tasks (e.g., researching architecture or searching the web). This keeps the primary session's context window clean, as the sub-agent only returns a summary of its findings.
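
The context-isolation effect can be sketched with a toy model. The Session class and explore_in_subagent function are hypothetical, and a keyword match stands in for the sub-agent's actual reasoning. The point is only that the exploration material stays in the sub-agent's context while the primary session receives a short summary.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """A toy context window: a list of text chunks."""
    context: list[str] = field(default_factory=list)

    def add(self, text: str) -> None:
        self.context.append(text)

    def size(self) -> int:
        return sum(len(chunk) for chunk in self.context)


def explore_in_subagent(question: str, documents: dict[str, str],
                        max_summary_chars: int = 200) -> str:
    """Read every document in a throwaway session and return only a short summary."""
    sub = Session()  # discarded when this function returns
    sub.add(question)
    for name, body in documents.items():
        sub.add(f"{name}:\n{body}")

    # Stand-in for the sub-agent's analysis: a case-insensitive keyword match.
    relevant = sorted(name for name, body in documents.items()
                      if question.lower() in body.lower())
    summary = f"Files mentioning '{question}': " + (", ".join(relevant) or "none")
    return summary[:max_summary_chars]
```

However large the documents are, the primary session grows by at most max_summary_chars characters per exploration.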

3. Implementation and Best Practices

  • Assign Ownership: Organizations should designate a "champion" or small team to build the initial AI Layer. This prevents fragmented, inconsistent setups across the company.
  • Quiet Investment Period: Before full-scale rollout, spend time building out the standard rules, skills, and MCP servers to ensure developers have a high-quality experience from their first interaction.
  • Plugin Usage: The author provides a plugin that automates the installation of the self-improving stop hook, the explorer sub-agent, and the LSP-based search tools.

4. Synthesis

The effectiveness of AI coding agents in complex environments depends at least as much on the AI Layer (the surrounding infrastructure of rules, tools, and workflows) as on the model's raw intelligence. By moving away from monolithic, static prompts toward a layered, scoped, and self-improving architecture, developers can maintain control and efficiency even in massive, legacy-heavy codebases. The ultimate goal is to treat the AI agent as a teammate that follows organizational standards, evolves with the codebase, and uses professional-grade navigation tools such as LSP-based search.
