Mergeable by default: Building the context engine to save time and tokens — Peter Werry, Unblocked

By AI Engineer

AI Agent Orchestration, LLM Context Engineering, AI-Assisted Software Development

Key Concepts

Context Engine: A system that provides highly optimized, relevant organizational context to AI agents, moving beyond simple RAG (Retrieval-Augmented Generation) to include relationships, historical decisions, and expert knowledge.
Satisfaction of Search: A cognitive bias (borrowed from radiology) where an agent stops searching once it finds a "good enough" answer, potentially missing critical information elsewhere.
Doom Loops: Iterative cycles where an agent fails to perform a task correctly, requiring constant human intervention and correction.
Social Engineering Graph: A mapping of organizational relationships, PR reviews, and expert contributions used to identify "who knows what" and "how things are done."
Bottling the Expert: The process of distilling an individual’s historical contributions, Slack discussions, and PR comments into "memories" that guide future AI agent behavior.
MCP (Model Context Protocol): A standard for connecting AI agents to data sources and tools.

1. The Role and Necessity of Context Engines

Modern AI agents often start at "ground zero" regarding a codebase or organization. Without a context engine, they fall back on inefficient "code spelunking": reading through files one by one to reconstruct how things work. A context engine acts as the organizational memory, supplying only the context a task needs so that agents follow the team's established practices. This prevents "doom loops" and reduces token consumption.

2. Three Myths About Context Engines

Myth 1: Naive RAG is a Context Engine. Simply wiring up vector search or MCP servers is insufficient. Such a setup cannot resolve conflicts between sources, does not understand relationships between data, and does not avoid the "satisfaction of search" trap.
Myth 2: Bigger Context Windows Solve Everything. While models like Gemini can handle millions of tokens, they struggle with reasoning across disparate data sources and selecting the "truth" among conflicting information.
Myth 3: Caching Answers Is Enough. Caching AI responses is counterproductive because codebases, and the organizational reasoning behind decisions, change constantly. Cached answers go stale and lead to "regression toward the mean."
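
The "satisfaction of search" trap from Myth 1 can be illustrated with a small sketch. It is not taken from the talk: the `Hit` type, the relevance scores, the threshold, and the two toy sources are all hypothetical stand-ins for real integrations. The naive strategy returns the first result that clears the threshold, while the exhaustive strategy consults every source before ranking.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    source: str   # hypothetical source label, e.g. "code" or "slack"
    text: str
    score: float  # hypothetical relevance score in [0, 1]


def first_good_enough(sources, query, threshold=0.7):
    """Naive strategy: stop at the first hit that clears the threshold."""
    for search in sources:
        for hit in search(query):
            if hit.score >= threshold:
                return [hit]
    return []


def search_all_sources(sources, query, threshold=0.7):
    """Exhaustive strategy: consult every source, then rank what was found."""
    hits = [hit for search in sources for hit in search(query)]
    relevant = [h for h in hits if h.score >= threshold]
    return sorted(relevant, key=lambda h: h.score, reverse=True)


# Toy in-memory sources standing in for real integrations (assumed data).
def search_code(query):
    return [Hit("code", "retry_count = 3 in client.py", 0.75)]


def search_slack(query):
    return [Hit("slack", "Retries capped at 1 after the outage; config change pending", 0.9)]


SOURCES = [search_code, search_slack]
naive = first_good_enough(SOURCES, "retry policy")
thorough = search_all_sources(SOURCES, "retry policy")

print([h.source for h in naive])
print([h.source for h in thorough])
```

In this toy setup the naive strategy never reaches the Slack thread that contradicts the code, which is the failure mode the bias describes.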

3. Framework for a Context Engine

A robust context engine requires six core pillars:

Unified System Context: Building relationships between data (e.g., linking Slack discussions to specific PRs).
Conflict Resolution: Moving beyond recency-based bias to prioritize "truth" (e.g., favoring main branch code or expert-verified Slack threads).
Access Governance: Ensuring that synthesized knowledge respects existing permissions (e.g., private Slack channels are only accessible to authorized users).
Personalized/Targeted Retrieval: Focusing context on the specific repositories and tasks relevant to the user.
Memory Distillation: Converting historical PR comments and incident reports into reusable "memories."
Expert Identification: Using social graphs to identify the right people to consult or emulate.
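
Two of these pillars, Conflict Resolution and Access Governance, can be sketched together in a few lines. This is an illustration under stated assumptions, not the speaker's implementation: the `Fact` type, the `AUTHORITY` ranking, the user names, and the sample claims are hypothetical. The idea shown is that permissions filter the candidates first, source authority decides among what remains, and recency only breaks ties.

```python
from dataclasses import dataclass, field

# Hypothetical authority ranking: a higher number means a more trusted source.
AUTHORITY = {"main_branch_code": 3, "expert_verified_slack": 2, "slack": 1}


@dataclass
class Fact:
    claim: str
    source_type: str
    timestamp: int  # larger means more recent
    # Users allowed to see this fact; an empty set means it is public.
    allowed_users: frozenset = field(default_factory=frozenset)


def visible_to(fact, user):
    """Access governance: a fact is visible if public or explicitly shared."""
    return not fact.allowed_users or user in fact.allowed_users


def resolve(facts, user):
    """Pick the most authoritative visible fact; recency only breaks ties."""
    candidates = [f for f in facts if visible_to(f, user)]
    if not candidates:
        return None
    return max(candidates, key=lambda f: (AUTHORITY.get(f.source_type, 0), f.timestamp))


facts = [
    Fact("timeout is 30s", "slack", timestamp=300),
    Fact("timeout is 10s", "expert_verified_slack", timestamp=200,
         allowed_users=frozenset({"alice"})),
]

print(resolve(facts, "alice").claim)  # the private, expert-verified thread wins
print(resolve(facts, "bob").claim)    # bob cannot see it, so the public thread is used

facts.append(Fact("timeout is 20s", "main_branch_code", timestamp=100))
print(resolve(facts, "bob").claim)    # main-branch code outranks both threads
```

Filtering before ranking matters here: a private source never influences an answer given to a user who could not have read that source directly.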

4. Real-World Applications

Planning & Review: Using the engine to enrich tickets and provide context-aware code reviews.
Incident Management: Integrating with tools like Datadog and Sentry to correlate production signals with past Slack discussions and code changes.
Engineering Support: Automating answers to common questions in engineering support channels.
Onboarding: Helping new employees understand the "why" behind historical architectural decisions.

5. Performance and Metrics

The speaker presented a case study in which the same complex task was performed with and without a context engine:

Without Context Engine: 2.5 hours, 21 million tokens, multiple "doom loop" corrections required.
With Context Engine: 25 minutes, 10 million tokens, zero corrections required.
Key Insight: The primary bottleneck in AI performance is often the output tokens and the time spent in correction loops, not the input token size.
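
The relative improvement implied by these figures can be worked out directly. The snippet below uses only the numbers quoted above; the variable names are illustrative.

```python
# Figures quoted in the case study.
baseline_minutes = 2.5 * 60      # 2.5 hours without the context engine
engine_minutes = 25              # with the context engine
baseline_tokens = 21_000_000
engine_tokens = 10_000_000

speedup = baseline_minutes / engine_minutes
token_reduction = 1 - engine_tokens / baseline_tokens

print(f"{speedup:.1f}x faster")
print(f"{token_reduction:.0%} fewer tokens")
```

This works out to a 6x reduction in wall-clock time and roughly a 52% reduction in tokens, so the time saving is proportionally much larger than the token saving.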

6. Notable Quotes

"Access doesn't equal understanding." — Peter, on the limitation of simply connecting data sources.
"AI-generated code should just feel like it was written by someone that's been in your team for 20 years." — Peter, on the ultimate goal of context engineering.
"The puck is going down the line towards background agents for sure." — Peter, regarding the future of autonomous AI workflows.

7. Synthesis/Conclusion

The transition from human-managed AI to autonomous background agents is inevitable, but it is currently blocked by a lack of organizational context. A context engine is not just a search tool; it is a sophisticated layer that understands organizational relationships, resolves data conflicts, and respects security boundaries. By "bottling" expert knowledge and providing it at the right time, teams can achieve dramatic reductions in task completion time and token usage, effectively turning AI agents into long-tenured, highly knowledgeable team members.
