The emerging skillset of wielding coding agents — Beyang Liu, Sourcegraph / Amp

Key Concepts

Coding Agents
LLMs (Large Language Models)
Agentic Architecture
Tool Use
Feedback Loops
Context Management
Sub-agents
Prompt Engineering
Code Review
Unix Philosophy
MCP (Modular Code Platform)

Main Topics and Key Points

The Agent Discourse and Shifting Paradigms

Initial Skepticism: The video starts by addressing the debate around the efficacy of AI coding agents, referencing contrasting opinions from developers like Jonathan Blow (skeptical) and Eric S. Raymond (supportive).
Dynamic Technical Landscape: The speaker emphasizes the rapidly evolving nature of AI coding tools, where best practices can change significantly within months.
Three Eras of Generative AI:
- GPT-3 Era: Characterized by text completion models and co-pilot/autocomplete tools.
- Chat-GPT Era: Marked by instruct-tuned chatbots and the rise of Retrieval-Augmented Generation (RAG) bots.
- Agent Era: The current era, driven by tool-using LLMs and agentic architectures.
Agentic Architecture: The speaker argues that the capabilities of modern LLMs necessitate a shift in application architecture, moving beyond simple chatbots and RAG systems.

Controversial Design Decisions for the Agent Era

Direct Edits: Agents should directly modify files without constant human confirmation.
Minimal UI: The need for thick clients (e.g., VS Code forks) is questioned, suggesting a move towards simpler interfaces.
Model Coupling: LLMs are deeply integrated into agentic chains, making model swapping difficult.
Flexible Pricing: Fixed pricing models create perverse incentives to use dumber models, which ultimately waste more human time.
Unix Philosophy: Favoring composable, command-driven tools over vertically integrated solutions.
New Application from the Ground Up: The speaker's company, Source Graph, built a new application (AMP) specifically for the agentic world, rather than retrofitting existing tools.

AMP: A Coding Agent Example

Minimalist Design: AMP features a bare-bones VS Code extension and CLI, emphasizing functionality over complex UI.
Tool Access: AMP has access to various tools, including file manipulation, bash commands, Playwright, Postgres, and Linear (issue tracker).
Automated Tool Use: The agent autonomously selects and uses tools based on the task.
Sub-agent for Search: AMP utilizes a sub-agent with multiple search tools (keyword search, file name lookup, etc.) to find relevant information.
Multi-threaded Interface: Users can run multiple threads concurrently, allowing for parallel tasks and context switching.
Diff View Focus: The speaker notes that he now spends more time reviewing diffs generated by the agent than directly editing code.

Live Demo: Customizing the Linear Connector Icon

Real-world Application: The speaker demonstrates AMP by implementing a feature request in the AMP codebase itself.
Automated Implementation: The agent identifies the relevant Linear issue and implements the necessary code changes.
Iterative Refinement: The agent initially fails to update the icon on the settings page due to nuanced configuration differences.
Human Nudge: The speaker provides a hint, prompting the agent to investigate the discrepancy.
Successful Resolution: The agent identifies the correct configuration parameter and updates the icon on both the admin and settings pages.

Power User Patterns and Best Practices

Long Prompts: Power users write detailed prompts to guide the LLM effectively.
Contextual Guidance: Directing the agent to relevant context and feedback mechanisms.
Front-end Feedback Loops: Using Playwright and Storybook to create fast feedback loops for front-end development.
Code Understanding: Agents are used to better understand existing codebases, accelerating onboarding and code review processes.
Thorough Code Reviews: Agents are used to generate high-level summaries and identify entry points for code reviews.
Sub-agents for Complex Tasks: Sub-agents are used to preserve context and manage longer, more complex tasks.

Anti-Patterns

Micromanaging: Treating the agent like a chatbot and constantly intervening.
Underprompting: Not providing enough detail in prompts.
Blindly Accepting Code: Failing to thoroughly review code generated by the agent.

Additional Tips and Tricks

Parallel Agents: Running multiple agents in parallel to work on different parts of a project.

Important Examples, Case Studies, or Real-World Applications Discussed

Braid: Mentioned as an example of a game single-handedly coded by Jonathan Blow, highlighting his coding prowess.
Canva: Jeff Huntley's work at Canva, interviewing developers using AI tools and identifying anti-patterns.
AMP Codebase: The live demo of AMP modifying its own codebase to customize the Linear connector icon.
Tyler Bruno's Onboarding: A new hire using AMP to quickly ramp up on the AMP codebase.
Jeff Huntley's Compiler Project: Using multiple agents in parallel to develop a compiler.

Step-by-Step Processes, Methodologies, or Frameworks Explained

Using AMP to implement a feature:
1. Formulate a prompt describing the desired change.
2. Allow the agent to autonomously select and use tools.
3. Review the generated diffs in VS Code.
4. Provide feedback and guidance as needed.
5. Verify the changes and iterate.
Creating front-end feedback loops:
1. Make code changes.
2. Use Playwright to open the relevant page in a browser.
3. Snapshot the page.
4. Loop back to the agent for further refinement.

Key Arguments or Perspectives Presented, with Their Supporting Evidence

Coding agents are substantively useful: Supported by examples of AMP automating code changes and accelerating development tasks.
The agent era requires a new application architecture: Supported by the argument that existing chatbot and RAG-based tools are not optimized for the capabilities of tool-using LLMs.
Power users are leveraging agents to enhance, not replace, human skills: Supported by examples of agents being used for code understanding, thorough code reviews, and complex problem-solving.

Notable Quotes or Significant Statements with Proper Attribution

Jonathan Blow: (Implied) Skeptical about the efficacy of AI coding agents.
Jesse Friselle: "I think you're right, but you're in like the top 0.1% of programmers Jonathan, for the rest of us... it actually helps a lot."
Eric S. Raymond: (Implied) Believes AI coding agents are helpful even for experienced programmers.
Thomas Tachek: (Implied) Argues that AI coding agents are very useful.
Jeff Huntley: (Implied) Found that "most people were holding it wrong" when using AI coding tools.

Technical Terms, Concepts, or Specialized Vocabulary with Brief Explanations

LLM (Large Language Model): A deep learning model trained on a massive amount of text data.
Agentic Architecture: An application architecture that leverages the capabilities of tool-using LLMs to automate complex tasks.
Tool Use: The ability of an LLM to interact with external tools and APIs.
Feedback Loops: The process of providing feedback to an agent to guide its behavior and improve its performance.
Context Management: The process of providing relevant information to an LLM to improve its accuracy and relevance.
Sub-agent: A specialized agent that is used to perform a specific subtask within a larger agentic system.
Prompt Engineering: The process of designing effective prompts to elicit desired behavior from an LLM.
Code Review: The process of reviewing code changes to identify bugs and ensure code quality.
Unix Philosophy: A set of design principles that emphasize simplicity, modularity, and composability.
MCP (Modular Code Platform): A platform for managing and sharing code modules.
RAG (Retrieval-Augmented Generation): A technique for improving the accuracy and relevance of LLM-generated text by retrieving relevant information from a knowledge base.
Playwright: A framework for automating web browser interactions.
Storybook: A tool for developing and testing UI components in isolation.
Diff: A representation of the differences between two versions of a file.

Logical Connections Between Different Sections and Ideas

The video begins by establishing the context of the AI coding agent debate, then transitions to the speaker's perspective on the shifting paradigms in the field.
The discussion of design decisions for the agent era logically follows the explanation of the different eras of generative AI.
The AMP example serves as a concrete illustration of the principles and best practices discussed in the earlier sections.
The power user patterns and anti-patterns are presented as insights derived from observing real-world usage of AMP.
The conclusion summarizes the main takeaways and reinforces the speaker's belief in the potential of coding agents.

Data, Research Findings, or Statistics Mentioned

Context Window Degradation: LLMs like Sonnet 4 experience performance degradation beyond a certain context window size (around 120-130K tokens).
User Spending Variance: Significant variance in user spending on AMP, with some power users spending thousands of dollars per month on inference costs.

Brief Synthesis/Conclusion of the Main Takeaways

The video argues that coding agents represent a significant advancement in AI-assisted development, requiring a shift in application architecture and development practices. The speaker advocates for a minimalist, tool-centric approach, emphasizing the importance of feedback loops, context management, and thorough code reviews. By sharing insights from power users and demonstrating the capabilities of AMP, the video encourages developers to embrace coding agents as a powerful tool for enhancing their productivity and code quality. The key takeaway is that coding agents are not a replacement for human developers, but rather a tool that can amplify their skills and enable them to tackle more complex challenges.