Convert Any MCP Server to Code Execution (Template)

Key Concepts

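MCP Code Execution: A method proposed by Anthropic in which the agent writes and runs code that calls MCP (Model Context Protocol) tools, rather than loading every tool definition into its context window. It significantly reduces token consumption and increases agent autonomy.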
Direct MCP Method: The older approach to building AI agents that involves passing tool descriptions and arguments directly into the agent's context window.
Token Consumption: The number of tokens a language model reads and generates while completing a task, which directly affects cost and performance.
Agent Autonomy: The ability of an AI agent to operate independently and evolve its own skills.
Code Execution Tool: A tool that allows AI agents to discover and run local files, particularly Python scripts.
Persistent Shell Tool: A tool that enables agents to interact with the operating system and discover local files.
MCP Servers: Services that provide specific functionalities to AI agents, such as interacting with Google Drive or Notion.
Mnt Skills Folder: A directory where agents can save and reference their self-created skills for persistent storage.
Prompting: The process of providing instructions and context to an AI agent to guide its behavior and task execution.
Tool Calls: The actions an agent takes to utilize its available tools to complete a task.
Traces: Logs that record the execution flow and token consumption of an AI agent.

Building an AI Agent with MCP Code Execution

This video details the step-by-step process of building an AI agent using Anthropic's new MCP code execution method, contrasting it with the older direct MCP approach. The primary goal is to demonstrate how this new technique drastically reduces token consumption while enhancing agent autonomy and flexibility.

Agent Architecture and Setup

The agent constructed in this tutorial is a sales operations agent designed to read meeting transcripts from Google Drive and attach them to a CRM in Notion without passing the transcript content through the agent's own context window. The architecture is straightforward: a single agent equipped with a code execution tool that interfaces with the Google Drive MCP server and the Notion MCP server.
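A minimal sketch of why this architecture keeps the transcript out of the context window. The two functions below are hypothetical stand-ins for generated Google Drive and Notion wrappers (their names and return shapes are assumptions, not the real MCP tools). The point is only that the long text moves between them inside the code, and the agent sees a short summary.

```python
# Hypothetical stand-ins for generated MCP wrappers; the real tool names
# and signatures depend on the MCP servers being used.
def gdrive_get_document(document_id: str) -> str:
    """Pretend to fetch a long meeting transcript from Google Drive."""
    return "Speaker A: hello\n" * 500


def notion_append_text(page_id: str, text: str) -> dict:
    """Pretend to append text to a Notion page and return a small receipt."""
    return {"page_id": page_id, "chars_written": len(text)}


def attach_transcript(document_id: str, page_id: str) -> str:
    # The transcript lives only in this local variable, inside the
    # code execution environment.
    transcript = gdrive_get_document(document_id)
    receipt = notion_append_text(page_id, transcript)
    # Only this short string would be returned to the agent.
    return f"Attached {receipt['chars_written']} characters to page {receipt['page_id']}"


summary = attach_transcript("doc-123", "page-456")
print(summary)
```

The model pays tokens for the few lines of code it writes and for the one-line summary, not for the transcript itself.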

To begin, users are instructed to copy a starter template that includes a new command for the MCP code execution pattern. This template should be set to private visibility and opened in a code editor like Cursor (other AI coding agents such as Claude Code or Codex are also compatible). If not using Cursor, the workflow and command files need to be tagged in the prompt.

Initial Agent Creation

The process starts by instructing the AI coding agent (Cursor in this example) to create a sales operations agent with two essential built-in tools:

IPython Interpreter: Allows the agent to execute Python code.
Persistent Shell Tool: Enables the agent to discover local files and interact with the system.

These tools are crucial for the new MCP code execution method, as they allow the agent to discover and run local files. The video also mentions a PRD command for more sophisticated MVP development, offering greater control over agent structure.

Implementing MCP Code Execution

The core of the new method is the MCP code execution command. This command allows users to add MCP servers to an agent using the new pattern. The recommended approach is to find the specific MCP servers needed on platforms like GitHub and provide their links to the AI agent.

The MCP code execution command provides the AI agent with comprehensive information about the new pattern, including links to relevant blog posts and resources. This enables the agent to reliably implement the pattern as described in Anthropic's blog post.

The AI agent then proceeds to:

Create a server directory containing the code for the MCP servers.
Generate code for individual tools.
Handle authentication if the server uses OAuth.
Create files for individual tools, storing their descriptions and arguments directly in code rather than within the agent's context window. This ensures that only the necessary tool is loaded when needed, further optimizing token usage.
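The one-file-per-tool idea can be sketched as follows. The directory layout (servers/notion/fetch.py), the run function, and the file contents are assumptions made for illustration; the code the coding agent actually generates will differ. Listing file names is cheap, and a tool's description and arguments are read only when that one module is loaded.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical contents of a generated tool file: the description lives in
# the module docstring instead of in the agent's prompt.
TOOL_SOURCE = '''
"""Fetch a Notion page by ID and return a short preview."""

def run(page_id: str) -> str:
    return f"preview of {page_id}"
'''


def list_tools(server_dir: Path) -> list[str]:
    """Cheap discovery: file names only, no descriptions loaded."""
    return sorted(p.stem for p in server_dir.glob("*.py"))


def load_tool(server_dir: Path, name: str):
    """Import a single tool module on demand."""
    spec = importlib.util.spec_from_file_location(name, server_dir / f"{name}.py")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as root:
    notion_dir = Path(root) / "servers" / "notion"
    notion_dir.mkdir(parents=True)
    (notion_dir / "fetch.py").write_text(TOOL_SOURCE)
    (notion_dir / "search.py").write_text(TOOL_SOURCE)

    available = list_tools(notion_dir)    # names of every tool file
    fetch = load_tool(notion_dir, "fetch")  # only this tool is read
    description = fetch.__doc__.strip()
    result = fetch.run("page-456")
    print(available, description, result)
```

With 15 Notion tools, the agent would see 15 short file names up front and read the full definition of only the one it needs.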

In this demonstration, Cursor created 15 tools for the Notion MCP server and 4 for the Google Drive MCP server. While the Notion server was tested successfully, the Google Drive server testing failed due to missing credentials. The video emphasizes the importance of providing all necessary credentials upfront, referencing the readme file of the MCP server for instructions.

Writing Agent Instructions

The next critical step is writing instructions for the agent. The write instructions command is recommended for this purpose. The AI agent will then ask a series of questions to define the agent's context, with the "process and workflow" being the most important.

The suggested workflow for the agent is as follows:

Check available skills: Always check the mnt skills folder for existing skills.
Use existing skill: If a relevant skill is found for a specific task, use it.
Read necessary tool: If no skills are found, read only the tool required for the task.
Combine tools: Combine multiple tools to successfully complete a task.
Suggest new skills: Provide suggestions for new skills to be added.

This workflow, derived from the blog post, can be further refined through effective prompting.
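The skills-first order of this workflow can be illustrated with a small sketch. In the real agent the choice is made by the model following its instructions; the keyword match on file names below is a crude, hypothetical stand-in for that judgment, and the directory and file names are assumptions.

```python
import tempfile
from pathlib import Path


def plan_next_step(task: str, skills_dir: Path, tools_dir: Path) -> str:
    """Decide what to read next: an existing skill first, then a single tool."""
    keywords = set(task.lower().split())
    # Check available skills and use one if its name matches the task.
    for skill in sorted(skills_dir.glob("*.py")):
        if keywords & set(skill.stem.split("_")):
            return f"use skill: {skill.name}"
    # No skill found: read only the tool file the task needs.
    for tool in sorted(tools_dir.glob("*.py")):
        if keywords & set(tool.stem.split("_")):
            return f"read tool: {tool.name}"
    return "no match: ask the user"


with tempfile.TemporaryDirectory() as root:
    skills_dir = Path(root) / "skills"
    tools_dir = Path(root) / "tools"
    skills_dir.mkdir()
    tools_dir.mkdir()
    (skills_dir / "copy_transcript.py").write_text("# saved skill\n")
    (tools_dir / "fetch_page.py").write_text("# generated tool\n")

    with_skill = plan_next_step("copy transcript to notion", skills_dir, tools_dir)
    without_skill = plan_next_step("fetch this page", skills_dir, tools_dir)
    print(with_skill)
    print(without_skill)
```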

Direct MCP Method Comparison

For comparison, the video also sets up a "sales ops direct MCP agent" using the traditional add MCP command. This involves linking the same MCP servers that were added to the new agent.

Testing and Deployment

The newly built agent is tested by running python agency.py in the terminal. A simple test asking "what's on this notion page" is performed. The agent first lists the mnt skills directory, then reads the fetch tool file, and despite encountering import issues, successfully reads the Notion page by performing a fetch tool request and providing a preview.

The agent is then deployed to a platform (referred to as "agency") by pushing changes to GitHub and creating a new agency, providing keys from the .env file.

Performance Comparison: MCP Code Execution vs. Direct MCP

The deployed agents are tested with a task from Anthropic's blog post: copying a transcript from Google Drive and pasting it into a Notion page.

MCP Code Execution Agent:

Successfully reads the transcript from Google Drive and pastes it into Notion.
Proposes creating a new skill for itself, which is then saved in the mnt skills directory. This demonstrates the agent's ability to self-improve and evolve.
When the same task is performed in a new chat, the agent utilizes the saved skill, significantly reducing execution time and token consumption.
Token Consumption (with skill): Approximately 4,000 tokens.
Token Consumption (first attempt, no skill): Approximately 12,000 tokens.
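Saving and reusing a skill might look roughly like the sketch below. The skill body, the save_skill helper, and the mnt/skills path layout are illustrative assumptions; the source only states that the agent writes a skill into its skills directory and picks it up again in a new chat.

```python
import tempfile
from pathlib import Path

# Hypothetical skill the agent might write after its first successful run.
SKILL_SOURCE = '''
def copy_transcript(read_doc, write_page, doc_id, page_id):
    """Copy a document into a page using injected read/write callables."""
    text = read_doc(doc_id)
    write_page(page_id, text)
    return len(text)
'''


def save_skill(skills_dir: Path, name: str, source: str) -> Path:
    """Persist a skill as a Python file so later sessions can find it."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    path = skills_dir / f"{name}.py"
    path.write_text(source)
    return path


with tempfile.TemporaryDirectory() as root:
    skills_dir = Path(root) / "mnt" / "skills"  # assumed layout
    saved = save_skill(skills_dir, "copy_transcript", SKILL_SOURCE)

    # A later session: load the saved file instead of rediscovering the tools.
    namespace = {}
    exec(compile(saved.read_text(), str(saved), "exec"), namespace)
    skill = namespace["copy_transcript"]

    pages = {}  # stand-in for Notion
    written = skill(lambda doc_id: "hello world", pages.__setitem__, "doc-123", "page-456")
    print(written, pages)
```

The second run skips tool discovery entirely, which is consistent with the drop from roughly 12,000 to roughly 4,000 tokens reported above.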

Direct MCP Agent:

Reads the entire transcript into its context using the Google Drive read-file tool.
Writes the transcript into the Notion page by regenerating it token by token, leading to extremely high token consumption.
Token Consumption: 32,000 tokens.

The video highlights that the direct MCP agent consumed a massive 32,000 tokens, with a significant portion being expensive output tokens. In contrast, the MCP code execution agent consumed 12,000 tokens on its first attempt and only 4,000 tokens when using a pre-existing skill. This represents a substantial reduction in token usage, especially when the agent leverages its learned skills.
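The ratios between the reported figures can be checked with a few lines of arithmetic. The token counts are the approximate values quoted in the video; the dictionary keys are just labels chosen for this sketch.

```python
# Approximate token counts reported in the video.
reported = {
    "direct MCP": 32_000,
    "code execution, first attempt": 12_000,
    "code execution, saved skill": 4_000,
}

baseline = reported["direct MCP"]
ratios = {}
for name, used in reported.items():
    ratios[name] = baseline / used
    saved_pct = 100 * (baseline - used) / baseline
    print(f"{name}: {used} tokens, {ratios[name]:.1f}x vs direct MCP, {saved_pct:.1f}% saved")
```

On these numbers the first code execution run uses a little under 3 times fewer tokens than the direct MCP agent, and the run with a saved skill uses 8 times fewer.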

Analysis of Results and Conclusion

The traces, enabled by default in the framework, are analyzed to compare token consumption. The direct MCP agent's 32,000 tokens are deemed "insane," while the MCP code execution agent's 12,000 tokens are significantly less, though still considered improvable with better prompting. The 4,000 tokens used when the agent leveraged its saved skill amount to roughly an eightfold reduction compared with the direct MCP agent.

The presenter concludes that the MCP code execution approach is "extremely powerful" but requires "great responsibility." While intermediate tool outputs saw reduced token consumption, the agent still made unnecessary tool calls, executed code too many times, and read unnecessary files on the first attempt. However, the overall token savings compared to the direct MCP method are substantial, and the ability to self-improve by creating new skills is a key advantage.

Final Verdict: The approach is deemed "ready for production" but necessitates "proper prompting." LLMs are not yet fully trained for this method, so careful instruction on how to use the new pattern is crucial to avoid mistakes. When implemented correctly, the benefits of increased autonomy and flexibility outweigh the costs.

Future Applications and Recommendations

The presenter suggests that this new paradigm, where agents can generate code to perform tasks, is the future. However, it's recommended for more sophisticated general agents like analytics, research, or operations, rather than simple agents like customer support.

For deployment, the presenter points to their platform, stating that infrastructure overhead is a significant downside of this approach, and their platform offers out-of-the-box support for all necessary features. More advanced AI agent templates are promised for the future.