Lessons from Scaling GitHub's Remote MCP Server — Sam Morrow, GitHub

By AI Engineer

Key Concepts

  • MCP (Model Context Protocol): An open standard that enables AI agents to interact with external data and tools.
  • Context Window Management: The challenge of balancing the number of available tools with the limited token capacity of LLMs.
  • Tool Sets: A grouping mechanism for related product tools to prevent agent confusion.
  • OAuth 2.1 & PKCE: Security protocols used to manage authentication and authorization for remote MCP servers.
  • Step-up OAuth: A mechanism allowing agents to request additional permissions (scopes) dynamically during a session rather than failing.
  • Stateless Architecture: A server design where each request initializes a fresh server instance, improving scalability and reliability.
  • Agent Intent: Encoding the purpose of a tool call into the server-side logic to reduce round-trips and improve success rates.

1. Challenges in Scaling MCP Servers

GitHub’s journey with MCP began in April 2025. The primary challenge was the "lethal trifecta" of agent performance: as more tools were added, agents became confused, grew forgetful, and exhausted their context windows.

  • The "More Tools" Problem: Research from LangChain confirmed that simply providing more tools degrades agent performance.
  • User Behavior: Despite implementing "Tool Sets" and dynamic discovery, most users stuck to default settings, highlighting a gap in client-side configuration and the need for better default experiences.
  • Token Efficiency: GitHub achieved a 49% reduction in initial load context by focusing on general-use tools and a 75% reduction in output tokens for specific tools like list pull requests.
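The tool set idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tool set names, the tool descriptions, the default selection, and the character-count cost proxy are all hypothetical and are not taken from GitHub's server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    description: str


# Hypothetical tool sets; the names are illustrative, not GitHub's real catalog.
TOOLSETS = {
    "repos": [Tool("get_file_contents", "Read a file from a repository")],
    "issues": [
        Tool("list_issues", "List issues in a repository"),
        Tool("create_issue", "Open a new issue"),
    ],
    "pull_requests": [Tool("list_pull_requests", "List pull requests")],
    "actions": [Tool("list_workflow_runs", "List CI workflow runs")],
}

# Assumed default: a small general-use subset, since most users never change it.
DEFAULT_TOOLSETS = ("repos", "issues", "pull_requests")


def select_tools(enabled=DEFAULT_TOOLSETS):
    """Return only the tools from the enabled tool sets, in a stable order."""
    unknown = [name for name in enabled if name not in TOOLSETS]
    if unknown:
        raise ValueError(f"unknown tool sets: {unknown}")
    return [tool for name in enabled for tool in TOOLSETS[name]]


def approx_context_cost(tools):
    """Crude proxy for context cost: characters of name plus description."""
    return sum(len(t.name) + len(t.description) for t in tools)


all_tools = select_tools(tuple(TOOLSETS))
default_tools = select_tools()
saving = 1 - approx_context_cost(default_tools) / approx_context_cost(all_tools)
print(f"default exposes {len(default_tools)} of {len(all_tools)} tools")
print(f"approximate context saving: {saving:.0%}")
```

Because most users keep the defaults, the content of `DEFAULT_TOOLSETS` matters more than the configurability itself. A real measurement would count tokens with the target model's tokenizer rather than characters.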

2. Security and Authentication

Security remains a "constant menace" due to the prevalence of long-lived, over-privileged personal access tokens (PATs).

  • Remote HTTP & OAuth: GitHub moved toward OAuth 2.1 with PKCE (Proof Key for Code Exchange) to avoid the risks associated with storing plain-text tokens.
  • Dynamic Client Registration: GitHub rejected this approach due to concerns regarding unbounded database growth and the lack of reliable app identity.
  • Prompt Injection: The speaker acknowledged the risk of prompt injection (e.g., Invariant Labs' research) and noted that while GitHub provides the tools, the security burden is shared across the entire agentic ecosystem.
  • Scope Filtering: GitHub automatically filters tools based on the scopes provided by the user's token, ensuring the agent only sees what it is authorized to access.
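For readers unfamiliar with PKCE, the following is a minimal sketch of the S256 method defined in RFC 7636, using only the Python standard library. The function names are made up for illustration, and nothing here reflects GitHub's actual implementation. The client keeps the verifier secret, sends only the challenge with the authorization request, and later presents the verifier when exchanging the code for a token.

```python
import base64
import hashlib
import secrets


def make_code_verifier() -> str:
    """Generate a high-entropy verifier (RFC 7636 allows 43 to 128 characters)."""
    return secrets.token_urlsafe(64)  # 64 random bytes -> 86 URL-safe characters


def make_code_challenge(verifier: str) -> str:
    """Derive the S256 challenge: BASE64URL(SHA256(verifier)) without padding."""
    digest = hashlib.sha256(verifier.encode("ascii")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")


def server_verifies(stored_challenge: str, presented_verifier: str) -> bool:
    """The check an authorization server performs at the token endpoint."""
    expected = make_code_challenge(presented_verifier)
    return secrets.compare_digest(expected, stored_challenge)


verifier = make_code_verifier()
challenge = make_code_challenge(verifier)
print("verifier accepted:", server_verifies(challenge, verifier))
print("wrong verifier accepted:", server_verifies(challenge, make_code_verifier()))
```

An intercepted authorization code is useless without the verifier, which is why PKCE suits clients that cannot keep a long-lived secret.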

3. Methodologies and Frameworks

  • Stateless Server Design: GitHub runs a stateless server where a new instance is created for every request. This avoids session affinity issues and allows for dynamic tool loading based on user permissions.
  • Evaluation Strategy: Instead of micro-optimizing individual tool descriptions, the team uses comparative testing to ensure tools are called at the right time and do not "fight" each other for the agent's attention.
  • Human-in-the-Loop: The "Insiders" mode allows for features like AI-generated issue editing, where the user reviews the output before final submission, balancing automation with human oversight.
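The stateless design and scope-based tool loading described above might look roughly like the sketch below. Everything in it is an assumption made for illustration: the `ToolSpec` and `RequestServer` classes, the catalog, the scope requirements, and the simplified response shapes are hypothetical. Only the `tools/list` and `tools/call` method names come from the MCP specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    name: str
    required_scopes: frozenset


# Hypothetical catalog; tool names and scope requirements are illustrative.
CATALOG = (
    ToolSpec("get_file_contents", frozenset({"repo"})),
    ToolSpec("create_issue", frozenset({"repo"})),
    ToolSpec("list_notifications", frozenset({"notifications"})),
    ToolSpec("list_gists", frozenset({"gist"})),
)


class RequestServer:
    """Built fresh for one request; holds no state that outlives it."""

    def __init__(self, token_scopes):
        granted = frozenset(token_scopes)
        # Scope filtering: expose a tool only if every scope it needs is granted.
        self.tools = {t.name: t for t in CATALOG if t.required_scopes <= granted}

    def handle(self, method, params=None):
        if method == "tools/list":
            return {"tools": sorted(self.tools)}
        if method == "tools/call":
            name = (params or {}).get("name")
            if name not in self.tools:
                return {"error": f"tool not available: {name}"}
            return {"result": f"called {name}"}
        return {"error": f"unknown method: {method}"}


def handle_http_request(token_scopes, method, params=None):
    """Entry point: no session lookup, so any replica can serve any request."""
    return RequestServer(token_scopes).handle(method, params)


print(handle_http_request(["repo"], "tools/list"))
print(handle_http_request(["repo"], "tools/call", {"name": "list_gists"}))
```

Since the tool list is derived from the token on every request, a change in the user's granted scopes takes effect on the next call without any session to invalidate.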

4. Notable Quotes

  • "More tools don't make better agents; they get confused and forgetful." (Referencing the LangChain research).
  • "It's hard to make configuration easy and secure at the same time."
  • "The utility of agents is in direct conflict with protecting this stuff [data]."

5. Data and Statistics

  • Usage: The server handles approximately 7 million tool calls per week.
  • Community: 126 contributors, 2,300+ issues/PRs, and nearly 4,000 forks.
  • Success Rate: Tool execution success rate is currently over 95%.
  • Efficiency: 75% reduction in output tokens for specific tool calls; 49% reduction in initial context load.
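To put the usage and success-rate figures in perspective, a quick back-of-the-envelope calculation helps. It assumes traffic is spread evenly across the week, which real traffic is not, and it treats "over 95%" as a lower bound on the success rate.

```python
CALLS_PER_WEEK = 7_000_000   # "approximately 7 million" tool calls per week
SUCCESS_RATE_FLOOR = 0.95    # "over 95%", so treated here as a lower bound

SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds

avg_calls_per_second = CALLS_PER_WEEK / SECONDS_PER_WEEK
max_failed_calls_per_week = round(CALLS_PER_WEEK * (1 - SUCCESS_RATE_FLOOR))

print(f"average rate: {avg_calls_per_second:.1f} calls/second")
print(f"failed calls: at most about {max_failed_calls_per_week:,} per week")
```

Under these assumptions the average works out to roughly 11.6 calls per second, and even a success rate above 95% can still mean up to about 350,000 failed tool calls in a week.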

6. Future Outlook

The speaker envisions a future where:

  • Automatic Discovery: Servers will be discovered and configured automatically without user intervention.
  • Compositional Tool Use: Tools will be piped together (e.g., bash-style piping) to create complex workflows.
  • Autonomous Selection: Agents will move toward truly autonomous tool selection, rendering the current manual configuration phase obsolete.

Synthesis

GitHub’s approach to MCP has evolved from a "more is better" philosophy to a focus on context efficiency, security-by-default, and stateless scalability. By moving away from static, over-privileged tokens toward dynamic, scope-aware authentication, and by evaluating tools comparatively rather than tuning individual descriptions in isolation, GitHub has scaled to approximately 7 million tool calls per week. The core takeaway is that the future of agentic systems lies in reducing the cognitive load on the LLM through better tool design and seamless, secure authentication, rather than simply increasing the number of available functions.
