Rubber Duck Thursdays - Let's build with custom agents
By GitHub
Key Concepts
- Custom Agents: AI entities that can be configured to perform specific tasks, previously known as custom chat modes.
- Custom Instructions: User-defined rules and context provided to AI agents to guide their behavior and output.
- Model Context Protocol (MCP) Server: A server that exposes tools to AI agents over the Model Context Protocol. In this project, it lets an agent play games hosted by a web application.
- GitHub Copilot: An AI-powered coding assistant that suggests code and helps with development tasks.
- Bring Your Own Key (BYOK): An enterprise feature for GitHub Copilot allowing users to connect their own API keys from supported model providers.
- Linter Integration: The ability of Copilot code review to run static analysis tools like ESLint and PMD to enhance code quality.
- Claude Opus 4.5: Anthropic's latest model, now in public preview for GitHub Copilot, offering improved performance and efficiency.
- Vibe Coding: Using natural language to code and build projects, often in collaboration with AI tools.
- Spec-Driven Development: A development approach where AI is used to define requirements and scope before implementation.
- Meta-Instructions: Instructions designed to guide the AI in writing or modifying other instructions.
GitHub Changelog and Updates
The session began with a review of the GitHub changelog from the past week, highlighting several key updates:
- Enterprise Bring Your Own Key (BYOK) for GitHub Copilot: Now in public preview, this feature allows enterprises to connect API keys from providers like Anthropic, Microsoft Foundry, OpenAI, and xAI. Usage through BYOK is billed directly by the provider and does not count against GitHub Copilot request quotas. It requires the OpenAI Completions API, not the Responses API.
- Linter Integration with Copilot Code Review: Copilot code review now supports running static analysis tools. This includes ESLint for JavaScript/TypeScript and PMD for Java, Apex, and other supported languages. Tools like CodeQL are also integrated.
- GitHub Actions Cache Size Increase: The cache size can now exceed 10 GB, useful for handling larger artifacts.
- Quick Access to Pull Request Description: A new information button in the "Files Changed" preview on PR pages provides quick access to the description.
- Claude Opus 4.5 in Public Preview: Anthropic's latest model is rolling out to GitHub Copilot (Pro, Pro+, Business, Enterprise). It has been shown to surpass internal coding benchmarks while reducing token usage. It will be priced at a promotional 1x multiplier until December 5th.
- Custom Labels for Dependabot: Custom label configuration for Dependabot on self-hosted and larger GitHub-hosted Actions runners is now generally available at the organization level.
- Code Scanning Default Setup Bypass: Code scanning default setup now runs even if organization policies restrict which GitHub Action workflows can run, ensuring security coverage.
- Secrets in Unlisted GitHub Gists: These are now reported to secret scanning partners.
- Secret Scanning Alert Assignees and Security Campaigns: Both are now generally available.
- Copilot Agent Sessions from External Apps: These are now available in GitHub mobile for Android.
Project: MCP Server for Web Games
The core of the session focused on enhancing the use of GitHub Copilot within a Next.js web application that serves as an MCP server for playing games like Tic-Tac-Toe and Rock, Paper, Scissors against an AI agent.
- Previous Work: The project previously established a grid-based interface for playing games and introduced a 3D view for Tic-Tac-Toe. The MCP server makes API calls to the backend app, handling game logic and turn management.
- Current Goal: To improve the use of Copilot by updating custom instructions and exploring custom agents.
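The game logic and turn management described above can be pictured with a small sketch. This is not the project's actual code: the `GameState` shape, `newGame`, and `applyMove` are hypothetical names, and the real backend may structure things quite differently. It only illustrates the kind of checks a backend would plausibly make before accepting a move from a player or an agent.

```typescript
type Mark = "X" | "O";
type Cell = Mark | null;

// Hypothetical state shape; the real app's model is not shown in the session summary.
interface GameState {
  board: Cell[]; // 9 cells, row by row
  next: Mark; // whose turn it is
  winner: Mark | "draw" | null;
}

// All eight winning lines on a 3x3 board.
const LINES: ReadonlyArray<readonly [number, number, number]> = [
  [0, 1, 2], [3, 4, 5], [6, 7, 8],
  [0, 3, 6], [1, 4, 7], [2, 5, 8],
  [0, 4, 8], [2, 4, 6],
];

function newGame(): GameState {
  return { board: new Array<Cell>(9).fill(null), next: "X", winner: null };
}

function findWinner(board: Cell[]): Mark | "draw" | null {
  for (const [a, b, c] of LINES) {
    const mark = board[a];
    if (mark !== null && mark === board[b] && mark === board[c]) return mark;
  }
  // No line completed: a full board is a draw, otherwise play continues.
  return board.every((cell) => cell !== null) ? "draw" : null;
}

// Turn management: reject moves after the game ends, out of turn, or on a taken cell.
function applyMove(state: GameState, player: Mark, index: number): GameState {
  if (state.winner !== null) throw new Error("game is over");
  if (player !== state.next) throw new Error("not this player's turn");
  if (index < 0 || index > 8 || state.board[index] !== null) {
    throw new Error("invalid cell");
  }
  const board = state.board.slice();
  board[index] = player;
  return { board, next: player === "X" ? "O" : "X", winner: findWinner(board) };
}
```

Returning a new state instead of mutating the old one keeps each move easy to validate and test, which matters when the caller is an AI agent that may send an out-of-turn or invalid move.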
Enhancing Custom Instructions
The session emphasized the importance of providing context to Copilot through custom instructions.
- Review and Update: The existing copilot-instructions.md file was reviewed. It was noted that it lacked specific context about the project's tech stack.
- Copilot's Recommendations: Copilot was used to analyze and suggest improvements to the instructions file, focusing on:
- Adding a clear purpose section.
- Using short, imperative bullet points instead of narrative paragraphs.
- Including code standards.
- Reorganizing with clearer headings.
- Moving language/framework-specific guidelines to path-specific instruction files.
- Keeping instructions concise and focused on repository-wide concerns.
- Adding concrete code examples.
- Refactoring Instructions: Copilot was further prompted to refactor instructions into separate files, specifically separating testing guidelines into a dedicated file (testing.md) and keeping game-specific instructions in the main copilot-instructions.md.
- Meta-Instructions for Instructions: A "meta" instruction file (meta-instructions.md) was created to guide Copilot in writing and modifying Copilot instruction files, targeting repository-wide and path-specific concerns.
Exploring Custom Agents
The session delved into the creation and utilization of custom agents.
- Custom Agents Overview: These are configurable AI entities that can perform specific tasks, offering different modes like "ask," "edit," and "plan."
- Creating a Custom Agent for Game Playing:
- Planning Phase: The "plan" agent was used to outline the requirements for a game-player agent. This involved defining its responsibilities, workflow, stopping rules, and interaction with MCP server tools. Key considerations included handling turns, timeouts, and game-specific guidance.
- Implementation Phase: The plan was then handed off to another agent (using Claude Opus) to implement the game-player agent. This involved creating necessary directories and code files.
- Testing and Refinement: The implemented agent was tested. An issue was identified where the agent made a chat response after a tool call, potentially superseding the intended workflow. This highlighted the need for explicit instructions to avoid such behavior. The agent was also programmed to make random moves for "easy" mode, and a tip about not giving tips was added.
- Agent vs. Tools: It was clarified that custom agents can use tools, but they are not the same. Agents define behavior and workflows, while tools are specific functions or capabilities that agents can access.
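The agent/tool distinction, together with the "easy" mode behavior of making random moves, can be sketched roughly as follows. Everything named here is an assumption for illustration: `GameTools`, `createLocalTools`, `pickRandomMove`, and `playEasyTurn` are hypothetical and are not the project's MCP tool names or agent definition. In the session the agent's behavior was written as instructions rather than code, so this shows only the shape of the idea: tools are narrow capabilities, and the agent is the workflow that decides when to call them.

```typescript
type Piece = "X" | "O";
type Square = Piece | null;

// Tools: narrow capabilities the agent is allowed to call.
// These names are hypothetical, not the project's actual MCP tool names.
interface GameTools {
  getBoard(): Square[];
  makeMove(index: number, piece: Piece): void;
}

// In-memory stand-in for the MCP server, for illustration only.
function createLocalTools(board: Square[]): GameTools {
  return {
    getBoard: () => board.slice(),
    makeMove: (index, piece) => {
      if (index < 0 || index >= board.length || board[index] !== null) {
        throw new Error("invalid move");
      }
      board[index] = piece;
    },
  };
}

// Agent behavior for "easy" difficulty: choose uniformly among the empty cells.
function pickRandomMove(
  board: Square[],
  random: () => number = Math.random,
): number | null {
  const empty: number[] = [];
  board.forEach((cell, i) => {
    if (cell === null) empty.push(i);
  });
  if (empty.length === 0) return null; // stopping rule: nothing left to play
  return empty[Math.floor(random() * empty.length)];
}

// One agent turn: read state, decide, act, then stop without further output.
function playEasyTurn(
  tools: GameTools,
  piece: Piece,
  random: () => number = Math.random,
): number | null {
  const move = pickRandomMove(tools.getBoard(), random);
  if (move !== null) tools.makeMove(move, piece);
  return move;
}
```

Passing the random source as a parameter is a design choice made here so the behavior can be tested deterministically.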
Key Arguments and Perspectives
- Context is King: The recurring theme was the critical importance of providing sufficient context to AI models through custom instructions and agent configurations. This directly impacts the quality and relevance of the generated code and responses.
- AI as a Co-Pilot: The analogy of the user being the "pilot" and AI being the "co-pilot" was emphasized. This underscores the need for human oversight, review, and control over AI-generated output.
- Fundamentals Remain Crucial: Despite the advancements in AI coding tools, the importance of fundamental software development practices (builds, testing, linting, architecture, patterns, documentation) was repeatedly stressed. These fundamentals are essential for ensuring code quality and preventing reliance on AI for core logic.
- Iterative Development with AI: The process demonstrated an iterative approach to working with AI, where plans are refined, and implementations are reviewed and adjusted based on feedback and observed behavior.
- Vibe Coding vs. Spec-Driven Development: The session touched upon "vibe coding" (using natural language to code) and contrasted it with a more "spec-driven" approach where AI helps define detailed requirements before implementation. Both have their place, but the latter offers more structured control.
Technical Terms and Concepts Explained
- MCP Server: A server that implements the Model Context Protocol, allowing AI agents to interact with applications, often for tasks like playing games or controlling workflows.
- API Keys: Credentials used to authenticate and authorize access to an API.
- Static Analysis Tools: Tools that analyze code without executing it to find potential errors, bugs, and style issues (e.g., ESLint, PMD).
- Token Usage: In the context of LLMs, tokens are units of text that the model processes. Reducing token usage can lead to cost savings and faster processing.
- Monorepo: A software development strategy where code for many projects is stored in the same repository.
- Glob: A pattern used to match file names, often used in configuration files to specify sets of files (e.g., *.md for all markdown files).
- Front Matter: Metadata included at the beginning of a markdown file, typically in YAML format, providing information about the content.
- Tool Calls: When an AI agent needs to perform an action that is outside its direct language model capabilities, it can "call" a tool (e.g., an API function) to execute that action.
- Stopping Rules: Conditions defined for an AI agent that determine when it should cease its current task or execution.
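To make the glob entry above concrete, here is a deliberately simplified matcher. It is a sketch under stated assumptions, not the matcher that Copilot or any particular tool uses: `globToRegExp` and `matchesGlob` are hypothetical helpers, and only `*`, `?`, and `**` are handled. Real glob libraries support more syntax, such as brace expansion and character classes.

```typescript
// Convert a simplified glob into a regular expression.
// Supported: "*" (any run of characters within one path segment),
// "?" (one character within a segment), "**" (any run, including "/").
function globToRegExp(glob: string): RegExp {
  let source = "";
  for (let i = 0; i < glob.length; i++) {
    const ch = glob[i];
    if (ch === "*") {
      if (glob[i + 1] === "*") {
        i++; // consume the second "*"
        if (glob[i + 1] === "/") {
          i++; // consume the "/" so "**/" can also match zero directories
          source += "(?:.*/)?";
        } else {
          source += ".*";
        }
      } else {
        source += "[^/]*";
      }
    } else if (ch === "?") {
      source += "[^/]";
    } else {
      // Escape characters that are special in regular expressions.
      source += ch.replace(/[.+^${}()|[\]\\]/g, "\\$&");
    }
  }
  return new RegExp("^" + source + "$");
}

function matchesGlob(glob: string, path: string): boolean {
  return globToRegExp(glob).test(path);
}
```

With this sketch, `matchesGlob("*.md", "README.md")` is true while `matchesGlob("*.md", "docs/guide.md")` is false, because a single `*` does not cross a directory separator. A pattern such as `**/*.test.ts` matches test files at any depth.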
Logical Connections Between Sections
The session flowed logically from reviewing general updates (GitHub changelog) to applying those updates and concepts to a specific project. The enhancement of custom instructions directly paved the way for more effective custom agent creation. The exploration of meta-instructions demonstrated a deeper level of AI interaction, where the AI helps refine the very rules that govern its behavior. The game-playing agent example served as a practical, albeit slightly buggy, demonstration of how custom agents and tools can be integrated to create interactive experiences.
Data, Research Findings, or Statistics
- Claude Opus 4.5 Performance: Mentioned as surpassing internal coding benchmarks while cutting token usage in half.
- GitHub Copilot Usage: While not explicitly stated with figures, the session implies widespread use and ongoing development of its capabilities.
Synthesis/Conclusion
This session provided a comprehensive look at leveraging GitHub Copilot's advanced features, particularly custom instructions and custom agents, to enhance development workflows and build interactive applications. The presenter emphasized the importance of context, fundamental coding practices, and iterative refinement when working with AI. The practical demonstration of creating a game-playing agent, despite encountering some debugging challenges, illustrated the potential for specialized AI agents to automate complex tasks and contribute to project development. The session concluded with a forward-looking perspective, suggesting continued exploration of these capabilities in future streams.