Rubber Duck Thursdays!
By GitHub
Key Concepts
- Rubber Ducking (Agentic): An experimental feature in GitHub Copilot CLI where one AI model acts as a "critique agent" to review the work of another model (e.g., a GPT model reviewing Claude’s output).
- Agentic Engineering: A paradigm shift in software development where AI agents are used to plan, implement, and review code, often orchestrating multiple models to complete complex tasks.
- GitHub MCP (Model Context Protocol) Server: A server implementing the Model Context Protocol that lets AI agents interact with GitHub resources (PRs, issues, code) to perform tasks like automated code reviews and secret scanning.
- Maintainer Month: A celebration in May dedicated to highlighting and supporting open-source maintainers through various community events.
- Autopilot Mode: A mode in Copilot CLI that allows for autonomous task execution, toggled via Shift + Tab.
1. Main Topics and Key Points
- Rubber Duck Experimental Feature: The host demonstrated how to enable experimental features in the Copilot CLI using /experimental on. The "Rubber Duck" agent serves as a second pair of eyes, identifying gaps in logic or implementation plans. For instance, if Claude Opus 4.7 generates a plan, a GPT 5.5 model can be invoked to critique it, ensuring higher reliability.
- GitHub MCP Server & Secret Scanning: The GitHub MCP server is now generally available for secret scanning. It proactively detects exposed credentials (like personal access tokens) before code is committed or pushed, preventing security leaks.
- Enterprise Managed Plugins: Organizations can now manage which plugins are available to their developers in Copilot CLI, ensuring that teams use vetted, secure tools for agentic workflows.
- Agentic PR Review Workflow: The host showcased a powerful workflow:
- Copilot CLI implements a feature.
- The code is pushed to a PR.
- Copilot performs an automated code review.
- The developer instructs the agent to read the PR comments and automatically apply fixes to the local codebase.
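The last step of this workflow, turning PR comments into fixes, can be pictured with a small sketch. It is only an illustration, not how Copilot CLI or the GitHub MCP server work internally: the `ReviewComment` type, the `build_fix_instructions` helper, and the sample data are all hypothetical. It groups review comments by file so that an agent could be handed one focused instruction block per file.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewComment:
    """Hypothetical shape for one PR review comment (not a GitHub API type)."""
    path: str
    line: int
    body: str


def build_fix_instructions(comments):
    """Group review comments by file and render one instruction block per file."""
    by_file = defaultdict(list)
    for comment in comments:
        by_file[comment.path].append(comment)

    blocks = []
    for path in sorted(by_file):
        lines = [f"File: {path}"]
        # Order comments top to bottom within each file.
        for c in sorted(by_file[path], key=lambda c: c.line):
            lines.append(f"  line {c.line}: {c.body}")
        blocks.append("\n".join(lines))
    return "\n\n".join(blocks)


# Made-up sample data, for illustration only.
sample_comments = [
    ReviewComment("app/audio.py", 42, "Handle the empty-buffer case."),
    ReviewComment("app/server.py", 10, "Validate the upload size."),
    ReviewComment("app/audio.py", 7, "Remove the unused import."),
]

if __name__ == "__main__":
    print(build_fix_instructions(sample_comments))
```

In a real session the comments would come from the PR itself and the agent would apply the edits; the sketch only shows how the feedback might be organized first.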
2. Real-World Applications
- Automated Code Reviews: Instead of manual back-and-forth, developers can use Copilot to review PRs and then use the agent to "address comments" by automatically refactoring code based on the review feedback.
- Security Compliance: Using the MCP server to scan for secrets in real-time during the development process, rather than relying on post-commit hooks.
- Full-Stack Development: The host mentioned building a complete voice-transcription application in two hours using Copilot CLI, highlighting the speed of modern agentic development.
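To make the Security Compliance point concrete, here is a minimal sketch of checking text for an exposed token before it is committed. It is not GitHub's secret scanning and does not call the MCP server. It assumes that classic GitHub personal access tokens look like `ghp_` followed by 36 alphanumeric characters, and the function name and sample data are hypothetical.

```python
import re

# Assumption: classic GitHub personal access tokens are "ghp_" + 36 alphanumerics.
# Real secret scanning covers many more token types and uses stronger validation.
TOKEN_PATTERN = re.compile(r"\bghp_[A-Za-z0-9]{36}\b")


def find_exposed_tokens(text):
    """Return (line_number, redacted_token) pairs for every token-like match."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for match in TOKEN_PATTERN.finditer(line):
            token = match.group(0)
            # Redact so the report itself does not leak the secret.
            findings.append((number, token[:8] + "..."))
    return findings


# Made-up token and file contents, for illustration only.
fake_token = "ghp_" + "A1b2" * 9
sample_text = f'API_URL = "https://example.test"\nTOKEN = "{fake_token}"\n'

if __name__ == "__main__":
    for line_number, redacted in find_exposed_tokens(sample_text):
        print(f"line {line_number}: possible token {redacted}")
```

The value of running a check like this during development, as the stream describes, is that the credential is caught before it ever reaches the repository history.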
3. Methodologies and Frameworks
- Plan Mode vs. Autopilot Mode:
- Plan Mode: Used to generate a roadmap for a feature.
- Autopilot Mode: Used for execution.
- Switching: Users can toggle between these modes using Shift + Tab in the terminal.
- The "Critique" Loop: The methodology involves:
- Orchestration: Using a primary model (e.g., Claude) to build.
- Critique: Invoking a secondary model (e.g., GPT) via the Rubber Duck agent to find gaps.
- Refinement: Applying the critique to improve the final output.
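The three steps above can be sketched as a small loop. This is a simplified illustration under stated assumptions, not the Rubber Duck implementation: the `critique_loop` function is hypothetical, and the builder and critic are stub functions standing in for real model calls.

```python
from typing import Callable

# A "model" here is just any function from prompt text to response text.
Model = Callable[[str], str]


def critique_loop(task: str, builder: Model, critic: Model, max_rounds: int = 2) -> str:
    """Build a draft, ask a second model to critique it, and refine until clean."""
    draft = builder(task)                      # Orchestration: primary model builds
    for _ in range(max_rounds):
        feedback = critic(draft)               # Critique: secondary model finds gaps
        if not feedback.strip():
            break                              # No gaps reported, so stop early
        draft = builder(                       # Refinement: apply the critique
            f"{task}\n\nPrevious draft:\n{draft}\n\nAddress this critique:\n{feedback}"
        )
    return draft


# Stub models, for illustration only; real ones would call an LLM.
def stub_builder(prompt: str) -> str:
    if "Address this critique" in prompt:
        return "plan: parse input, validate input, write output"
    return "plan: parse input, write output"


def stub_critic(draft: str) -> str:
    return "" if "validate" in draft else "Missing an input validation step."


if __name__ == "__main__":
    print(critique_loop("Design an import job", stub_builder, stub_critic))
```

Capping the loop with `max_rounds` is a design choice in this sketch so that two models that keep disagreeing cannot run indefinitely.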
4. Key Arguments and Perspectives
- Multi-Model Flexibility: The host argues against using a single model, emphasizing the value of "vacillating" between different models (GPT, Claude, local models) within a single interface (Copilot CLI or VS Code) because different models excel at different tasks.
- The "Agentic" Future: The host asserts that agentic development is not a passing trend but a fundamental shift in how software will be built, noting that as models become more complex, the need for "AI as a judge" (LLM-as-a-judge) becomes critical for verification.
5. Notable Quotes
- "It’s like having a teammate review your PR... having a GPT 5.5 review Opus 4.7’s work is so incredible." — Cadia, on the value of the Rubber Duck agent.
- "LLM as a judge will fundamentally change the way we build our workflow pipelines." — Sheldon (via chat), echoed by the host.
- "It’s not just about the code; it’s managing everything else that surrounds the build—PRs, code reviews, and dependency upgrades." — Cadia, on the utility of the GitHub MCP server.
6. Synthesis and Conclusion
The stream highlights the rapid evolution of agentic engineering, where the focus is shifting from simple code generation to complex, multi-agent workflows. By utilizing tools like the GitHub MCP server and the Rubber Duck critique agent, developers can create self-correcting pipelines that handle everything from security scanning to PR remediation. The main takeaway is that while AI coding tools are moving at an "insane" pace, the ability to orchestrate multiple models and verify their work through automated critique is the current frontier for professional software development.