Jueves de Quack - GitHub Copilot in the UBB era: context, modes, and budget

By GitHub

AI Development Tools, LLM Cost Optimization, GitHub Copilot Management

Key Concepts

Usage-Based Billing (UBB): The new billing model for GitHub Copilot (Business/Enterprise), which moves from credit-based to consumption-based billing starting June 1st.
Consumption Drivers: The four main factors that drive cost: input prompts, model output length, context overhead (always-on instructions and tabs), and tool/agent overhead.
Agentic Modes: The three operational modes of Copilot: Ask (question/answer), Edit (code modification), and Agent (autonomous task execution).
Auto-Mode: A smart routing system that selects the most efficient model for a specific task, balancing performance and cost.
Context Hygiene: The practice of optimizing "always-on" instructions and reducing unnecessary context bloat to save tokens.
Deterministic Tools: Using CLI hooks and specific skills to perform repetitive tasks efficiently without triggering expensive LLM reasoning cycles.

1. The Shift to Usage-Based Billing (UBB)

Starting June 1st, GitHub Copilot will move away from a flat credit system to a consumption-based model. Every interaction will contribute to the final cost, including the prompt, the model's output, the context provided, and the tools invoked. The speaker emphasizes that this is not a reason to stop using Copilot but a call to shift from a "honeymoon phase" of unlimited usage to a "smart usage" mindset.

2. Key Drivers of Consumption

Input Prompts: Includes the user's text, attached files, chat history, and system instructions.
Model Output: Verbose explanations increase costs. Users are encouraged to request only the necessary code rather than lengthy conversational responses.
Context Overhead: "Always-on" instructions (e.g., copilot-instructions.md) are sent with every request. If these files are bloated with irrelevant information, they waste tokens on every interaction.
Tool/Agent Overhead: Using Agent mode for simple tasks is inefficient. Agents perform planning, reasoning, and multiple tool calls, which are significantly more expensive than simple Edit or Ask interactions.
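
As a rough illustration of how the four drivers add up, the sketch below estimates the cost of one request. The per-token prices and the token counts are hypothetical placeholders, not GitHub's actual rates, and treating tool overhead as extra input tokens is a simplification.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens; real rates depend on model and plan.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00


@dataclass
class Request:
    prompt_tokens: int   # user text, attached files, chat history
    context_tokens: int  # always-on instructions and similar overhead
    tool_tokens: int     # tool definitions and tool-call results (agent overhead)
    output_tokens: int   # the model's response


def estimate_cost(req: Request) -> float:
    """Return an estimated cost in USD for a single request."""
    input_tokens = req.prompt_tokens + req.context_tokens + req.tool_tokens
    total = input_tokens * INPUT_PRICE_PER_M + req.output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Same question, asked in Ask mode versus handed to an agent (made-up numbers).
ask = Request(prompt_tokens=200, context_tokens=1_500, tool_tokens=0, output_tokens=300)
agent = Request(prompt_tokens=200, context_tokens=1_500, tool_tokens=20_000, output_tokens=4_000)

print(f"ask:   ${estimate_cost(ask):.4f}")
print(f"agent: ${estimate_cost(agent):.4f}")
```

With these made-up numbers the agent request costs roughly an order of magnitude more than the Ask request, even though the user typed the same prompt.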

3. Methodologies for Optimization

Adopt a "Neanderthal" Communication Style: Be direct and concise. Avoid conversational filler (e.g., "How are you?") when prompting the model, as it consumes tokens unnecessarily.
Strategic Mode Selection:
- Ask: Best for quick questions.
- Edit: Best for specific code changes where the user knows the objective.
- Agent: Reserved only for complex tasks requiring repository navigation and multi-step reasoning.
Leverage "Auto" Mode: Trust the system's routing intelligence. It is designed to select the most cost-effective and capable model for the specific task, preventing the unnecessary use of premium models (like Opus) for simple tasks.
Context Scoping: Instead of global instructions, use scoped instructions that only trigger when relevant files are accessed.
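
The effect of scoping can be sketched as follows. This is only a model of the idea, not Copilot's actual matching logic: the patterns, the token counts, and the `instruction_tokens` helper are all hypothetical.

```python
from fnmatch import fnmatch

# Hypothetical instruction sets: (file pattern, token count).
# The "*" pattern stands in for an always-on, global instruction file.
INSTRUCTIONS = [
    ("*", 400),        # short global rules
    ("*.sql", 1_200),  # database conventions
    ("*.tsx", 900),    # frontend conventions
]


def instruction_tokens(file_path: str, instructions=INSTRUCTIONS) -> int:
    """Sum the tokens of every instruction set whose pattern matches the file."""
    return sum(tokens for pattern, tokens in instructions if fnmatch(file_path, pattern))


# If everything lived in one global file, every request would carry all of it.
all_global = sum(tokens for _, tokens in INSTRUCTIONS)

# With scoping, a request about a .tsx file carries only the matching sets.
scoped = instruction_tokens("src/components/Button.tsx")

print(f"global: {all_global} tokens per request, scoped: {scoped} tokens per request")
```

Under these assumptions, the database conventions are never sent while the user works on frontend files, so the saving repeats on every request.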

4. Technical Frameworks: Hooks, Skills, and MCPs

Model Context Protocol (MCP): While powerful, MCP servers contribute to costs. Only keep necessary servers active.
GitHub Copilot Hooks: Hooks run deterministic, command-line-style actions. They let you automate repetitive tasks without triggering a full LLM reasoning cycle, which avoids the token cost of that cycle.
Chronicle: A CLI tool recommended for auditing interaction patterns. It helps identify repetitive tasks that should be converted into custom Skills or Hooks.
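
The kind of audit described above can be approximated with a few lines of code. The sketch below does not use Chronicle, whose interface and output format are not described here. It assumes a hypothetical list of past prompts and flags the ones that recur often enough to be worth turning into a Skill or Hook.

```python
from collections import Counter


def repeated_prompts(prompts: list[str], min_count: int = 3) -> list[tuple[str, int]]:
    """Return normalized prompts seen at least min_count times, most frequent first."""
    # Normalize case and whitespace so trivial variations count as the same prompt.
    normalized = (" ".join(p.lower().split()) for p in prompts)
    counts = Counter(normalized)
    return [(prompt, n) for prompt, n in counts.most_common() if n >= min_count]


# Hypothetical prompt history; a real audit would read it from a tool's export.
history = [
    "Run the linter",
    "run the  linter",
    "Explain this stack trace",
    "Run the linter",
    "Bump the version and update the changelog",
]

print(repeated_prompts(history))
```

A prompt such as "run the linter" always produces the same action, so it is a natural candidate for a deterministic hook instead of a model call.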

5. Administrative Insights (Enterprise)

Audit Usage: Admins should use the provided billing reports (CSV exports) to identify "power users" who may be over-utilizing premium models.
Education over Restriction: Rather than cutting access, use data to educate teams on token consumption and efficient prompting.
Budgeting: Admins can set budget limits per user or team to prevent runaway costs.
Transition Credits: Enterprise customers are encouraged to contact their sales representatives to understand the transition credits available for the first few months after the June 1st cutoff.
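
A minimal sketch of the audit and budgeting steps, assuming a CSV export with hypothetical `user`, `model`, and `cost_usd` columns. Real billing reports will use different column names, and the users, amounts, and budget below are invented.

```python
import csv
import io
from collections import defaultdict

# Hypothetical export; the columns and values are placeholders for illustration.
SAMPLE_CSV = """user,model,cost_usd
ana,premium,42.50
ana,standard,3.10
luis,standard,5.25
marta,premium,18.00
"""


def spend_by_user(csv_text: str) -> dict[str, float]:
    """Total the cost column per user."""
    totals: dict[str, float] = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["user"]] += float(row["cost_usd"])
    return dict(totals)


def over_budget(totals: dict[str, float], budget_usd: float) -> list[str]:
    """Return the users whose total spend exceeds the budget, sorted by name."""
    return sorted(user for user, total in totals.items() if total > budget_usd)


totals = spend_by_user(SAMPLE_CSV)
print(over_budget(totals, budget_usd=20.0))
```

In keeping with the "education over restriction" point, the resulting list is better read as a set of people to talk to about prompting habits than as a list of accounts to cut off.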

6. Notable Quotes

"No se trata tanto de enfocarnos en crear presupuestos... nos toca más que todo enfocarnos en cambiar nuestros hábitos." (It's not just about creating budgets; it's about changing our habits.)
"El prompt más corto y más vago a veces es el más costoso y los prompts específicos... pueden ser mucho más económicos." (The shortest, vaguest prompt is sometimes the most expensive, while specific prompts can be much more economical.)

7. Synthesis and Conclusion

The transition to usage-based billing requires a fundamental change in how developers interact with AI. By auditing current instructions, adopting a more concise prompting style, and using "Auto" mode and deterministic tools (Hooks/Skills), users can maintain high productivity while keeping costs under control. The main takeaway is to treat AI interactions as a finite resource, prioritizing efficiency and precision over broad, context-heavy requests.
