Understanding and Using AI Skills

By John Savill's Technical Training

Key Concepts

  • AI Skills: Modular, instruction-based procedures that define how an AI should perform specific tasks.
  • Progressive Disclosure: An information architecture pattern where the AI only accesses detailed instructions when necessary, rather than loading all available knowledge at once.
  • MCP (Model Context Protocol) Servers: Infrastructure that exposes tools, resources, and knowledge to an AI application, letting it act on external systems and data.
  • Harness: The application code that manages the interaction between the Large Language Model (LLM), the user, and external tools.
  • Token Optimization: The practice of minimizing input tokens by sending the LLM only essential metadata (names and descriptions) until a task requires the full details.

1. The Framework of AI Skills

The speaker introduces "AI Skills" as a structured way to provide LLMs with procedural knowledge. Much like a human using a reference card to cut down a tree, an AI does not need to "memorize" every possible procedure. Instead, it maintains a catalog of available skills.

  • Structure of a Skill: Each skill resides in its own subdirectory containing a skill.md file.
    • Metadata: Includes a name and description (used for initial discovery).
    • Body: Contains the step-by-step instructions, formatting requirements, and constraints (e.g., "never invent videos or dates").
    • Optional Assets: Can include scripts, templates, or documentation.
  • The Role of MCP Servers: While a skill provides the "how-to" (the procedure), the MCP server provides the "tools" (the capability to execute). For example, a skill might instruct the AI to fetch data, while the MCP server provides the actual fetch_url tool to perform the network request.
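
The skill layout described above can be sketched in Python. This is a minimal illustration, not the speaker's implementation: it assumes the metadata sits between two `---` lines at the top of skill.md as simple `key: value` pairs, and the names `Skill`, `parse_skill`, `load_catalog`, and the example skill text are all hypothetical.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass
class Skill:
    name: str
    description: str
    body: str


def parse_skill(text: str) -> Skill:
    """Split a skill file into metadata (name, description) and body.

    Assumption: metadata is a block of 'key: value' lines delimited by
    '---' at the top of the file; everything after it is the body.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("expected a metadata block starting with '---'")
    end = next(
        (i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"),
        None,
    )
    if end is None:
        raise ValueError("metadata block is not closed with '---'")
    meta = {}
    for line in lines[1:end]:
        key, _, value = line.partition(":")
        meta[key.strip()] = value.strip()
    body = "\n".join(lines[end + 1:]).strip()
    return Skill(meta["name"], meta["description"], body)


def load_catalog(root: Path) -> dict[str, Skill]:
    """Load every <root>/<skill-dir>/skill.md into a name -> Skill mapping."""
    skills = (parse_skill(p.read_text(encoding="utf-8"))
              for p in sorted(root.glob("*/skill.md")))
    return {skill.name: skill for skill in skills}


# Hypothetical skill file used only to exercise the parser.
EXAMPLE = """---
name: latest-videos
description: List the most recent videos from a YouTube channel.
---
1. Fetch the channel's RSS feed.
2. Sort entries by publication date, newest first.
3. Never invent videos or dates.
"""

skill = parse_skill(EXAMPLE)
print(skill.name, "-", skill.description)
```

Keeping the metadata separate from the body is what lets a harness hand the model only `name` and `description` at first and hold the body back until the skill is selected.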

2. Step-by-Step Execution Process

The interaction between the AI application, the LLM, and the skills follows a specific logical flow:

  1. Initial Request: The user submits a prompt (e.g., "Show me my latest YouTube videos").
  2. Discovery: The LLM asks what skills are available. The application returns only the names and descriptions of all skills.
  3. Selection: The LLM identifies the relevant skill and requests the full "body" (detailed instructions) for that specific skill.
  4. Tool Invocation: The LLM reads the instructions, determines that it needs external data, and asks the application to run a specific tool (e.g., an MCP-provided URL fetcher).
  5. Execution & Synthesis: The application executes the tool, returns the data to the LLM, and the LLM generates the final output based on the skill's formatting constraints.
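
The five steps above can be sketched as a small harness loop. This is an illustration under stated assumptions rather than a real integration: `scripted_llm` is a hypothetical stand-in that replays the decisions a model might make, `SKILLS` and `TOOLS` are made-up in-memory registries, and the action names (`list_skills`, `load_skill`, `call_tool`, `answer`) are invented for this sketch. Only `fetch_url` comes from the summary, and here it is a stub that makes no network request.

```python
from typing import Callable

# Hypothetical skill catalog: name -> (description, body).
SKILLS = {
    "latest-videos": (
        "List the most recent videos from a YouTube channel.",
        "Fetch the RSS feed with fetch_url, then output a table. "
        "Never invent videos or dates.",
    ),
    "tree-felling": (
        "Procedure for cutting down a tree safely.",
        "Check the lean, clear an escape path, then make the notch cut.",
    ),
}

# Hypothetical tool registry standing in for MCP-provided tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "fetch_url": lambda url: f"<feed from {url}>",  # stub, no network access
}


def scripted_llm(context: list[tuple[str, str]]) -> dict:
    """Stand-in for a real model: choose the next action from what it has seen."""
    seen = {kind: content for kind, content in context}
    if "catalog" not in seen:
        return {"action": "list_skills"}
    if "skill_body" not in seen:
        return {"action": "load_skill", "name": "latest-videos"}
    if "tool_result" not in seen:
        return {"action": "call_tool", "tool": "fetch_url",
                "arg": "https://example.com/feed.xml"}
    return {"action": "answer", "text": f"Summary built from {seen['tool_result']}"}


def run(prompt: str) -> tuple[str, list[str]]:
    """Drive the request through discovery, selection, tool use, and synthesis."""
    context = [("user", prompt)]
    trace = []
    while True:
        step = scripted_llm(context)
        trace.append(step["action"])
        if step["action"] == "list_skills":
            # Discovery: only names and descriptions are disclosed.
            catalog = "\n".join(f"{name}: {desc}" for name, (desc, _) in SKILLS.items())
            context.append(("catalog", catalog))
        elif step["action"] == "load_skill":
            # Selection: the full body of one skill is loaded on demand.
            context.append(("skill_body", SKILLS[step["name"]][1]))
        elif step["action"] == "call_tool":
            # Tool invocation: the harness, not the model, executes the tool.
            context.append(("tool_result", TOOLS[step["tool"]](step["arg"])))
        else:
            return step["text"], trace


answer, trace = run("Show me my latest YouTube videos")
print(trace)
print(answer)
```

The body of `tree-felling` never enters the context, which is the point of progressive disclosure: unselected skills cost only their name and description.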

3. Technical Advantages

  • Reasoning Quality: By avoiding the injection of massive amounts of irrelevant instructions into the system prompt, the model avoids "competing instructions" and maintains better focus.
  • Cost Efficiency: Sending only the name and description (typically <100 tokens) costs far fewer input tokens than loading full procedural manuals on every interaction.
  • Interoperability: Because skills are standardized, they can be utilized across different environments, such as custom Python harnesses, VS Code, or GitHub Copilot.
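
The cost argument can be made concrete with a rough calculation. The sketch below is illustrative only: the four-characters-per-token estimate is a crude assumption, not a real tokenizer, and the catalog of 20 generated skills is hypothetical.

```python
def estimate_tokens(text: str) -> int:
    """Very rough estimate, assuming about 4 characters per token."""
    return max(1, len(text) // 4)


def build_catalog(count: int) -> list[dict]:
    """Generate a hypothetical catalog with short metadata and long bodies."""
    return [
        {
            "name": f"skill-{i}",
            "description": f"Describes when to use procedure number {i}.",
            "body": "Follow this step exactly as written.\n" * 100,
        }
        for i in range(count)
    ]


def eager_cost(catalog: list[dict]) -> int:
    """Tokens sent if every skill's full text is loaded up front."""
    return sum(
        estimate_tokens(s["name"] + s["description"] + s["body"]) for s in catalog
    )


def progressive_cost(catalog: list[dict], selected: str) -> int:
    """Tokens sent if only metadata is loaded, plus the one selected body."""
    metadata = sum(estimate_tokens(s["name"] + s["description"]) for s in catalog)
    body = next(estimate_tokens(s["body"]) for s in catalog if s["name"] == selected)
    return metadata + body


catalog = build_catalog(20)
print("eager:", eager_cost(catalog))
print("progressive:", progressive_cost(catalog, "skill-3"))
```

The gap widens as the catalog grows, because the eager cost scales with every body while the progressive cost adds only a line of metadata per extra skill.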

4. Real-World Application: YouTube Video Fetcher

The speaker demonstrated a skill designed to list the 10 most recent videos from a YouTube channel.

  • Implementation: The skill defined the data source (RSS feed), the sorting logic, and the requirement to generate a summary paragraph.
  • Execution: When invoked, the AI followed the skill.md instructions to parse the XML, format the output as a table, and synthesize a summary. This shows that a skill can be "prescriptive" about output quality and domain-specific requirements.
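
The parse, sort, and format steps in that skill might look roughly like the following. This is a hedged sketch, not the code from the demonstration: it assumes the feed is Atom XML with `entry`, `title`, `link`, and `published` elements, and it runs against a made-up inline sample instead of fetching a real channel feed. The function names and sample entries are hypothetical.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# Assumption: the feed uses the standard Atom namespace.
ATOM = {"atom": "http://www.w3.org/2005/Atom"}

# Made-up sample feed so the sketch runs without network access.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <title>Placeholder video A</title>
    <link rel="alternate" href="https://example.com/watch?v=a"/>
    <published>2024-01-05T10:00:00+00:00</published>
  </entry>
  <entry>
    <title>Placeholder video B</title>
    <link rel="alternate" href="https://example.com/watch?v=b"/>
    <published>2024-03-01T09:30:00+00:00</published>
  </entry>
  <entry>
    <title>Placeholder video C</title>
    <link rel="alternate" href="https://example.com/watch?v=c"/>
    <published>2024-02-10T18:45:00+00:00</published>
  </entry>
</feed>"""


def latest_videos(feed_xml: str, limit: int = 10) -> list[dict]:
    """Parse the feed and return up to `limit` entries, newest first."""
    root = ET.fromstring(feed_xml)
    videos = []
    for entry in root.findall("atom:entry", ATOM):
        videos.append({
            "title": entry.findtext("atom:title", default="", namespaces=ATOM),
            "url": entry.find("atom:link", ATOM).get("href"),
            "published": datetime.fromisoformat(
                entry.findtext("atom:published", namespaces=ATOM)
            ),
        })
    videos.sort(key=lambda v: v["published"], reverse=True)
    return videos[:limit]


def format_table(videos: list[dict]) -> str:
    """Render the entries as a simple plain-text table."""
    rows = ["Published  | Title"]
    rows += [f"{v['published']:%Y-%m-%d} | {v['title']}" for v in videos]
    return "\n".join(rows)


videos = latest_videos(SAMPLE_FEED, limit=2)
print(format_table(videos))
```

Because every row comes from a parsed entry, the table cannot contain a video or date that is absent from the feed, which mirrors the "never invent videos or dates" constraint in the skill body.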

5. Notable Quotes

  • "It is the procedure that I need to follow. In this case, the large language model needs to follow to do a particular thing."
  • "The LLM only needs to know a small amount initially and then it only goes and gets the expertise, the capabilities needed when it needs it for the task it's being asked to do."

Synthesis/Conclusion

AI Skills represent a shift from "monolithic" AI prompts to a modular, library-based approach. By leveraging progressive disclosure, developers can build AI applications that are more scalable, cost-effective, and reliable. The separation of procedural knowledge (Skills) from functional capability (MCP Servers) allows for a clean architecture where the AI knows how to do a task and has the tools to perform it, without overwhelming the model's context window.
