10分钟讲清楚 Prompt, Agent, MCP 是什么

AI Concepts Explained: Agent, MCP, Prompt, and Function Calling

This video breaks down complex AI terminology into understandable concepts, explaining the interplay between Agents, MCP, Prompts, and Function Calling.

1. Prompts: User and System

User Prompt: This is the direct message or question a user sends to an AI model. For example, "My stomach hurts."
System Prompt: This prompt defines the AI's persona, role, background, and tone. It's not directly spoken by the user but influences the AI's responses. For instance, "Act as my girlfriend."
- Evolution of Prompts: Initially, persona information was combined with the user prompt. However, this felt unnatural. Separating persona into a System Prompt allows for more natural interactions.
- Customization: Features like ChatGPT's "Customize ChatGPT" allow users to define preferences that are automatically incorporated into the System Prompt.

2. AI Agents and Agent Tools

AI Agent: A program that acts as an intermediary between the AI model, tools, and the end-user. It relays messages and orchestrates task completion.
Agent Tools: Functions or services that an AI Agent can call to perform specific actions. These need to be registered with the Agent, along with descriptions and usage instructions.
- Example: For file management, tools like list_files and read_file would be registered.
- AutoGPT: An early open-source example of an AI Agent that managed local files by registering functions and then prompting the AI model to use them.

3. Function Calling: Standardizing Tool Interaction

Problem with System Prompts for Tools: While System Prompts can describe tools, AI models, being probabilistic, might return responses in incorrect formats, leading to retries and unreliability.
Function Calling Solution: Major AI providers (ChatGPT, Claude, Gemini) introduced Function Calling to standardize tool interaction.
- Unified Format: Tool descriptions are defined using JSON objects, specifying name, description, and parameters.
- Standardized Responses: The AI is trained to return tool calls in a fixed format.
- Server-Side Retries: If the AI generates an incorrect response, the AI server can detect it due to the fixed format and perform retries, making the process seamless for the user.
- Benefits: Reduces development difficulty and token costs.
Limitations of Function Calling:
- No universal standard across all providers.
- Many open-source models do not yet support it.
- Writing cross-model compatible Agents remains challenging.
- Both System Prompts and Function Calling coexist in the market.

4. MCP: The AI Communication Protocol

Challenge: As Agent Tools become common (e.g., web browsing), copying code into every Agent is inefficient.
MCP (Message Communication Protocol): A communication protocol designed to standardize interaction between AI Agents (MCP Clients) and Tool services (MCP Servers).
- MCP Server: Hosts Tool functions and can also provide data (Resources) or prompt templates (Prompts).
- MCP Client: The AI Agent that calls the MCP Server.
- Interfaces: MCP defines interfaces for querying available tools, their functions, descriptions, parameters, and formats.
- Communication Methods: MCP Servers can communicate via standard input/output (local) or HTTP (network).
- Independence from AI Model: MCP is solely for managing tools, resources, and prompts; it does not depend on the specific AI model used by the Agent.

5. Connecting the Concepts: A Workflow Example

User Input: User asks the AI Agent (MCP Client): "What should I do if my girlfriend has a stomach ache?"
Prompt Packaging: The Agent packages this as a User Prompt.
Tool Retrieval: The Agent uses MCP to query the MCP Server for available tools.
Prompt Generation: The Agent converts the tool information into either a System Prompt or Function Calling format.
AI Model Interaction: The Agent sends the User Prompt and the tool information (in the chosen format) to the AI model.
Tool Invocation: The AI model identifies a web_browse tool and generates a request to call it.
Tool Execution: The Agent receives the tool call request and uses MCP to invoke the web_browse tool on the MCP Server.
Result Forwarding: The web_browse tool fetches website content and returns it to the Agent. The Agent then sends this content back to the AI model.
Final Response Generation: The AI model uses the web content and its own reasoning to generate the final answer: "Drink more hot water."
User Output: The Agent displays the final answer to the user.

6. Conclusion: AI as Collaborative Gears

The video emphasizes that Agent, MCP, Prompt, and Function Calling are not replacements but rather interconnected components that function like "gears" in a complete system of AI automated collaboration. The speaker expresses excitement about understanding these concepts, seeing them as a way to consciously engage with technological shifts rather than being passively swept along.

Key Concepts

User Prompt: User's direct input to an AI.
System Prompt: Defines AI persona, role, and tone.
AI Agent: Intermediary between AI model, tools, and user.
Agent Tools: Functions or services an AI Agent can call.
Function Calling: Standardized method for AI to call tools using JSON definitions.
MCP (Message Communication Protocol): Protocol for Agents (Clients) to interact with Tool services (Servers).
MCP Server: Hosts tools, resources, and prompts.
MCP Client: The AI Agent using MCP.
Resources: Data provided by an MCP Server (e.g., file read/write).
Prompts (MCP context): Prompt templates provided by an MCP Server.