Getting started with Agent Development Kit
By Google for Developers
Key Concepts:
- Agent Development Kit (ADK): An open-source framework for building AI agents.
- Code-first flexibility: Building agents using code for granular control.
- LLM Agent: Traditional agent with an LLM and tools.
- Workflow Agent: Agent that deterministically picks subagents without needing an LLM.
- Sequential Agent: Workflow agent that runs subagents one by one.
- Parallel Agent: Workflow agent that runs subagents simultaneously.
- Loop Agent: Workflow agent that runs subagents iteratively until a condition is met.
- Custom Agent: Agent that combines different agent types.
- Services: Memory, session, and artifact services for agents.
- Runner: The execution engine for agents.
- Event: An atomic action within an agent.
- State: Information passed between agents via output keys.
1. Introduction to Agent Development Kit (ADK)
- ADK is an open-source framework designed to simplify the development of AI agents.
- It allows users to build, run, evaluate, and deploy agents to any provider.
- ADK emphasizes "code-first flexibility," providing granular control over agent orchestration using programming language constructs (Python).
- The goal is to make AI agent development as straightforward as software development.
2. Architecture of the YouTube Shorts Agent
- The example agent is a "YouTube Shorts agent" composed of three sub-agents:
- ScriptWriter: Writes scripts based on an idea.
- Visualizer: Creates visual descriptions matching the script.
- Formatter: Combines the script and visual descriptions into a markdown format.
- The ScriptWriter agent utilizes a built-in Google Search tool to research current trends.
3. Code Structure and Agent Definition (agent.py)
- The `agent.py` file contains definitions for the root agent and sub-agents.
- The root agent (YouTube Shorts agent) takes a name and a model (Gemini 2.5 Pro is used as an example).
- ADK is model-agnostic, deployment-agnostic, and interoperable, allowing the use of any model from any provider and integration with agents built using other frameworks.
- The root agent definition includes a description (a one-liner describing the agent) and instructions (step-by-step guidance for the agent).
- Sub-agents (ScriptWriter, Visualizer, Formatter) are defined within the root agent.
4. Sub-Agent Details: ScriptWriter
- The ScriptWriter agent takes parameters similar to the parent agent.
- It includes a built-in Google Search tool.
- The `output_key` variable is used to pass state between agents. The response from the LLM is stored in the `generated_script` key, which can be accessed by other sub-agents.
- Instructions for the ScriptWriter are loaded from a file, detailing the steps to accomplish its goal.
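
The state-passing idea above can be sketched in plain Python. This is a toy model, not ADK code: `ToyAgent` and its `respond` callable are hypothetical stand-ins for an agent and its LLM call, and only the `output_key` and `generated_script` names come from the summary.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent; not an ADK class."""
    name: str
    respond: Callable[[str], str]  # stands in for the LLM call
    output_key: str

    def run(self, prompt: str, state: Dict[str, str]) -> str:
        response = self.respond(prompt)
        # Store the response under output_key so later agents can read it.
        state[self.output_key] = response
        return response


# Shared state that every agent in the hierarchy can read and write.
state: Dict[str, str] = {}

script_writer = ToyAgent(
    name="ScriptWriter",
    respond=lambda idea: f"Script about {idea}",
    output_key="generated_script",
)

script_writer.run("agent frameworks", state)
print(state["generated_script"])
```

The point of the sketch is that the writer never hands its result directly to another agent; it only writes to a named slot in shared state.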
5. Sub-Agent Details: Visualizer and Formatter
- The Visualizer and Formatter agents have similar definitions, primarily differing in their instructions.
- The Visualizer agent's instructions reference the `generated_script` state value (the output from the ScriptWriter).
- The Formatter agent takes both the script and visual concepts to create the final markdown.
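
One way to picture how instructions read from state is placeholder substitution. The sketch below is an assumption-laden illustration using Python's own `str.format`, not ADK's implementation: `fill_instruction` is a hypothetical helper, and the `visual_concepts` key is an invented name for the Visualizer's output, which the summary does not specify.

```python
from typing import Dict


def fill_instruction(template: str, state: Dict[str, str]) -> str:
    """Replace {key} placeholders with values from shared state (toy helper)."""
    return template.format(**state)


# State as it might look after the first two sub-agents have run.
state = {
    "generated_script": "Scene 1: hook. Scene 2: demo.",
    "visual_concepts": "Scene 1: close-up. Scene 2: screen capture.",  # hypothetical key
}

visualizer_instruction = "Describe visuals for this script:\n{generated_script}"
formatter_instruction = (
    "Combine into markdown.\n"
    "Script: {generated_script}\n"
    "Visuals: {visual_concepts}"
)

print(fill_instruction(visualizer_instruction, state))
print(fill_instruction(formatter_instruction, state))
```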
6. Running the Agent: Different Methods
- The agent can be run in four ways:
- `adk run`: CLI command to run the agent directly in the command line.
- `adk web`: Spins up an Angular UI for interacting with the agent (with multimodal capabilities).
- `adk api_server`: Exposes the agent as a REST endpoint.
- Programmatically: Invoking the agent through Python code.
- The video demonstrates `adk run` and `adk web`.
7. Multi-Agent Problem and Workflow Agents
- In the initial implementation, only the ScriptWriter responded, because the LLM-driven parent agent decided that this one sub-agent was sufficient to handle the query.
- ADK offers three types of agents to address this: LLM agents, workflow agents, and custom agents.
- Workflow agents (sequential, parallel, and loop) provide deterministic control over sub-agent execution.
- The video focuses on using a Loop Agent to ensure all sub-agents are run iteratively.
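
To make the difference between sequential and parallel execution concrete, here is a minimal sketch in plain `asyncio`. It is not ADK code: `make_agent`, `run_sequential`, and `run_parallel` are hypothetical names, and each "agent" is just an async function. No LLM decides which sub-agent runs; the orchestration is fixed by the code.

```python
import asyncio
from typing import Awaitable, Callable, List

AgentFn = Callable[[str], Awaitable[str]]


def make_agent(name: str) -> AgentFn:
    """Build a toy async agent that echoes what it handled."""
    async def agent(prompt: str) -> str:
        await asyncio.sleep(0)  # stands in for model or tool latency
        return f"{name} handled: {prompt}"
    return agent


async def run_sequential(agents: List[AgentFn], prompt: str) -> List[str]:
    # Each sub-agent finishes before the next one starts.
    results = []
    for agent in agents:
        results.append(await agent(prompt))
    return results


async def run_parallel(agents: List[AgentFn], prompt: str) -> List[str]:
    # All sub-agents are scheduled together; gather keeps input order.
    return list(await asyncio.gather(*(agent(prompt) for agent in agents)))


agents = [make_agent("A"), make_agent("B")]
sequential_results = asyncio.run(run_sequential(agents, "idea"))
parallel_results = asyncio.run(run_parallel(agents, "idea"))
print(sequential_results)
print(parallel_results)
```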
8. Implementing the Loop Agent
- The code is modified to replace the `LlmAgent` with a `LoopAgent`.
- Parameters like `model`, `description`, and `instruction` are removed, as workflow agents don't need reasoning capabilities.
- A `max_iterations` parameter is introduced to control how many times the sub-agents run in a loop.
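
The loop behavior described above can be modeled without any LLM at the top level. The following is a toy sketch, not the ADK `LoopAgent`: `run_loop` is a hypothetical function, the iteration cap is named `max_iterations` here, and the `visual_concepts` and `final_short` state keys are invented for illustration. It also omits any early-exit condition and simply runs the full number of iterations.

```python
from typing import Callable, Dict, List, Tuple

SubAgent = Tuple[str, Callable[[Dict[str, str]], None]]


def run_loop(sub_agents: List[SubAgent], state: Dict[str, str],
             max_iterations: int) -> List[str]:
    """Run every sub-agent in order, repeating max_iterations times."""
    call_order: List[str] = []
    for _ in range(max_iterations):
        for name, step in sub_agents:
            step(state)  # each step reads and writes shared state
            call_order.append(name)
    return call_order


def write_script(state: Dict[str, str]) -> None:
    state["generated_script"] = "draft script"


def visualize(state: Dict[str, str]) -> None:
    state["visual_concepts"] = f"visuals for: {state['generated_script']}"


def format_output(state: Dict[str, str]) -> None:
    state["final_short"] = (
        f"## Script\n{state['generated_script']}\n\n"
        f"## Visuals\n{state['visual_concepts']}"
    )


state: Dict[str, str] = {}
order = run_loop(
    [("ScriptWriter", write_script),
     ("Visualizer", visualize),
     ("Formatter", format_output)],
    state,
    max_iterations=2,
)
print(order)
```

Because the order is fixed in code, every sub-agent is guaranteed to run, which is exactly what the LLM-driven parent failed to do.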
9. Running the Loop Agent and Verifying Sub-Agent Execution
- `adk web` is used to spin up the UI and test the Loop Agent.
- The responses show that the ScriptWriter, Visualizer, and Formatter agents are all called iteratively.
10. Programmatic Agent Execution: Services, Runner, and Event Loop
- Programmatic execution requires understanding services (memory, session, artifact), the runner (execution engine), and the event loop.
- Services:
- Memory: Stores conversations (in-memory or persistent storage like a database).
- Session: A single conversation with the agent, tracked for its duration.
- Artifact Storage: Stores outputs like text files, PDFs, or images.
- Runner: Takes the input prompt, gathers services, and invokes the parent agent.
- Event: An atomic action within an agent (input prompt, tool call, tool response), streamed asynchronously from the runner.
11. Code for Programmatic Execution
- The code defines an in-memory session service with an application name, user ID, and session ID.
- The runner is called with the prompt.
- The code loops through the stream of events to identify and print the final response.
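
The pattern of consuming an event stream and picking out the final response can be sketched with an async generator. This is a simplified model under stated assumptions, not the ADK runner: `ToyEvent`, its `is_final` flag, and `toy_runner` are hypothetical, and the events are hard-coded rather than produced by real agents.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class ToyEvent:
    """Hypothetical event: one atomic action such as a prompt or tool call."""
    author: str
    text: str
    is_final: bool = False


async def toy_runner(prompt: str) -> AsyncIterator[ToyEvent]:
    """Yield a fixed stream of events, the way a runner streams them."""
    yield ToyEvent("user", prompt)
    await asyncio.sleep(0)  # stands in for waiting on the model
    yield ToyEvent("ScriptWriter", "tool call: search")
    yield ToyEvent("ScriptWriter", "tool response: trends")
    yield ToyEvent("Formatter", "# Final short", is_final=True)


async def final_response(prompt: str) -> Optional[str]:
    final: Optional[str] = None
    # Walk the stream and keep only the event marked as final.
    async for event in toy_runner(prompt):
        if event.is_final:
            final = event.text
    return final


result = asyncio.run(final_response("Write a short about ADK"))
print(result)
```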
12. Conclusion
- The video provides an overview of ADK, different agent types, and methods for running agents.
- It covers the concepts of session, state, and runner.
- Links to documentation and the sample agent's GitHub repository are provided in the description.