Getting started with Agent Development Kit

By Google for Developers

Key Concepts:

Agent Development Kit (ADK): An open-source framework for building AI agents.
Code-first flexibility: Building agents using code for granular control.
LLM Agent: Traditional agent with an LLM and tools.
Workflow Agent: Agent that controls which subagents run, and in what order, deterministically, without using an LLM to make that decision.
Sequential Agent: Workflow agent that runs subagents one by one.
Parallel Agent: Workflow agent that runs subagents simultaneously.
Loop Agent: Workflow agent that runs subagents iteratively until a condition is met.
Custom Agent: Agent that combines different agent types.
Services: Memory, session, and artifact services for agents.
Runner: The execution engine for agents.
Event: An atomic action within an agent.
State: Information passed between agents via output keys.

1. Introduction to Agent Development Kit (ADK)

ADK is an open-source framework designed to simplify the development of AI agents.
It allows users to build, run, evaluate, and deploy agents to any provider.
ADK emphasizes "code-first flexibility," providing granular control over agent orchestration through ordinary programming-language constructs (Python in the video).
The goal is to make AI agent development as straightforward as software development.

2. Architecture of the YouTube Shorts Agent

The example agent is a "YouTube Shorts agent" composed of three sub-agents:
- ScriptWriter: Writes scripts based on an idea.
- Visualizer: Creates visual descriptions matching the script.
- Formatter: Combines the script and visual descriptions into a markdown format.
The ScriptWriter agent utilizes a built-in Google Search tool to research current trends.

3. Code Structure and Agent Definition (agent.py)

The agent.py file contains definitions for the root agent and sub-agents.
The root agent (YouTube Shorts agent) takes a name and a model (Gemini 2.5 Pro is used as an example).
ADK is model-agnostic, deployment-agnostic, and interoperable, allowing the use of any model from any provider and integration with agents built using other frameworks.
The root agent definition includes a description (a one-liner describing the agent) and instructions (step-by-step guidance for the agent).
Sub-agents (ScriptWriter, Visualizer, Formatter) are defined within the root agent.

4. Sub-Agent Details: ScriptWriter

The ScriptWriter agent takes parameters similar to the parent agent.
It includes a built-in Google Search tool.
The output_key parameter is used to pass state between agents: the response from the LLM is stored in state under the generated_script key, where other sub-agents can read it.
Instructions for the ScriptWriter are loaded from a file, detailing the steps to accomplish its goal.

5. Sub-Agent Details: Visualizer and Formatter

The Visualizer and Formatter agents have similar definitions, primarily differing in their instructions.
The Visualizer agent's instructions reference the generated_script state value (the output from the ScriptWriter).
The Formatter agent takes both the script and visual concepts to create the final markdown.
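
The state hand-off described in sections 4 and 5 can be sketched in plain Python. This is a toy model, not ADK code: ToyAgent, run_pipeline, and the visual_concepts and final_short keys are hypothetical names (only output_key and generated_script come from the video), and a stand-in function takes the place of the LLM call.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ToyAgent:
    """Hypothetical stand-in for an agent: an instruction template plus an output key."""
    name: str
    instruction: str               # may reference state values as {key}
    output_key: str                # state key under which the response is stored
    respond: Callable[[str], str]  # fake "LLM": maps a prompt to a response


def run_pipeline(agents: List[ToyAgent], state: Dict[str, str]) -> Dict[str, str]:
    """Run agents in order; each one reads from and writes to the shared state."""
    for agent in agents:
        prompt = agent.instruction.format(**state)  # fill {key} placeholders from state
        state[agent.output_key] = agent.respond(prompt)
    return state


script_writer = ToyAgent(
    name="ScriptWriter",
    instruction="Write a short script about: {idea}",
    output_key="generated_script",
    respond=lambda prompt: f"SCRIPT<{prompt}>",
)
visualizer = ToyAgent(
    name="Visualizer",
    instruction="Describe visuals for: {generated_script}",
    output_key="visual_concepts",
    respond=lambda prompt: f"VISUALS<{prompt}>",
)
formatter = ToyAgent(
    name="Formatter",
    instruction="## Script\n{generated_script}\n\n## Visuals\n{visual_concepts}",
    output_key="final_short",
    respond=lambda prompt: prompt,  # the toy formatter echoes the filled template
)

final_state = run_pipeline(
    [script_writer, visualizer, formatter],
    {"idea": "how to build AI agents"},
)
print(final_state["final_short"])
```

Each agent sees only what earlier agents stored under their output keys, which is why the Visualizer must run after the ScriptWriter and the Formatter after both.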

6. Running the Agent: Different Methods

The agent can be run in four ways:
- adk run: CLI command to run the agent directly in the command line.
- adk web: Spins up an Angular UI for interacting with the agent (with multimodal capabilities).
- adk api_server: Exposes the agent as a REST endpoint.
- Programmatically: Invoking the agent through Python code.
The video demonstrates adk run and adk web.

7. Multi-Agent Problem and Workflow Agents

In the initial implementation only the ScriptWriter responded, because the LLM-driven parent agent decided that this one sub-agent was sufficient to handle the query.
ADK offers three types of agents: LLM agents, workflow agents, and custom agents.
Workflow agents (sequential, parallel, and loop) address this problem by providing deterministic control over sub-agent execution.
The video uses a Loop Agent so that every sub-agent runs on each iteration.
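
The difference between sequential and parallel execution can be illustrated with a small asyncio sketch. It does not use ADK: make_sub_agent, run_sequential, and run_parallel are hypothetical helpers, and each "sub-agent" only records when it starts and finishes.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

SubAgent = Callable[[Dict[str, str]], Awaitable[None]]


def make_sub_agent(name: str, log: List[str]) -> SubAgent:
    """Build a hypothetical sub-agent that logs its start and end."""
    async def run(state: Dict[str, str]) -> None:
        log.append(f"{name}:start")
        await asyncio.sleep(0)  # yield control, as waiting on a model call would
        state[name] = "done"
        log.append(f"{name}:end")
    return run


async def run_sequential(sub_agents: List[SubAgent], state: Dict[str, str]) -> None:
    """Run each sub-agent to completion before starting the next one."""
    for sub_agent in sub_agents:
        await sub_agent(state)


async def run_parallel(sub_agents: List[SubAgent], state: Dict[str, str]) -> None:
    """Start all sub-agents together and wait until every one has finished."""
    await asyncio.gather(*(sub_agent(state) for sub_agent in sub_agents))


names = ["ScriptWriter", "Visualizer", "Formatter"]

sequential_log: List[str] = []
asyncio.run(run_sequential([make_sub_agent(n, sequential_log) for n in names], {}))

parallel_log: List[str] = []
asyncio.run(run_parallel([make_sub_agent(n, parallel_log) for n in names], {}))

print(sequential_log)  # start/end pairs, one sub-agent at a time
print(parallel_log)    # all starts first, then the ends
```

In both cases the code, not a model, decides which sub-agents run. Parallel execution only suits sub-agents that do not depend on each other's output, which is not the case for the three sub-agents in this example.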

8. Implementing the Loop Agent

The code is modified to replace the LlmAgent with a LoopAgent.
Parameters like model, description, and instructions are removed, as workflow agents don't need reasoning capabilities.
A max_iterations parameter is introduced to cap how many times the sub-agents run in a loop.
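
The control flow of such a loop can be sketched without ADK. In this toy version run_loop, make_sub_agent, and the should_stop callback are hypothetical names; the point is only that the order of sub-agents and the iteration cap are fixed in code, so no model has to decide whether to delegate.

```python
from typing import Callable, Dict, List

State = Dict[str, List[str]]
SubAgent = Callable[[State], None]


def run_loop(
    sub_agents: List[SubAgent],
    state: State,
    max_iterations: int,
    should_stop: Callable[[State], bool] = lambda state: False,
) -> int:
    """Run every sub-agent in order, repeating up to max_iterations times.

    Returns the number of completed passes. The loop ends early if
    should_stop(state) is true after a pass.
    """
    completed = 0
    for _ in range(max_iterations):
        for sub_agent in sub_agents:
            sub_agent(state)
        completed += 1
        if should_stop(state):
            break
    return completed


def make_sub_agent(name: str) -> SubAgent:
    """Build a hypothetical sub-agent that records its name in state['calls']."""
    def run(state: State) -> None:
        state.setdefault("calls", []).append(name)
    return run


names = ["ScriptWriter", "Visualizer", "Formatter"]
state: State = {}
passes = run_loop([make_sub_agent(n) for n in names], state, max_iterations=2)
print(passes, state["calls"])
```

With max_iterations=2 every sub-agent is called twice, in the same order on each pass.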

9. Running the Loop Agent and Verifying Sub-Agent Execution

adk web is used to spin up the UI and test the Loop Agent.
The responses show that the ScriptWriter, Visualizer, and Formatter agents are all called iteratively.

10. Programmatic Agent Execution: Services, Runner, and Event Loop

Programmatic execution requires understanding services (memory, session, artifact), the runner (execution engine), and the event loop.
Services:
- Memory: Stores conversations (in-memory or persistent storage like a database).
- Session: A single conversation between the user and the agent, for as long as it lasts.
- Artifact Storage: Stores outputs like text files, PDFs, or images.
Runner: Takes the input prompt, gathers services, and invokes the parent agent.
Event: An atomic action within an agent (input prompt, tool call, tool response), streamed asynchronously from the runner.

11. Code for Programmatic Execution

The code creates an in-memory session service and a session identified by an application name, user ID, and session ID.
The runner is called with the prompt.
The code loops through the stream of events to identify and print the final response.
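
The pattern of looping over an asynchronous event stream to pick out the final response can be sketched with an async generator. This is not the ADK runner: ToyEvent, toy_runner, the is_final flag, and the hard-coded events are all assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class ToyEvent:
    """Hypothetical event: one atomic step produced while the agent runs."""
    author: str
    kind: str  # "user_input", "tool_call", "tool_response", or "model_response"
    text: str
    is_final: bool = False


async def toy_runner(prompt: str) -> AsyncIterator[ToyEvent]:
    """Hypothetical runner: yields events one at a time, ending with the final response."""
    yield ToyEvent("user", "user_input", prompt)
    await asyncio.sleep(0)  # stand-in for waiting on a model or a tool
    yield ToyEvent("ScriptWriter", "tool_call", "search: current trends")
    yield ToyEvent("ScriptWriter", "tool_response", "trend: AI agents")
    yield ToyEvent("ScriptWriter", "model_response", f"Script for '{prompt}'", is_final=True)


async def collect_final_response(prompt: str) -> Optional[str]:
    """Consume the event stream and return the text of the final response, if any."""
    final_text = None
    async for event in toy_runner(prompt):
        print(f"[{event.author}] {event.kind}")
        if event.is_final:
            final_text = event.text
    return final_text


result = asyncio.run(collect_final_response("AI agents in 30 seconds"))
print(result)
```

Intermediate events (tool calls and tool responses) pass through the same loop, so the caller can log or display them while waiting for the final answer.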

12. Conclusion

The video provides an overview of ADK, different agent types, and methods for running agents.
It covers the concepts of session, state, and runner.
Links to documentation and the sample agent's GitHub repository are provided in the description.
