Getting started with Agent Development Kit

By Google for Developers

Key Concepts:

  • Agent Development Kit (ADK): An open-source framework for building AI agents.
  • Code-first flexibility: Building agents using code for granular control.
  • LLM Agent: Traditional agent with an LLM and tools.
  • Workflow Agent: Agent that runs its subagents in a fixed, deterministic pattern, without using an LLM to decide what runs next.
  • Sequential Agent: Workflow agent that runs subagents one by one.
  • Parallel Agent: Workflow agent that runs subagents simultaneously.
  • Loop Agent: Workflow agent that runs subagents iteratively until a condition is met.
  • Custom Agent: Agent with custom orchestration logic that can combine the other agent types.
  • Services: Memory, session, and artifact services for agents.
  • Runner: The execution engine for agents.
  • Event: An atomic action within an agent.
  • State: Information passed between agents via output keys.

1. Introduction to Agent Development Kit (ADK)

  • ADK is an open-source framework designed to simplify the development of AI agents.
  • It allows users to build, run, evaluate, and deploy agents to any provider.
  • ADK emphasizes "code-first flexibility," providing granular control over agent orchestration using programming language constructs (Python).
  • The goal is to make AI agent development as straightforward as software development.

2. Architecture of the YouTube Shorts Agent

  • The example agent is a "YouTube Shorts agent" composed of three sub-agents:
    • ScriptWriter: Writes scripts based on an idea.
    • Visualizer: Creates visual descriptions matching the script.
    • Formatter: Combines the script and visual descriptions into a markdown format.
  • The ScriptWriter agent utilizes a built-in Google Search tool to research current trends.

3. Code Structure and Agent Definition (agent.py)

  • The agent.py file contains definitions for the root agent and sub-agents.
  • The root agent (YouTube Shorts agent) takes a name and a model (Gemini 2.5 Pro is used as an example).
  • ADK is model-agnostic, deployment-agnostic, and interoperable, allowing the use of any model from any provider and integration with agents built using other frameworks.
  • The root agent definition includes a description (a one-liner describing the agent) and instructions (step-by-step guidance for the agent).
  • Sub-agents (ScriptWriter, Visualizer, Formatter) are defined within the root agent.

4. Sub-Agent Details: ScriptWriter

  • The ScriptWriter agent takes parameters similar to the parent agent.
  • It includes a built-in Google Search tool.
  • The output_key parameter is used to pass state between agents. Here it is set to generated_script, so the LLM's response is stored in state under the generated_script key, where other sub-agents can read it.
  • Instructions for the ScriptWriter are loaded from a file, detailing the steps to accomplish its goal.

5. Sub-Agent Details: Visualizer and Formatter

  • The Visualizer and Formatter agents have similar definitions, primarily differing in their instructions.
  • The Visualizer agent's instructions read the generated_script value from state (the output of the ScriptWriter).
  • The Formatter agent takes both the script and visual concepts to create the final markdown.
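
The state hand-off described in sections 4 and 5 can be sketched in plain Python. This is a toy model under stated assumptions, not the ADK API: ToyAgent and run_in_order are hypothetical names, the lambdas stand in for LLM calls, and the keys visual_concepts and final_short_concept are made up for illustration (only generated_script comes from the summary).

```python
from typing import Callable, Dict, List

State = Dict[str, str]


class ToyAgent:
    """Hypothetical stand-in for a sub-agent; not an ADK class."""

    def __init__(self, name: str, respond: Callable[[State], str], output_key: str):
        self.name = name
        self.respond = respond        # stands in for the LLM call
        self.output_key = output_key  # state key that receives the response

    def run(self, state: State) -> None:
        # Store the response under output_key so later agents can read it.
        state[self.output_key] = self.respond(state)


def run_in_order(agents: List[ToyAgent], state: State) -> State:
    """Run each agent once, in list order, against one shared state dict."""
    for agent in agents:
        agent.run(state)
    return state


scriptwriter = ToyAgent(
    "ScriptWriter",
    lambda s: f"Script about {s['idea']}",
    output_key="generated_script",
)
visualizer = ToyAgent(
    "Visualizer",
    lambda s: f"Visuals for: {s['generated_script']}",
    output_key="visual_concepts",  # assumed key name
)
formatter = ToyAgent(
    "Formatter",
    lambda s: f"## Script\n{s['generated_script']}\n\n## Visuals\n{s['visual_concepts']}",
    output_key="final_short_concept",  # assumed key name
)

final_state = run_in_order([scriptwriter, visualizer, formatter], {"idea": "ADK basics"})
print(final_state["final_short_concept"])
```

Each agent only needs to know the key names it reads and the one key it writes, which is why the Visualizer and Formatter definitions can stay nearly identical apart from their instructions.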

6. Running the Agent: Different Methods

  • The agent can be run in four ways:
    • adk run: CLI command to run the agent directly in the command line.
    • adk web: Spins up an Angular UI for interacting with the agent (with multimodal capabilities).
    • adk api_server: Exposes the agent as a REST endpoint.
    • Programmatically: Invoking the agent through Python code.
  • The video demonstrates adk run and adk web.

7. Multi-Agent Problem and Workflow Agents

  • In the initial implementation only the ScriptWriter responded, because the LLM-driven parent agent decided that this one sub-agent was sufficient to handle the query.
  • ADK offers three types of agents to address this: LLM agents, workflow agents, and custom agents.
  • Workflow agents (sequential, parallel, and loop) provide deterministic control over sub-agent execution.
  • The video focuses on using a Loop Agent to ensure all sub-agents are run iteratively.
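
To make the difference between sequential and parallel orchestration concrete, here is a small asyncio sketch. It is an illustration only and does not use ADK: make_step, run_sequential, and run_parallel are hypothetical helpers, and each step merely records the first letter of its name instead of calling a model.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List

State = Dict[str, str]
Step = Callable[[State], Awaitable[None]]


def make_step(name: str) -> Step:
    """Build a toy sub-agent that appends its initial to state['order']."""
    async def step(state: State) -> None:
        await asyncio.sleep(0)  # yield control, standing in for real work
        state["order"] = state.get("order", "") + name[0]
    return step


async def run_sequential(steps: List[Step], state: State) -> State:
    # One by one: each step finishes before the next one starts.
    for step in steps:
        await step(state)
    return state


async def run_parallel(steps: List[Step], state: State) -> State:
    # All steps are started together; none waits for another's output.
    await asyncio.gather(*(step(state) for step in steps))
    return state


steps = [make_step(n) for n in ("ScriptWriter", "Visualizer", "Formatter")]
sequential_state = asyncio.run(run_sequential(steps, {}))
parallel_state = asyncio.run(run_parallel(steps, {}))
print(sequential_state["order"])
print(sorted(parallel_state["order"]))
```

In both cases the orchestration is ordinary control flow rather than an LLM decision, which is what makes every sub-agent run. The parallel form only suits sub-agents that do not depend on each other's output, so it would not fit the ScriptWriter, Visualizer, Formatter chain as described.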

8. Implementing the Loop Agent

  • The code is modified to replace the LlmAgent with a LoopAgent.
  • Parameters like model, description, and instructions are removed, as workflow agents don't need reasoning capabilities.
  • A max_iterations parameter is introduced to cap how many times the sub-agents run in a loop.
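
The looping behaviour can be sketched without any LLM at all, which is the point of a workflow agent. The following is a minimal plain-Python model, not ADK's LoopAgent: run_loop, should_stop, and make_sub_agent are hypothetical names introduced here, and the optional stop condition is an assumption based on the "until a condition is met" description in Key Concepts.

```python
from typing import Callable, Dict, List, Optional

State = Dict[str, List[str]]
SubAgent = Callable[[State], None]


def run_loop(
    sub_agents: List[SubAgent],
    state: State,
    max_iterations: int,
    should_stop: Optional[Callable[[State], bool]] = None,
) -> int:
    """Run every sub-agent in order, repeating up to max_iterations times.

    Returns the number of completed iterations.
    """
    completed = 0
    for _ in range(max_iterations):
        for sub_agent in sub_agents:
            sub_agent(state)
        completed += 1
        # Optional early exit once the caller's condition is satisfied.
        if should_stop is not None and should_stop(state):
            break
    return completed


def make_sub_agent(name: str) -> SubAgent:
    """Toy sub-agent that just records that it was called."""
    def sub_agent(state: State) -> None:
        state.setdefault("calls", []).append(name)
    return sub_agent


pipeline = [make_sub_agent(n) for n in ("ScriptWriter", "Visualizer", "Formatter")]
loop_state: State = {}
iterations = run_loop(pipeline, loop_state, max_iterations=2)
print(iterations, loop_state["calls"])
```

Because the loop itself contains no reasoning step, no sub-agent can be skipped; the only decisions left are the iteration cap and, in this sketch, the optional stop condition.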

9. Running the Loop Agent and Verifying Sub-Agent Execution

  • adk web is used to spin up the UI and test the Loop Agent.
  • The responses show that the ScriptWriter, Visualizer, and Formatter agents are all called iteratively.

10. Programmatic Agent Execution: Services, Runner, and Event Loop

  • Programmatic execution requires understanding services (memory, session, artifact), the runner (execution engine), and the event loop.
  • Services:
    • Memory: Stores conversations (in-memory or persistent storage like a database).
    • Session: A single conversation between the user and the agent, from start to finish.
    • Artifact Storage: Stores outputs like text files, PDFs, or images.
  • Runner: Takes the input prompt, gathers services, and invokes the parent agent.
  • Event: An atomic action within an agent (input prompt, tool call, tool response), streamed asynchronously from the runner.

11. Code for Programmatic Execution

  • The code defines an in-memory session service with an application name, user ID, and session ID.
  • The runner is called with the prompt.
  • The code loops through the stream of events to identify and print the final response.
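
The event loop pattern from sections 10 and 11 can be illustrated with an async generator. This is a toy model and deliberately avoids ADK imports: ToyEvent, toy_runner, and final_response are hypothetical, and the hard-coded events stand in for what a real runner would stream while invoking the agent.

```python
import asyncio
from dataclasses import dataclass
from typing import AsyncIterator, Optional


@dataclass
class ToyEvent:
    """Hypothetical event: one atomic action, tagged with who produced it."""
    author: str
    text: str
    is_final: bool = False


async def toy_runner(prompt: str) -> AsyncIterator[ToyEvent]:
    """Stand-in for a runner: streams events asynchronously for one prompt."""
    yield ToyEvent("user", prompt)
    yield ToyEvent("ScriptWriter", "tool call: search")
    yield ToyEvent("ScriptWriter", "tool response: results")
    yield ToyEvent("Formatter", f"Final answer for: {prompt}", is_final=True)


async def final_response(prompt: str) -> Optional[str]:
    """Consume the event stream and keep only the final response text."""
    result: Optional[str] = None
    async for event in toy_runner(prompt):
        if event.is_final:
            result = event.text
    return result


answer = asyncio.run(final_response("Write a short about ADK"))
print(answer)
```

Intermediate events (the prompt, tool calls, tool responses) are simply skipped here; a real application might log them or surface them in a UI instead.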

12. Conclusion

  • The video provides an overview of ADK, different agent types, and methods for running agents.
  • It covers the concepts of session, state, and runner.
  • Links to documentation and the sample agent's GitHub repository are provided in the description.
