Gemini vs. the clock: 5 hours to build a FinTech agent

By Google Cloud Tech

Key Concepts

  • AI Agent: An autonomous software entity designed to perform specific tasks, often interacting with users or other systems.
  • Agent-to-Agent (A2A) Protocol: A communication standard enabling different AI agents to interact, share information, and collaborate.
  • Agent Development Kit (ADK): A Google Cloud framework that simplifies the creation, orchestration, and deployment of AI agents, including managing workflows and state.
  • Gemini: Google's family of multimodal large language models (LLMs), including Gemini 2.5 Flash, used for natural language understanding, generation, and reasoning.
  • BQML (BigQuery ML): Machine learning capabilities integrated directly into Google BigQuery, allowing users to create and execute ML models using SQL queries, particularly useful for forecasting and data analysis.
  • Firebase Authentication: A Google service providing backend services for user authentication, supporting various sign-in methods.
  • Firebase Storage: A cloud storage solution for developers to store and serve user-generated content.
  • Router Agent: A central AI agent responsible for dynamically directing user requests to the most appropriate specialized sub-agent.
  • Sequential Workflow: A process where tasks or agent interactions occur in a predefined, linear order.
  • Remote Agent as a Tool: A best practice in ADK where an external or pre-existing agent (like Symbol Bank's backend) is encapsulated and invoked as a tool by other agents.
  • Hallucination: A phenomenon in AI models where they generate plausible but incorrect or fabricated information.
  • User Scope/State: Contextual information about a user and their ongoing conversation that is maintained and accessible across different agents within a session.

The Challenge: Building a Trusted AI Financial Planner

The video documents Google Cloud's AI Agent Bake-Off, a five-hour challenge in which four developer teams competed to transform "Symbol Bank," a realistic test case with a robust backend trapped behind a basic chatbot, into a trusted financial planning partner. The core mission was to build a new AI agent system that works alongside the existing Symbol Bank backend using the Agent-to-Agent (A2A) protocol. The goal was to move Symbol Bank from a "boring transactional chatbot to a trusted financial planner," emphasizing the need for digital tools that foster trust. Developers had access to Google's full AI arsenal, including Gemini, Imagen, and Veo, and used the Agent Development Kit (ADK) to tie everything together.

Developer Teams and Initial Approaches

Four teams participated, made up of the following eight developers, each bringing a different perspective and technical strategy:

  • Vlad (Custom AI Applications Developer): Focused on daily spending and visualizing spending habits with graphs. Emphasized making the agent personable and connecting to the broader web beyond the bank's environment.
  • Luis (Google Cloud Customer Engineer): Focused on daily spending, big purchases, and trip planning, suggesting integration with Google's built-in tools like Google Search for average hotel prices.
  • Marcus (University of Waterloo Computer Science Student): Planned a three-route front-end structure (get started, onboarding, home) and considered voice interaction.
  • Sitha (Google Developer Relations Engineer): Aimed to get an agent up and running quickly, focusing on a retriever and chart generator, and considering sub-agents for cleaner architecture.
  • Brandon Hancock (AI Agent Content Creator): Focused on embedding BQML agents for big purchases to provide secure, consistent, and accurate forecasting, avoiding hallucination. Also planned to integrate the Reddit API for community insights on budgeting.
  • Lakshmi (Google Cloud Field Solutions Architect): Aimed to reinvent the entire banking system, acknowledging the complexity.
  • Adrian (University of Central Florida IT Student): Focused on setting up A2A instances and connecting different agents, structuring them into separate folders with a main "Symbol Agent" and various sub-agents.
  • Ideji (Google Cloud Developer Advocate): Planned BQML integration and Firebase integration for front-end authentication.

Development Process and Technical Implementations

Developers used ADK to orchestrate agents and the A2A protocol for inter-agent communication. Key technical decisions and challenges included:

  • Agent Architecture: Many teams opted for a multi-agent hierarchy, often with a main "router agent" delegating requests to specialized sub-agents (e.g., investments, daily spending, big purchases, trip planning).
  • Workflow Patterns: Teams primarily used sequential workflows, in which one agent retrieves data and another formats or presents it. The "generator-critic loop" pattern was mentioned as a desirable, more advanced pattern that there was no time to implement.
  • Model Selection: Gemini 2.5 Flash was chosen by some for its human-like, live conversational capabilities, especially for voice interactions.
  • Data Visualization: Generating graphs and charts was a common goal to make financial data more understandable and actionable.
  • External Integrations:
    • Google Search Tool: Integrated into investment agents to fetch real-time stock information.
    • Reddit API: Used to pull relevant community discussions on budgeting and travel.
    • Google Calendar API: Integrated for automated meeting scheduling with financial advisors.
    • Spotify API & Scribe API (planned): For identifying and potentially canceling unused subscriptions.
  • Authentication and Authorization: Firebase Authentication was used for user sign-in. The OAuth token generated upon sign-in was stored as metadata in the user scope of the ADK state, allowing it to cascade through sub-agents and tools and ensuring secure access to user-specific data (e.g., BigQuery transaction history).
  • BQML for Trust and Accuracy: Adrian and Ideji's team specifically highlighted using BQML for big-purchase forecasting. They connected transaction history to a BigQuery table, so a model could be trained and run on the fly against a specific user's data. The aim was to produce accurate forecasts, avoid the hallucinations that LLMs are prone to when asked to do the math themselves, and rebuild trust through data-driven, personalized recommendations.
  • Handling Context: A significant challenge was passing context between agents, especially when external financial transactions were needed. The solution involved wrapping capabilities and remote agents as "tools" that could be used interchangeably by a single agent, preventing context loss.
  • Multimodal Artifacts: Generating structured HTML for rendering graphs and other visual artifacts proved challenging within the time limit.
  • UI/UX Design: Teams explored different UI philosophies, from minimalistic and functional to highly creative and themed interfaces (e.g., the "orchestra" theme).
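
The router and user-scope ideas above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not ADK code: the Session class, the agent functions, and the keyword-based routing are all hypothetical stand-ins (in the video an LLM-driven router makes the routing decision, and ADK manages the state). The point it shows is that the token is written once at sign-in and every sub-agent reads it from shared user-scoped state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Session:
    """Hypothetical stand-in for session state; not ADK's real class."""
    user_state: Dict[str, str] = field(default_factory=dict)


def daily_spending_agent(request: str, session: Session) -> str:
    # The sub-agent reads the token from shared user-scoped state
    # instead of receiving it as an argument from the router.
    token = session.user_state["auth_token"]
    return f"daily_spending handled '{request}' with token ending {token[-4:]}"


def big_purchase_agent(request: str, session: Session) -> str:
    token = session.user_state["auth_token"]
    return f"big_purchase handled '{request}' with token ending {token[-4:]}"


# Keyword routing is a placeholder for the LLM-based routing decision.
ROUTES: Dict[str, Callable[[str, Session], str]] = {
    "spend": daily_spending_agent,
    "buy": big_purchase_agent,
}


def router_agent(request: str, session: Session) -> str:
    """Send the request to the first sub-agent whose keyword matches."""
    for keyword, agent in ROUTES.items():
        if keyword in request.lower():
            return agent(request, session)
    return "router: no specialized agent matched"


session = Session()
session.user_state["auth_token"] = "example-token-1234"  # set once at sign-in
print(router_agent("How much did I spend on coffee?", session))
```

Because the token lives in the session rather than in each call signature, adding a new sub-agent does not require changing the router.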

Judging and Key Learnings

The judges (Alan, Paige, Richard) evaluated the solutions based on completeness, user experience, technical implementation, and creativity.

  • Context Passing: Judges inquired about how context was maintained across agents, confirming that ADK's user scope and state management were crucial.
  • Tool Calls: Questions arose about the use of sequential, loop, or parallel patterns for tool calls. Most teams leaned on sequential, with aspirations for more complex patterns.
  • BQML Deep Dive: The judges were particularly interested in the BQML implementation, confirming that it involved live model training on user-specific transaction data for accurate, non-hallucinatory forecasting.
  • A2A Best Practice: A critical learning shared was that the A2A remote agent should be wrapped as a tool rather than a sub-agent when multiple sub-agents need to invoke the banking agent. Each sub-agent can then call the banking agent directly and get its data back, which also supports more complex multi-agent interactions.
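
The "remote agent as a tool" practice can be illustrated with a small sketch. It assumes nothing about ADK's or A2A's real APIs: remote_bank_agent is a hypothetical local function standing in for the remote Symbol Bank agent, the ledger values are made up, and SubAgent is an illustrative class. What it shows is one wrapped callable shared by several sub-agents, each of which keeps its own context because the call returns to the caller.

```python
from typing import Callable, Dict, List


def remote_bank_agent(query: str) -> Dict[str, float]:
    """Hypothetical stand-in for the remote banking agent reached over A2A."""
    fake_ledger = {"balance": 2500.0, "monthly_spend": 1800.0}  # made-up data
    return {query: fake_ledger[query]}


class SubAgent:
    """Illustrative sub-agent that calls tools by name and keeps its own context."""

    def __init__(self, name: str, tools: Dict[str, Callable[[str], Dict[str, float]]]):
        self.name = name
        self.tools = tools
        self.context: List[str] = []

    def ask_bank(self, query: str) -> Dict[str, float]:
        # Calling the bank as a tool returns the result to this agent,
        # so its own conversation context is never handed off.
        self.context.append(f"{self.name} asked bank for {query}")
        result = self.tools["bank"](query)
        self.context.append(f"{self.name} received {result}")
        return result


# The same wrapped remote agent is registered as a tool on several sub-agents.
shared_tools = {"bank": remote_bank_agent}
trip_planner = SubAgent("trip_planner", shared_tools)
big_purchase = SubAgent("big_purchase", shared_tools)

print(trip_planner.ask_bank("balance"))
print(big_purchase.ask_bank("monthly_spend"))
```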

The Winning Solution: Brandon and Lakshmi

Brandon and Lakshmi were declared the winners. Their demo, "Fee (Financial Intelligent Planner)," was praised for its polish, completeness, and user-centric design.

  • Core Philosophy: "Putting agents in every aspect of the application," treating the agent as a "best friend" that analyzes data and provides highlights.
  • Key Features:
    • Income and Expenses Analysis: Easily queryable financial data.
    • Portfolio Management: Uses ADK's sequential workflow to pull, format, and present data on net worth and investment goals, offering actionable insights.
    • Perks Agent: Analyzes transactions and credit card perks to show users how they can save money based on available benefits.
    • Security: Allowed users to explicitly grant or deny agents access to their data.
    • Data Sourcing: All user information was stored in Symbol Bank, with agents communicating via A2A to gather data.
  • Technical Strengths: The judges appreciated the team's focus on storytelling and user experience, and the "very well called out AI insights." Their solution was deemed to have the "highest likelihood of getting funded on a demo day at a startup accelerator pitch day."
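
The pull, format, and present flow described for Portfolio Management can be sketched as a plain sequential pipeline. This is an illustration only, not the team's code or ADK's workflow API: the step functions, the holdings, and the goal amount are all hypothetical. Each step reads what the previous step wrote into a shared state dictionary.

```python
from typing import Any, Callable, Dict, List

State = Dict[str, Any]


def pull_portfolio(state: State) -> State:
    # Placeholder data; a real step would ask the bank agent for holdings.
    state["holdings"] = {"stocks": 12000.0, "bonds": 5000.0, "cash": 3000.0}
    state["goal"] = 25000.0
    return state


def format_portfolio(state: State) -> State:
    # Derive the numbers the presentation step needs.
    state["net_worth"] = sum(state["holdings"].values())
    state["progress_pct"] = 100 * state["net_worth"] / state["goal"]
    return state


def present_portfolio(state: State) -> State:
    state["summary"] = (
        f"Net worth ${state['net_worth']:,.0f} is "
        f"{state['progress_pct']:.0f}% of your ${state['goal']:,.0f} goal."
    )
    return state


def run_sequential(steps: List[Callable[[State], State]], state: State) -> State:
    """Run each step in order, passing the shared state along."""
    for step in steps:
        state = step(state)
    return state


result = run_sequential([pull_portfolio, format_portfolio, present_portfolio], {})
print(result["summary"])  # Net worth $20,000 is 80% of your $25,000 goal.
```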

Conclusion and Main Takeaways

The AI Agent Bake-Off demonstrated the immense potential of AI agents and the A2A protocol in transforming traditional banking into a more intelligent, personalized, and trustworthy experience. Key takeaways include:

  • ADK's Power: The Agent Development Kit proved instrumental in rapidly building and orchestrating complex multi-agent systems within a tight timeframe, with features like user scope and state management being critical.
  • A2A as the Future: The ability for agents to communicate and collaborate (A2A) is seen as the next frontier, enabling sophisticated, collaborative AI systems that perform tasks on behalf of users.
  • Trust through Accuracy: Integrating specialized tools like BQML for accurate forecasting and mathematical operations is crucial for building user trust in financial AI applications, mitigating the risk of hallucination.
  • User-Centric Design: Successful AI agent applications require a deep understanding of user needs and a focus on intuitive UI/UX, making complex financial data accessible and actionable.
  • Best Practices Emerge: The challenge highlighted important architectural best practices, such as wrapping remote agents as tools, for efficient and robust multi-agent interactions.
  • The "Agentic" Future: The participants expressed excitement about the future where "everything's becoming more agentic," with agents collaborating seamlessly to handle complex tasks.
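
To make the BQML takeaway concrete, here is a hedged sketch of how per-user forecasting SQL might be assembled. The video does not say which model type the team used; ARIMA_PLUS is assumed here because it is BigQuery ML's time-series model type. The dataset name, the transactions table, and its columns (user_id, transaction_time, amount) are hypothetical. The function only builds SQL strings, so it runs without a BigQuery connection.

```python
import re
from typing import Tuple

_SAFE = re.compile(r"[A-Za-z0-9_]+")


def build_forecast_sql(dataset: str, user_id: str, horizon_days: int = 30) -> Tuple[str, str]:
    """Return (training SQL, forecast SQL) for one user's daily spend."""
    # Reject anything that could break out of the identifier or string literal.
    if not _SAFE.fullmatch(dataset) or not _SAFE.fullmatch(user_id):
        raise ValueError("dataset and user_id must be alphanumeric or underscore")
    model = f"`{dataset}.spend_forecast_{user_id}`"
    # Assumed model type and hypothetical table/column names.
    train_sql = f"""
CREATE OR REPLACE MODEL {model}
OPTIONS(
  model_type = 'ARIMA_PLUS',
  time_series_timestamp_col = 'day',
  time_series_data_col = 'total_spend'
) AS
SELECT DATE(transaction_time) AS day, SUM(amount) AS total_spend
FROM `{dataset}.transactions`
WHERE user_id = '{user_id}'
GROUP BY day
""".strip()
    forecast_sql = f"""
SELECT forecast_timestamp, forecast_value
FROM ML.FORECAST(MODEL {model},
                 STRUCT({int(horizon_days)} AS horizon, 0.9 AS confidence_level))
""".strip()
    return train_sql, forecast_sql


train_sql, forecast_sql = build_forecast_sql("bank", "u42")
print(train_sql)
print(forecast_sql)
```

An agent tool could run these two statements and hand the forecast rows to the LLM to explain, so the numbers come from the model in BigQuery rather than from the LLM's own arithmetic.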
