From systems of intelligence to systems of action: Yasmeen Ahmad on the agentic data cloud
Key Concepts
- Agentic Data Cloud: A shift from passive "systems of intelligence" (dashboards/reports) to "systems of action" where AI agents reason over data and execute tasks.
- Intent-Driven Engineering: Moving from persona-based agents to agents that focus on high-level objectives, utilizing tools and skills to complete end-to-end workflows.
- Knowledge Catalog: A system that creates inferred schemas, relationships, and semantic meaning across both structured and unstructured data (e.g., PDFs).
- Cross-Cloud Lakehouse: An architecture using open standards like Apache Iceberg to allow data access across AWS, Azure, and Google Cloud without moving data.
- AI-Optimized Infrastructure: Vertical integration of hardware (TPUs), data engines (BigQuery, Spark), and software to handle the high-frequency API calls generated by agent swarms.
1. The Evolution: From Intelligence to Action
The industry is transitioning from "systems of intelligence"—which provided static insights that often remained unused—to "systems of action."
- The Problem: Historically, only 10–20% of data insights were successfully productionized.
- The Solution: Agentic systems allow data to be active in a "reasoning loop," enabling AI to interact with operational, marketing, and ledger systems directly.
- Key Insight: Yasmeen Ahmad notes that "AI-ready data" is no longer just about cleanliness or lineage; it also requires context. By her estimate, data quality alone gets agents to roughly 50% accuracy. The remaining 50% comes from encoding business intuition and "hidden" context into the platform so agents can reason effectively.
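The contrast between reporting an insight and acting on it can be sketched as a small observe, reason, act loop. This is a toy illustration only: the `Observation` fields, the `min_days_of_cover` context value, and the `reorder` tool are hypothetical names chosen for the example, not part of any Google Cloud product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


@dataclass
class Observation:
    sku: str
    units_in_stock: int
    daily_demand: float


# Hypothetical "hidden" business context a dashboard would not carry:
# how many days of stock cover the business wants to keep.
BUSINESS_CONTEXT = {"min_days_of_cover": 7}


def decide(obs: Observation, context: dict) -> Optional[str]:
    """Reason step: return an action name if stock cover is too low."""
    days_of_cover = obs.units_in_stock / obs.daily_demand
    if days_of_cover < context["min_days_of_cover"]:
        return "reorder"
    return None


def reorder(obs: Observation) -> str:
    # Stand-in for a call into an operational system.
    return f"reorder placed for {obs.sku}"


def run_loop(
    observations: List[Observation],
    tools: Dict[str, Callable[[Observation], str]],
    context: dict,
) -> List[str]:
    """Observe, reason, then act by calling the tool the decision names."""
    log = []
    for obs in observations:
        action = decide(obs, context)
        if action is not None:
            log.append(tools[action](obs))
    return log


actions = run_loop(
    [Observation("A-100", 20, 5.0), Observation("B-200", 500, 10.0)],
    {"reorder": reorder},
    BUSINESS_CONTEXT,
)
```

Here the same data would produce a different decision under a different `min_days_of_cover`, which is the sense in which context, and not only clean data, drives the outcome.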
2. Semantic Understanding and the Knowledge Catalog
To bridge the gap between raw data and agentic reasoning, Google Cloud utilizes the Knowledge Catalog:
- Inferred Schema: GenAI is used to automatically generate descriptions and relationships for data, replacing tedious manual documentation.
- Unstructured Data Handling: Since enterprise documents (like thousands of PDFs) cannot fit into a model's context window, the Knowledge Catalog creates an "inferred schema" across these documents, allowing agents to access specific, relevant context without the cost of processing entire document sets.
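The idea of querying an inferred schema instead of reading every document can be sketched as follows. This is a minimal sketch under stated assumptions: the `CatalogEntry` structure, the field names, and `select_context` are hypothetical, and the metadata that GenAI would infer in the real Knowledge Catalog is simply hard-coded here.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class CatalogEntry:
    doc_id: str
    fields: Dict[str, str]  # inferred metadata (hard-coded in this sketch)
    text: str               # full document text, read only when selected


def select_context(
    catalog: List[CatalogEntry], wanted: Dict[str, str], max_chars: int
) -> List[str]:
    """Return ids of entries matching `wanted` that fit in a size budget."""
    selected, used = [], 0
    for entry in catalog:
        matches = all(entry.fields.get(k) == v for k, v in wanted.items())
        if not matches:
            continue
        if used + len(entry.text) > max_chars:
            continue  # skip documents that would exceed the budget
        selected.append(entry.doc_id)
        used += len(entry.text)
    return selected


CATALOG = [
    CatalogEntry("c1", {"doc_type": "contract", "region": "EMEA"},
                 "Carrier agreement with a late delivery penalty clause."),
    CatalogEntry("c2", {"doc_type": "invoice", "region": "EMEA"},
                 "Invoice for freight services."),
    CatalogEntry("c3", {"doc_type": "contract", "region": "EMEA"},
                 "Master services agreement. " * 10),
]

picked = select_context(
    CATALOG, {"doc_type": "contract", "region": "EMEA"}, max_chars=200
)
```

The agent filters on the small metadata records first and only pays for the text of the documents it selects, which mirrors the cost argument made above.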
3. Gemini Enterprise as the "Front Door"
Gemini Enterprise serves as the unified interface for business users, abstracting away the complexity of underlying data pipelines.
- Integration: Users can chat with business agents that pull data from BigQuery, AlloyDB, and Looker.
- Deep Research Agent: By connecting the Deep Research agent to the Knowledge Catalog, users can synthesize enterprise-specific data with real-time web data (e.g., weather or traffic patterns) to optimize operational strategies like shipping. This reduces a process that previously took weeks of IT coordination to mere minutes.
4. Multi-Cloud Strategy and Open Standards
Google Cloud is addressing the reality of multi-cloud environments through:
- Apache Iceberg: By adopting this universal open standard, Google allows users to connect to data residing in Databricks, Snowflake, or AWS S3 without proprietary lock-in.
- Cross-Cloud Interconnect: This technology mitigates the traditional challenges of latency and egress costs, allowing for sub-second access to petabytes of data across different cloud providers.
5. Scaling for "Agent Swarms"
As organizations move to agentic workflows, the volume of API calls increases significantly (10–20x per human action). To maintain efficiency:
- Engine Optimization: BigQuery processing speeds have improved by 35% with a 40% cost reduction. The managed Apache Spark service with the "Lightning Engine" is 5x faster than standard Spark.
- Vertical Integration: Google optimizes the entire stack—from the TPU silicon (separating training and inference to prevent traffic jams) to the data layer—resulting in a 230x reduction in token usage for AI inferencing over BigQuery data.
- Search and Retrieval: Google is applying its "hybrid search" and complex re-ranking algorithms (derived from Google Search) to the enterprise data stack to ensure agents are served the most relevant context, preventing them from getting "lost" in too much data.
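Hybrid search with re-ranking can be illustrated with a small sketch that ranks documents once by keyword overlap and once by vector similarity, then fuses the two rankings. The fusion step uses reciprocal rank fusion, a commonly used technique chosen here for illustration; the source does not say which algorithms Google applies. The documents, the three-dimensional vectors, and all function names are made up for the example.

```python
import math

# Toy corpus: the text and hand-made embedding vectors are assumptions.
DOCS = {
    "d1": {"text": "late shipment penalties in carrier contract",
           "vec": [0.9, 0.1, 0.0]},
    "d2": {"text": "quarterly marketing budget review",
           "vec": [0.0, 0.2, 0.9]},
    "d3": {"text": "carrier delays caused by storms",
           "vec": [0.8, 0.3, 0.1]},
}


def keyword_rank(query, docs):
    """Rank doc ids by how many query terms appear in the text."""
    terms = set(query.lower().split())
    scores = {d: len(terms & set(v["text"].split())) for d, v in docs.items()}
    return sorted(docs, key=lambda d: (-scores[d], d))


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def vector_rank(query_vec, docs):
    """Rank doc ids by cosine similarity to the query vector."""
    return sorted(docs, key=lambda d: (-cosine(query_vec, docs[d]["vec"]), d))


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse rankings: each doc scores the sum of 1 / (k + rank) over lists."""
    fused = {}
    for ranking in rankings:
        for position, doc_id in enumerate(ranking, start=1):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1.0 / (k + position)
    return sorted(fused, key=lambda d: (-fused[d], d))


def hybrid_search(query, query_vec, docs, top_n=2):
    rankings = [keyword_rank(query, docs), vector_rank(query_vec, docs)]
    return reciprocal_rank_fusion(rankings)[:top_n]


results = hybrid_search("carrier shipment delays", [1.0, 0.2, 0.0], DOCS)
```

Returning only the top fused results is what keeps an agent from being handed the unrelated marketing document, which is the "lost in too much data" problem described above.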
6. Notable Quotes
- "The whole industry's strategy to AI-ready data was quite naive three years ago... just focusing on the data layer only got you to 50% accuracy with agents. The rest of the 50% comes from great context." (Yasmeen Ahmad)
- "It's not just one agent, a monolithic agent, it's actually swarms of agents that activate to complete an intent." (Yasmeen Ahmad)
Synthesis
The transition to an Agentic Data Cloud represents a fundamental shift in how enterprises derive ROI from data. By combining open standards (Iceberg), semantic context (Knowledge Catalog), and vertically integrated infrastructure, Google Cloud is enabling a move toward "intent-driven engineering." In this new era, data practitioners shift their focus from managing individual tasks to defining objectives, while swarms of agents handle the reasoning, data wrangling, and execution across multi-cloud environments.