The new FinOps problem isn't cloud bills
By The New Stack
Key Concepts
- FinOps: A cultural practice and discipline focused on maximizing business value from cloud investments through accountability, governance, and cost optimization.
- Agentic Era: A shift in technology where AI agents perform tasks autonomously, necessitating new management frameworks for infrastructure and spend.
- Token Economics: The study and management of costs associated with LLM (Large Language Model) usage, where consumption is measured in tokens rather than traditional compute hours.
- Deterministic vs. Agentic: The balance between rigid, rule-based systems (deterministic) and flexible, AI-driven autonomous workflows (agentic).
- Unit Economics: Measuring the cost of cloud/AI resources relative to the specific business value or output they generate.
- Right-sizing: The process of matching infrastructure resources (like Kubernetes pods or GPU instances) to the actual needs of the workload to prevent waste.
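The Unit Economics definition above reduces to a simple ratio: spend divided by the business output it produced. A minimal sketch, assuming hypothetical figures (the spend, the ticket count, and the function name `cost_per_unit` are illustrative and do not come from the talk):

```python
def cost_per_unit(total_cost_usd: float, units_delivered: int) -> float:
    """Return spend per unit of business output (e.g., per resolved ticket)."""
    if units_delivered <= 0:
        raise ValueError("units_delivered must be positive")
    return total_cost_usd / units_delivered


# Hypothetical figures, for illustration only.
monthly_ai_spend_usd = 12_000.0
support_tickets_resolved = 48_000

unit_cost = cost_per_unit(monthly_ai_spend_usd, support_tickets_resolved)
print(f"${unit_cost:.2f} per resolved ticket")
```

Tracking this ratio over time, rather than the raw bill, shows whether rising spend is buying proportionally more output.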
1. The Evolution of FinOps
FinOps is no longer just about cost reduction; it is about value realization. As cloud environments expand to include SaaS, on-premises infrastructure, and AI, the discipline has evolved into a data science problem.
- Key Shift: Moving from manual, headcount-heavy triage to automated, agent-driven optimization.
- The "Agentic" Challenge: With a reported 98% of FinOps teams now managing AI spend, the primary goal is to automate the "dumb stuff" (e.g., redundant right-sizing tickets) so teams can focus on business-level strategy.
2. AI for FinOps: Automating the Workflow
The speakers argue that AI should be treated as a "junior team member" that requires onboarding, context, and authorization.
- Methodology: Use deterministic building blocks for detection (e.g., identifying an anomaly) and agentic flows for enrichment and action (e.g., drafting a pull request).
- Human-in-the-loop: Because LLMs can hallucinate or lack context, critical actions (like terminating a server) must remain under human approval or strict deterministic guardrails.
- Example: In Google Kubernetes Engine (GKE), instead of asking an LLM to "fix the cluster," the system uses "Goldilocks" signals (P99 utilization, memory limits, etc.) to provide data-backed, deterministic recommendations that an SRE would approve.
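The split described above (deterministic detection, agentic enrichment, human approval before action) can be sketched roughly as follows. This is an illustration under stated assumptions, not the speakers' implementation: the 20% headroom, the 20% minimum-saving threshold, and all names (`Recommendation`, `recommend_cpu_request`, `draft_pull_request`, `apply_change`) are hypothetical, and the "agentic" step is a plain string template standing in for an LLM call.

```python
import math
from dataclasses import dataclass


@dataclass
class Recommendation:
    workload: str
    current_request_mcpu: int
    suggested_request_mcpu: int
    approved: bool = False  # set to True only by a human reviewer


def p99(samples):
    """Nearest-rank 99th percentile, using integer arithmetic for the rank."""
    ordered = sorted(samples)
    rank = (99 * len(ordered) + 99) // 100  # ceil(0.99 * n)
    return ordered[rank - 1]


def recommend_cpu_request(workload, current_request_mcpu, usage_mcpu,
                          headroom=1.2, min_saving=0.2):
    """Deterministic rule: suggest P99 usage plus headroom.

    Returns None when there is no data or the saving is below min_saving.
    """
    if not usage_mcpu:
        return None
    target = math.ceil(p99(usage_mcpu) * headroom)
    if target >= current_request_mcpu * (1 - min_saving):
        return None
    return Recommendation(workload, current_request_mcpu, target)


def draft_pull_request(rec):
    """Stand-in for the agentic step: turn the signal into a reviewable proposal."""
    return (f"Right-size {rec.workload}: CPU request "
            f"{rec.current_request_mcpu}m -> {rec.suggested_request_mcpu}m")


def apply_change(rec):
    """Guardrail: nothing is applied without explicit human approval."""
    if not rec.approved:
        raise PermissionError("human approval required before applying")
    return f"applied {rec.workload}"


# Synthetic usage samples in millicores, for illustration only.
usage = [200 + (i % 50) for i in range(1000)]
rec = recommend_cpu_request("checkout-api", 1000, usage)
print(draft_pull_request(rec))
```

The point of the design is that the number in the proposal comes from a reproducible rule an SRE can check, while the model only handles the wording and packaging.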
3. FinOps for AI: Managing Token Economics
As enterprises move past the "AI hype" phase into "hard ROI," managing the cost of AI systems has become critical.
- The Token Trap: Even as per-token prices fall, moving to more capable models (e.g., from "Flash" to "Pro") often raises overall spend.
- Orchestration: A key strategy is using an orchestrator layer that routes simple tasks (summarizing emails) to cheaper, smaller models (like Gemma or Flash) and complex tasks (scientific reasoning) to frontier models.
- Total Cost of Ownership (TCO): Beyond inference, organizations must account for training, storage, data adaptation layers, and the "10x rule," which the speakers attribute to a Stanford study: for every $1 spent on AI, $10 is spent on organizational change management and governance.
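The orchestration idea above can be sketched as a small router plus a per-request cost estimate. This is a minimal illustration, assuming placeholder prices and generic tier names; the figures are not real vendor pricing, and `ModelTier`, `route`, and `request_cost_usd` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_million_input: float
    usd_per_million_output: float


# Placeholder prices for illustration only, not real vendor pricing.
SMALL = ModelTier("small-model", 0.10, 0.40)
FRONTIER = ModelTier("frontier-model", 2.00, 10.00)

# Assumed task taxonomy: anything not listed here goes to the frontier tier.
SIMPLE_TASKS = {"summarize_email", "classify_ticket"}


def route(task_type: str) -> ModelTier:
    """Send simple tasks to the cheaper tier, everything else to the frontier tier."""
    return SMALL if task_type in SIMPLE_TASKS else FRONTIER


def request_cost_usd(tier: ModelTier, input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one request from its token counts."""
    return (input_tokens * tier.usd_per_million_input
            + output_tokens * tier.usd_per_million_output) / 1_000_000


routed = request_cost_usd(route("summarize_email"), 2_000, 500)
unrouted = request_cost_usd(FRONTIER, 2_000, 500)
print(f"routed: ${routed:.4f}, always-frontier: ${unrouted:.4f}")
```

With these placeholder prices the same email summary costs about 22 times more on the frontier tier, which is the kind of gap a router is meant to capture.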
4. Strategic Frameworks and Recommendations
- Organizational Change: The speakers emphasize that buying a tool (like Finout) is insufficient without a cultural shift. FinOps is an organizational problem, not a software problem.
- Onboarding Agents: Treat AI agents like new employees: provide them with specific context, authentication, and clear operational standards.
- Actionable Advice:
  - Start with the FinOps Foundation: Utilize their resources to build the necessary culture and framework.
  - Adopt a Value-First Mindset: Every technical decision should be mapped to a business outcome (e.g., revenue, productivity, or differentiation).
  - Implement Governance: Establish clear "guardrails" for AI usage to ensure that innovation does not lead to uncontrolled spending.
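One way to picture a spending guardrail is a deterministic budget check that runs before each AI request. The sketch below is an assumption-laden illustration: the 80% alert threshold, the three outcomes, and the `TokenBudget` class are all hypothetical and not taken from the talk or from any particular tool.

```python
from dataclasses import dataclass


@dataclass
class TokenBudget:
    team: str
    monthly_limit_usd: float
    spent_usd: float = 0.0
    alert_threshold: float = 0.8  # assumed: warn at 80% of the limit

    def check(self, estimated_cost_usd: float) -> str:
        """Return 'allow', 'alert', or 'block' for a request of the given estimated cost."""
        projected = self.spent_usd + estimated_cost_usd
        if projected > self.monthly_limit_usd:
            return "block"
        if projected >= self.alert_threshold * self.monthly_limit_usd:
            return "alert"
        return "allow"

    def record(self, actual_cost_usd: float) -> None:
        """Add the actual cost of a completed request to the running total."""
        self.spent_usd += actual_cost_usd


# Hypothetical team and limit, for illustration only.
budget = TokenBudget(team="search", monthly_limit_usd=1_000.0)
first = budget.check(100.0)
budget.record(750.0)
second = budget.check(100.0)
third = budget.check(300.0)
print(first, second, third)
```

Because the check is a plain rule rather than a model decision, its outcome is predictable and auditable, which matches the deterministic-guardrail approach described earlier.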
5. Notable Quotes
- "FinOps is not about saving money; it’s about making more money for the business by finding the value of your investment." — Roy, CEO of Finout.
- "Think about it as everyone is now a manager, and you have a bunch of very junior employees who just work for you." — Roy, on the role of AI agents.
- "With great power comes great responsibility... you hold the keys to the kingdom." — Batik, Google Cloud, on the responsibility of cloud engineers.
Synthesis
The transition into the agentic era requires a two-pronged approach: using AI to automate the tedious aspects of cloud management (AI for FinOps) while applying rigorous financial discipline to the consumption of AI models themselves (FinOps for AI). Success in this era depends on balancing autonomous agentic workflows with deterministic guardrails and human oversight, so that every token spent contributes to measurable business value.