The next stages of AI conformance in the cloud-native, open-source world

By The New Stack

Key Concepts

  • CNCF (Cloud Native Computing Foundation): The neutral home for critical open-source infrastructure projects like Kubernetes, Prometheus, and Envoy.
  • AI Conformance Program: A new initiative by the CNCF to standardize how AI/ML workloads, particularly inference, run on Kubernetes.
  • Dynamic Resource Allocation (DRA): A Kubernetes feature that exposes accelerators (like GPUs) to a cluster in a standardized way, so workloads can request them through a common API.
  • Inference vs. Training: The shift in compute demand; by the end of 2026, two-thirds of AI compute is projected to be for inference and one-third for training.
  • LLMD: A new CNCF sandbox project that provides a practical, opinionated implementation of an inference framework and orchestration manager.
  • Invisibility Paradox: Kubernetes is becoming so ubiquitous (like electricity) that developers building on top of it are increasingly unaware of its underlying complexity, even as the growth it supports makes it more important than ever.

1. The AI Conformance Program

The CNCF launched the AI Conformance Program to ensure that the massive global infrastructure being built for AI remains portable and interoperable.

  • Objective: To create a "global footprint of infrastructure" where AI workloads can run consistently across any cloud provider or enterprise environment.
  • Current Status: The program is in its early stages, focusing on the "common denominator" of requirements. The first version centers on Dynamic Resource Allocation (DRA), allowing workloads to request specific types and quantities of accelerators in a standard way.
  • Future Roadmap: Subsequent versions will incorporate standards for networking and storage as these requirements stabilize.
  • Certification: Companies must re-certify periodically as the program matures and requirements evolve.
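
The DRA request described above ("specific types and quantities of accelerators") can be sketched as a ResourceClaim manifest built in Python. This is a minimal illustration, not the program's conformance test: the field layout follows the resource.k8s.io/v1 API as commonly documented (earlier beta versions arrange the request fields differently), and the claim name, request name, and device class "gpu.example.com" are hypothetical placeholders.

```python
import json


def build_resource_claim(name, device_class, count):
    """Return a ResourceClaim manifest asking for `count` devices of one class.

    Assumption: field names follow resource.k8s.io/v1; check them against
    the Kubernetes version actually running in the cluster.
    """
    if count < 1:
        raise ValueError("count must be at least 1")
    return {
        "apiVersion": "resource.k8s.io/v1",
        "kind": "ResourceClaim",
        "metadata": {"name": name},
        "spec": {
            "devices": {
                "requests": [
                    {
                        # Hypothetical request name, referenced by the pod spec.
                        "name": "accelerators",
                        "exactly": {
                            # The "type" of accelerator: a DeviceClass name.
                            "deviceClassName": device_class,
                            # The "quantity": an exact number of devices.
                            "allocationMode": "ExactCount",
                            "count": count,
                        },
                    }
                ]
            }
        },
    }


if __name__ == "__main__":
    # "gpu.example.com" is a placeholder device class, not a real driver.
    claim = build_resource_claim("inference-gpus", "gpu.example.com", 2)
    print(json.dumps(claim, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl, which accepts JSON as well as YAML manifests. The point of the sketch is that the same claim should be accepted unchanged by any conformant cluster that has a matching device class installed.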

2. Market Dynamics and Data

  • Compute Projections: By the end of 2026, 93 gigawatts of compute power is projected to be dedicated to AI inference. This is a reversal from three years ago, when the split was two-thirds training and one-third inference.
  • Industry Participation: Early adopters who have passed the initial conformance testing include major players like Nvidia, the "big three" cloud providers, Red Hat, and OVHcloud.
  • Motivation: Portability might seem to run against some vendors' interests, but the AI market is growing fast enough that standardization accelerates adoption for everyone rather than creating a zero-sum game.

3. Practical Implementation: The LLMD Project

To complement the high-level standards of the conformance program, the CNCF introduced LLMD into its sandbox.

  • Function: It acts as a full inference framework and orchestration manager.
  • Integration: It integrates vLLM (an open-source inference serving engine) directly into Kubernetes clusters.
  • Distinction: While the conformance program defines the rules for interoperability, LLMD provides a practical, opinionated implementation for users to deploy.

4. Addressing the Skills Gap and Operational Complexity

  • Platform Engineering: The CNCF is promoting the "platform engineering" model to help organizations manage the increasing complexity of cloud-native systems.
  • Upskilling: The foundation continues to expand its training and certification programs, including the new "Certified Platform Engineer" program.
  • Operational Agents: Jonathan Bryce highlighted the potential for "ops agents"—AI tools designed to assist with configuring and debugging systems—to help human operators scale their capacity and reduce burnout.

5. Key Quotes

  • "By the end of 2026... two-thirds of [compute] is going to be for inference and a third of it is going to be for training... 93 gigawatts is more than all other compute combined." (Jonathan Bryce)
  • "The invisibility [of Kubernetes] is only at certain layers... it is an interesting paradox to see that this massive growth... is actually making these components... more important than ever." (Jonathan Bryce)

6. Synthesis and Conclusion

The CNCF is positioning itself to manage the next wave of infrastructure demand by applying the "conformance" framework that worked for Kubernetes to the fast-growing AI/ML sector. By focusing on a standard API for accelerator management (via DRA) and supporting practical implementations like LLMD, the CNCF aims to prevent fragmentation in the AI infrastructure market. The primary takeaway is that while Kubernetes is becoming increasingly "invisible" to the average developer, its role as the foundational layer for a projected 93 gigawatts of AI inference infrastructure makes it more critical than ever for platform engineers to master its operation, security, and observability.
