Nvidia CEO Jensen Huang unveils new AI technology for autonomous driving
By Yahoo Finance
NVIDIA’s Advancement in Physical AI & Autonomous Vehicles
Key Concepts:
- Physical AI: Artificial intelligence applied to understanding and interacting with the physical world.
- Synthetic Data Generation: Creating artificial datasets to train AI models, particularly useful when real-world data is scarce or expensive to obtain.
- Foundation Model (Cosmos): A large AI model pre-trained on vast datasets, capable of adapting to various downstream tasks.
- Long Tail of Driving: The effectively unbounded set of rare and unpredictable scenarios encountered in real-world driving.
- End-to-End Training: Training an AI model directly from raw sensor input to final output (e.g., camera input to steering control).
- AV Stack: The complete software and hardware system enabling autonomous vehicle functionality.
- Dual Orin/Thor: Dual-chip configurations of NVIDIA’s Orin and Thor processors, which are designed for robotic systems and emphasize safety and performance.
- Alpio: NVIDIA’s autonomous vehicle AI, trained with both human demonstration and Cosmos-generated data.
1. The Challenge of Data in Physical AI
The core challenge in developing physical AI lies in acquiring sufficient and diverse training data. Unlike language models trained on readily available text, physical AI requires data representing the complexities of the real world. Collecting this data through real-world interactions is slow, costly, and inherently limited in its coverage of all possible scenarios. The speaker emphasizes the need to move from relying solely on real-world data to leveraging synthetic data generation. This is crucial for capturing the “long tail” of unpredictable events.
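To make the coverage limit concrete, here is a minimal back-of-the-envelope sketch. It is not from the talk: it assumes each mile is an independent trial and that a rare event has a fixed per-mile rate, and both function names are hypothetical.

```python
import math


def prob_seen_at_least_once(rate_per_mile: float, miles: float) -> float:
    """Chance of observing an event at least once in `miles` miles.

    Assumption: miles are independent trials with a constant event rate.
    """
    # log1p keeps precision when rate_per_mile is very small.
    return 1.0 - math.exp(miles * math.log1p(-rate_per_mile))


def miles_for_coverage(rate_per_mile: float, target_prob: float) -> float:
    """Miles needed to observe the event at least once with `target_prob`."""
    return math.log(1.0 - target_prob) / math.log1p(-rate_per_mile)


if __name__ == "__main__":
    rate = 1e-6  # illustrative: one occurrence per million miles
    print(prob_seen_at_least_once(rate, 1_000_000))
    print(miles_for_coverage(rate, 0.95))
```

Under these assumptions, a one-in-a-million-miles event is seen in only about 63% of million-mile datasets, and roughly three million miles are needed to see it once with 95% probability. Every additional rare scenario repeats that cost, which is the argument for generating such cases synthetically.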
2. Introducing NVIDIA Cosmos: Turning Compute into Data
NVIDIA Cosmos is presented as a solution to this data bottleneck. It is an “open frontier world foundation model for physical AI,” pre-trained on internet-scale video, driving and robotics data, and 3D simulations. Cosmos doesn’t just process data; it generates it, producing physically plausible, realistic video from 3D scene descriptions, driving telemetry, and scenario prompts. This makes it possible to generate data selectively, targeting the scenarios an AI most needs for training. The speaker highlights Cosmos’s ability to align language, images, 3D representations, and actions, enabling skills like generation, reasoning, and trajectory prediction. Cosmos has been downloaded millions of times and is being used globally.
3. Cosmos Capabilities & Examples
Cosmos’s functionality is demonstrated through several examples:
- Traffic Simulator Enhancement: Output from a basic traffic simulator can be fed into Cosmos to generate more realistic and diverse surround video for AI training.
- Scenario Generation: Cosmos can create “edge cases” – rare and challenging scenarios – to test and improve AI robustness.
- Interactive Simulations: Developers can run closed-loop simulations where AI actions trigger realistic responses from the simulated world.
- Reasoning & Prediction: Cosmos can analyze scenarios, break them down into familiar physical interactions, and predict potential outcomes.
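The closed-loop idea in the list above can be sketched with a toy example. This is a minimal illustration, not Cosmos or any NVIDIA API: the one-dimensional car-following world, the `policy` controller, and all gains and limits are assumptions chosen for readability.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class WorldState:
    gap_m: float       # distance to the lead vehicle, metres
    ego_speed: float   # ego vehicle speed, m/s
    lead_speed: float  # lead vehicle speed, m/s


def policy(state: WorldState) -> float:
    """Hypothetical driving policy: acceleration targeting a 2-second gap."""
    desired_gap = 2.0 * state.ego_speed
    accel = 0.5 * (state.gap_m - desired_gap) + (state.lead_speed - state.ego_speed)
    return max(-6.0, min(2.0, accel))  # assumed actuator limits, m/s^2


def step_world(state: WorldState, accel: float, dt: float = 0.1) -> WorldState:
    """Toy simulator: the world responds to the action the policy chose."""
    ego_speed = max(0.0, state.ego_speed + accel * dt)
    gap = state.gap_m + (state.lead_speed - ego_speed) * dt
    return WorldState(gap, ego_speed, state.lead_speed)


def run_closed_loop(state: WorldState, steps: int = 300) -> List[WorldState]:
    """Alternate policy and simulator so each action shapes the next input."""
    history = [state]
    for _ in range(steps):
        state = step_world(state, policy(state))
        history.append(state)
    return history


if __name__ == "__main__":
    # Ego approaches a slower lead vehicle and must brake to a safe gap.
    final = run_closed_loop(WorldState(gap_m=30.0, ego_speed=20.0, lead_speed=10.0))[-1]
    print(final)
```

The point of the loop is that the policy never sees a fixed recording: each observation depends on its own earlier actions, so mistakes compound or get corrected the way they would on the road. In the setup the talk describes, a generative world model would take the place of `step_world`.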
4. Alpio: The First “Thinking” Autonomous Vehicle AI
Building on Cosmos, NVIDIA has developed Alpio, described as the “world’s first thinking reasoning autonomous vehicle AI.” Alpio is trained “end-to-end,” directly from camera input to vehicle actuation (steering, brakes, acceleration). Its training data comprises:
- Human Demonstration Data: Miles driven by human drivers.
- Cosmos-Generated Data: Synthetic data created by Cosmos to augment the real-world data.
- Carefully Labeled Data: Hundreds of thousands of examples meticulously labeled to teach the car how to drive.
A key feature of Alpio is its ability to reason about its actions. Before executing a maneuver, it articulates the action, the reasoning behind it, and the predicted trajectory. This reasoning capability is crucial for handling the “long tail” of driving scenarios.
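One way to picture that three-part output is as a single record. The sketch below is hypothetical: the `DrivingDecision` class, its field names, and the ego-frame waypoint convention are illustrative assumptions, not the model's actual output format.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed convention: (x, y) offsets in metres in the ego vehicle's frame.
Waypoint = Tuple[float, float]


@dataclass(frozen=True)
class DrivingDecision:
    """Hypothetical container for what the model states before it acts."""
    action: str                 # what the vehicle is about to do
    reasoning: str              # why it chose that action
    trajectory: List[Waypoint]  # the path it predicts it will follow

    def __post_init__(self) -> None:
        if not self.trajectory:
            raise ValueError("a decision needs at least one predicted waypoint")

    def describe(self) -> str:
        """Render the decision as human-readable text."""
        end_x, end_y = self.trajectory[-1]
        return (
            f"Action: {self.action}\n"
            f"Reasoning: {self.reasoning}\n"
            f"Trajectory: {len(self.trajectory)} waypoints, "
            f"ending at ({end_x:.1f}, {end_y:.1f}) m"
        )


if __name__ == "__main__":
    decision = DrivingDecision(
        action="slow down",
        reasoning="pedestrian standing near the crosswalk ahead",
        trajectory=[(0.0, 0.0), (5.0, 0.0), (8.0, 0.0)],
    )
    print(decision.describe())
```

Keeping the explanation and the planned path in the same record as the action means a reviewer or a downstream safety check can compare what the system said it would do with what it actually did.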
5. Addressing the Long Tail of Driving with Reasoning
The speaker emphasizes that it’s impossible to collect real-world data for every conceivable driving scenario. However, most complex scenarios can be decomposed into simpler, more familiar situations that an AI can understand. Alpio’s reasoning ability allows it to break down these complex scenarios and apply its existing knowledge. A demonstration video shows Alpio navigating various challenging situations while explaining its actions in real time.
6. NVIDIA’s Full-Stack Approach & Safety Certification
NVIDIA has invested eight years in building a complete AI stack for autonomous vehicles, described as a “five-layer cake”:
- Layer 1 (Base): The car itself.
- Layer 2: Chips (GPUs, CPUs, networking chips).
- Layer 3: Infrastructure (Omniverse, Cosmos).
- Layer 4: Models (Alpio).
- Layer 5: Application (Mercedes-Benz integration).
The Mercedes-Benz CLA, powered by this stack, has achieved the highest safety rating from NCAP. NVIDIA has implemented a dual-stack safety system:
- Alpio Stack: The primary AI driving system, leveraging reasoning and prediction.
- Classical AV Stack: A traditional, safety-certified autonomous driving system that acts as a fallback in uncertain situations.
A “policy and safety evaluator” determines which stack is best suited for a given scenario, ensuring a high level of safety. The entire system is designed with diversity and redundancy. Every line of code and the chip itself are safety certified.
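The arbitration described above can be sketched as a simple selection rule. This is a minimal illustration under stated assumptions: the `Plan` type, the confidence score, the 0.8 threshold, and the speed-limit envelope are all hypothetical, and the talk does not say how the actual evaluator makes its choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    source: str              # which stack produced the plan
    target_speed_mps: float  # commanded speed, m/s
    confidence: float        # assumed self-reported confidence in [0, 1]


def within_envelope(plan: Plan, speed_limit_mps: float) -> bool:
    """Assumed safety envelope: a non-negative speed at or below the limit."""
    return 0.0 <= plan.target_speed_mps <= speed_limit_mps


def select_plan(
    primary: Plan,
    fallback: Plan,
    speed_limit_mps: float,
    min_confidence: float = 0.8,  # hypothetical threshold
) -> Plan:
    """Use the primary plan only when it is confident and inside the envelope."""
    if primary.confidence >= min_confidence and within_envelope(primary, speed_limit_mps):
        return primary
    return fallback


if __name__ == "__main__":
    ai_plan = Plan("reasoning stack", target_speed_mps=12.0, confidence=0.55)
    classical_plan = Plan("classical stack", target_speed_mps=8.0, confidence=1.0)
    print(select_plan(ai_plan, classical_plan, speed_limit_mps=13.9).source)
```

The design choice worth noting is the default: the fallback is returned unless the primary plan passes every check, so a missing or low confidence value fails toward the certified stack.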
7. Open Ecosystem & Future Outlook
Alpio is being open-sourced to encourage wider adoption and innovation. NVIDIA envisions a future where every car is AI-powered, either as a robo-taxi, a self-driving personal vehicle, or a vehicle capable of both autonomous and manual driving. The company anticipates that the autonomous vehicle market will become a “giant business,” with opportunities for companies to utilize different parts of the NVIDIA stack – from chips to full-stack integration. The speaker predicts that a very large percentage of the world’s cars will be autonomous or highly autonomous within the next 10 years.
Notable Quotes:
- “The ChatGPT moment for physical AI is nearly here.”
- “Cosmos turns compute into data.”
- “It’s impossible for us to simply collect every single possible scenario for everything that could ever happen.”
- “Every single car will have autonomous vehicle capability.”
- “Every single car will be AI powered.”
Data & Statistics:
- Cosmos has been downloaded millions of times.
- The Mercedes-Benz CLA, powered by the NVIDIA stack, received the highest safety rating from NCAP.
- The AV team consists of several thousand people.
- NVIDIA anticipates the autonomous vehicle market will be a “giant business.”
Conclusion:
NVIDIA is positioning itself as a leader in physical AI, particularly in autonomous vehicles. It addresses the data challenge with synthetic data generated by Cosmos and the long tail of driving with a reasoning-based AI, Alpio. The open-source approach and full-stack integration strategy are designed to accelerate innovation and widespread adoption of this technology. The emphasis on safety, with a dual-stack system and comprehensive certification, underscores NVIDIA’s commitment to building trustworthy autonomous systems.