NVIDIA CEO Jensen Huang Leaves Everyone SPEECHLESS (Supercut)
By Ticker Symbol: YOU
Key Concepts
- Accelerated Computing: A shift from general-purpose CPUs to specialized hardware (like GPUs) that accelerates data processing, image processing, computer graphics, and other demanding workloads, including AI.
- AI as Workers: A conceptual shift where AI is viewed not as a tool, but as an active participant that can perform tasks and increase productivity.
- Extreme Co-Design: A methodology in which chips, systems, and software are designed and optimized together to achieve significant performance gains and cost reductions.
- Vera Rubin Supercomputer: NVIDIA's third-generation rack-scale computer, featuring a cableless design, liquid cooling, and advanced processors for AI workloads.
- Context Processor: A specialized processor designed to handle the increasing amount of context (e.g., large documents, videos) that AI models need to process for more informed responses.
- BlueField-4: A revolutionary processor designed to accelerate KV caching, improving the speed and efficiency of retrieving past AI conversations and learned information.
- NVLink Switch: A high-bandwidth interconnect that connects multiple computers, designed to handle data traffic significantly exceeding global peak internet traffic.
- Spectrum-X Switch: An Ethernet switch enabling simultaneous communication between all processors without network congestion.
- Quantum Computing: A new computing paradigm that leverages quantum mechanics to solve problems intractable for classical computers.
- NVQLink: An interconnect designed for quantum computing, facilitating control, calibration, error correction, and hybrid simulations between quantum processing units (QPUs) and GPU supercomputers.
- Physical AI: AI focused on real-world interactions, requiring a multi-computer approach for training, simulation (digital twins), and operation.
- Robotics: A rapidly growing sector, with a focus on humanoid robots and wheeled robots (robo taxis) as significant future markets.
- NVIDIA ARC (Aerial Radio Network Computer): A new product line for 6G wireless telecommunications, combining CPUs, GPUs, and networking for software-defined, programmable wireless communication and AI processing.
- NVIDIA DRIVE Hyperion: An architecture enabling car companies to build "robo taxi ready" vehicles, serving as a computing platform on wheels for autonomous driving systems.
- Digital Twin: A virtual replica of a physical object or system, used for simulation, training, and design in areas like robotics and factories.
- Open Source Models: AI models that are publicly available, crucial for startups, researchers, and developers due to their reasoning capabilities, multimodality, and efficiency.
- CSP Capex: Capital expenditures by Cloud Service Providers, indicating significant investment in AI infrastructure.
- Dennard Scaling: The historical trend in which shrinking transistors also delivered proportional gains in speed and power efficiency; its end means those gains have slowed sharply even as transistor counts keep rising.
Vera Rubin: The Next Generation of AI Supercomputing
The presentation highlights NVIDIA's advancements in AI supercomputing, starting with the first AI supercomputer delivered to OpenAI in 2016, which required the design of a new chip. The core argument is that significant performance leaps and cost reductions come from extreme co-design, optimizing many chips and the software stack together, rather than relying on single-chip improvements.
Vera Rubin System Architecture and Capabilities
- Third Generation NVLink Rack Scale Computer: The Vera Rubin system is introduced as NVIDIA's third-generation NVLink rack-scale computer.
- Evolution of Design: The first generation was built around the GB200, the second generation refined that design, and the third generation, Vera Rubin, is completely cableless and 100% liquid-cooled.
- Production Timeline: While GB300s are currently being shipped, Vera Rubin is being prepared for production, expected around this time next year or slightly earlier.
- Performance Leap: Vera Rubin delivers 100 petaflops, a roughly 100x performance increase over the DGX-1 delivered to OpenAI nine years earlier, meaning one Vera Rubin rack can replace approximately 25 racks of those older supercomputers.
- Compute Tray: The system features an easily installable compute tray.
- Context Processor: A new processor is introduced to handle the increasing context requirements of AI, enabling models to process large amounts of data (PDFs, papers, videos) before answering questions.
- Node Configuration: Each node within Vera Rubin includes:
- Eight ConnectX-9 SuperNICs
- BlueField-4 data processor
- Two Vera CPUs
- Four Rubin packages or eight Rubin GPUs.
- BlueField-4's Revolutionary Role: This processor accelerates KV caching, which is how an AI model retains past conversations and previously processed information. Retrieving that cached data is becoming a bottleneck, so models take longer and longer to reload what they have already seen; BlueField-4 aims to revolutionize this by providing faster memory access (see the sketch after this list).
- Networking Infrastructure:
- NVLink Switch: Provides several times the bandwidth of the entire world's peak internet traffic, enabling high-speed communication between GPUs.
- Spectrum-X Switch: An Ethernet switch designed for simultaneous communication between all processors without network congestion.
- Quantum Switch: An InfiniBand switch, so the platform supports both Ethernet and InfiniBand networking standards.
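To make the context-processor and KV-cache points above concrete, here is a minimal sketch of a transformer-style attention loop in plain NumPy. It is illustrative only, not NVIDIA's implementation: the "prefill" phase ingests a long context (the kind of work the context processor is described as handling), and each decode step reuses the stored key/value cache (the data BlueField-4 is described as serving faster). All sizes and names are assumptions.

```python
import numpy as np

def attention(q, K, V):
    """Single-head scaled dot-product attention over cached keys/values."""
    scores = q @ K.T / np.sqrt(K.shape[-1])
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ V

d_model = 64
rng = np.random.default_rng(0)

# --- Prefill: process a long context (PDFs, papers, video features) in one pass.
#     This compute-heavy step is what a dedicated context processor targets.
context_tokens = rng.normal(size=(4096, d_model))               # illustrative context length
K_cache = context_tokens @ rng.normal(size=(d_model, d_model))  # cached keys
V_cache = context_tokens @ rng.normal(size=(d_model, d_model))  # cached values

# --- Decode: each new token attends over the stored KV cache instead of
#     re-reading the whole context. Serving this growing cache quickly is the
#     bottleneck BlueField-4 is said to address.
for step in range(8):
    q = rng.normal(size=(d_model,))          # query for the new token
    out = attention(q, K_cache, V_cache)
    new_k = out                              # stand-in for the new token's key
    new_v = out                              # stand-in for the new token's value
    K_cache = np.vstack([K_cache, new_k])    # cache grows with every generated token
    V_cache = np.vstack([V_cache, new_v])

print("KV cache size after decoding:", K_cache.shape)
```

The key observation is that the cache grows with every generated token, so the memory traffic to fetch it dominates as conversations and contexts get longer.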
Palantir Partnership and Enterprise AI
NVIDIA announces a partnership with Palantir, highlighting Palantir's Ontology platform, which transforms data, human judgment, and information into business insights. This collaboration aims to accelerate Palantir's data processing capabilities for government, national security, and enterprises, enabling "speed of light" data processing and insight extraction.
The Era of Accelerated Computing and the End of Dennard Scaling
The presentation emphasizes that NVIDIA invented a new computing model to solve problems beyond the capabilities of general-purpose computers. A key observation is that Dennard scaling ended nearly a decade ago, meaning transistor performance and power-efficiency gains have slowed dramatically even as transistor counts continue to rise. NVIDIA has been advancing accelerated computing for 30 years, starting with the invention of the GPU and the CUDA programming model. The strategy is to combine parallel computing (GPUs) with sequential processing (CPUs) to extend computing capabilities, as sketched below.
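As a rough analogy for that CPU-plus-accelerator split, the sketch below keeps sequential control flow and branching on the CPU while expressing the numerically heavy step as one data-parallel array operation; with a library such as CuPy the same array expression could be dispatched to a GPU. The workload itself is an illustrative assumption, not anything from the talk.

```python
import numpy as np

def simulate_step(state, dt):
    """Data-parallel update: one array expression over a million elements.
    This is the part an accelerator is built for; swapping `np` for `cupy`
    would run the same expression on a GPU."""
    return state + dt * np.sin(state)

def run(num_steps=100, n=1_000_000, dt=0.01):
    state = np.linspace(0.0, 1.0, n)
    for step in range(num_steps):          # sequential control flow stays on the CPU
        state = simulate_step(state, dt)
        if step % 25 == 0:                 # branching and bookkeeping: CPU-style work
            print(f"step {step}: mean state = {state.mean():.4f}")
    return state

if __name__ == "__main__":
    run()
```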
NVIDIA ARC: Revolutionizing Wireless Telecommunications
NVIDIA is launching a new product line called NVIDIA ARC (Aerial Radio Network Computer) to address the decline of American leadership in wireless technology. ARC is built on three core technologies:
- Grace CPU
- Blackwell GPU
- Mellanox ConnectX networking
ARC aims to create a software-defined programmable computer capable of wireless communication and AI processing simultaneously. This will enable upgrading millions of base stations globally with 6G and AI, improving spectral efficiency through AI-driven real-time beamforming adjustments based on environmental factors. This new technology will also enable an edge industrial robotics cloud built on top of the wireless telecommunications network.
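To make the beamforming claim concrete, here is a minimal sketch of classical steering-vector beamforming for a small antenna array, showing how pointing the beam at a user raises array gain and spectral efficiency. The array geometry, SNR, and angles are illustrative assumptions; it only hints at the kind of per-slot weight adaptation an AI model running on ARC might perform.

```python
import numpy as np

def steering_vector(n_antennas, spacing_wavelengths, angle_deg):
    """Array response of a uniform linear array toward a given angle."""
    n = np.arange(n_antennas)
    phase = 2j * np.pi * spacing_wavelengths * n * np.sin(np.radians(angle_deg))
    return np.exp(phase) / np.sqrt(n_antennas)

n_antennas = 16
spacing = 0.5                      # half-wavelength spacing (assumed)
user_angle = 23.0                  # direction an AI model would estimate in real time

# Beamforming weights matched to the user's direction (conjugate beamforming).
w = np.conj(steering_vector(n_antennas, spacing, user_angle))

# Compare received power toward the user with and without beamforming.
h = steering_vector(n_antennas, spacing, user_angle)   # channel toward the user
gain_beamformed = np.abs(w @ h) ** 2
gain_single_antenna = np.abs(h[0]) ** 2

snr_db = 5.0                       # assumed per-antenna SNR

def spectral_efficiency(gain):     # Shannon capacity in bits/s/Hz
    return np.log2(1 + gain * 10 ** (snr_db / 10))

print(f"array gain: {gain_beamformed / gain_single_antenna:.1f}x")
print(f"spectral efficiency: {spectral_efficiency(gain_single_antenna):.2f} -> "
      f"{spectral_efficiency(gain_beamformed):.2f} bits/s/Hz")
```

In a real system the user angle and channel change constantly; the pitch here is that an AI model co-located with the radio can re-estimate them and update the weights in real time.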
Quantum Computing Integration
The presentation touches on progress in quantum computing, with the announcement that it is now possible to create one logical qubit that is coherent, stable, and error-corrected. NVIDIA recognizes the necessity of connecting quantum computers directly to GPU supercomputers for error correction, AI calibration, and control. NVQLink is announced as an interconnect that enables quantum computer control, calibration, error correction, and hybrid simulations between QPUs and GPU supercomputers. The architecture is designed to scale from current qubit counts to hundreds of thousands in the future. Seventeen quantum computing companies and DOE labs are supporting NVQLink.
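The control loop this interconnect enables can be sketched abstractly: measurement syndromes stream off the QPU, a classical decoder (the kind of workload a GPU supercomputer would run) infers the likely error, and a correction is sent back within the qubits' coherence window. The code below is a toy repetition-code decoder in plain Python, not the NVQLink API; every interface here is hypothetical.

```python
import random

def measure_syndrome(data_bits):
    """Parity checks between neighbouring qubits of a 3-qubit repetition code."""
    return (data_bits[0] ^ data_bits[1], data_bits[1] ^ data_bits[2])

def decode(syndrome):
    """Classical decoding step (the GPU-side workload): map syndrome -> bit to flip."""
    lookup = {(0, 0): None, (1, 0): 0, (1, 1): 1, (0, 1): 2}
    return lookup[syndrome]

random.seed(0)
logical_zero = [0, 0, 0]

for cycle in range(5):
    qubits = logical_zero[:]
    if random.random() < 0.5:               # inject a single bit-flip error
        qubits[random.randrange(3)] ^= 1

    syndrome = measure_syndrome(qubits)     # would arrive over the interconnect
    flip = decode(syndrome)                 # classical error-correction decision
    if flip is not None:
        qubits[flip] ^= 1                   # correction sent back to the QPU

    print(f"cycle {cycle}: syndrome={syndrome}, corrected state={qubits}")
```

Real codes (and real decoders) are vastly larger, which is why the round trip between QPU and GPU has to be both fast and high-bandwidth.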
AI: Beyond Chatbots to "Workers"
The definition of AI is explored, moving beyond chatbots like ChatGPT. The speaker posits that while past software industries focused on creating tools (e.g., Excel, Word), AI represents workers that can use tools and enhance productivity. This distinction opens up vast economic opportunities.
CSP Capex and Platform Shifts
The presentation shows a chart of CSP (Cloud Service Provider) capex, indicating significant investment from major players like Amazon, CoreWeave, Google, Meta, Microsoft, and Oracle. This investment is driven by two simultaneous platform shifts:
- General Purpose Computing to Accelerated Computing: NVIDIA's GPUs are highlighted as the only solution capable of handling all accelerated computing tasks, including AI, unlike ASICs, which may only do AI.
- Classical Handwritten Software to Artificial Intelligence: The shift towards AI-driven development.
NVIDIA's Business Growth and Blackwell/Rubin Pipeline
NVIDIA is experiencing extraordinary growth driven by these platform shifts. The company has visibility into half a trillion dollars of cumulative Blackwell and early Rubin ramps through 2026, roughly five times the comparable Hopper ramp. Six million Blackwells have already been shipped in the first few quarters of production.
The Importance of Open Source Models
The rise of open-source models is emphasized as a critical development. These models are becoming highly capable due to:
- Reasoning capabilities
- Multimodality
- Efficiency through distillation (see the sketch below)
Open-source models are now the "lifeblood of startups" as they allow domain expertise and specific use cases to be embedded into models. NVIDIA is dedicating itself to leading in open-source contributions, with 23 models on leaderboards across various domains (language, physical AI, robotics, biology). They claim the number one models in speech, reasoning, and physical AI.
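The "efficiency through distillation" point refers to training a small student model to match a large teacher's output distribution. Below is a minimal, framework-free sketch of the standard distillation loss (softened cross-entropy/KL between teacher and student logits); the temperature and logit values are illustrative assumptions, not details from the talk.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    z = logits / temperature
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence between softened teacher and student distributions.
    The soft labels transfer information about near-miss classes,
    letting a much smaller model approximate the teacher's behaviour."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    return float(np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student))))

# Illustrative logits for one prediction step.
teacher = np.array([4.0, 1.5, 0.2, -1.0])   # large open model's output
student = np.array([2.0, 1.0, 0.5, -0.5])   # small, efficient model's output

print(f"distillation loss: {distillation_loss(teacher, student):.4f}")
```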
Physical AI: A Three-Computer Approach
Physical AI, which deals with real-world interactions, requires a three-computer system:
- Training Computer: For training the AI model (e.g., Grace Blackwell NVLink72).
- Simulation Computer (Omniverse Computer): For creating digital twins of robots and factories, enabling AI to learn in a simulated environment. This computer needs to excel at generative AI, computer graphics, sensor simulation, ray tracing, and signal processing.
- Robotic Computer: For operating the robot in the real world (e.g., Jetson Thor robotics computer for self-driving cars or agile robots).
All three computers run CUDA, enabling advancements in physical AI. The complexity of software required for these systems necessitates digital twin simulations for design, planning, and operation.
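The three-computer split can be read as a pipeline: a policy is trained on the training computer, validated against a digital twin on the simulation computer, and only then deployed to the robot computer. The skeleton below illustrates that flow under stated assumptions; the class names and methods are hypothetical placeholders, not NVIDIA APIs.

```python
import random

class DigitalTwin:
    """Stand-in for an Omniverse-style simulated environment (assumed interface)."""
    def rollout(self, policy, episodes=100):
        # Score the policy in simulation; here a toy proxy for success rate.
        return sum(policy(step) == 1 for step in range(episodes)) / episodes

def train_policy(seed):
    """Stand-in for training on a Grace Blackwell-class training computer."""
    rng = random.Random(seed)
    threshold = rng.random()
    return lambda step: 1 if (step % 10) / 10 >= threshold else 0

def deploy_to_robot(policy):
    """Stand-in for shipping the policy to a Jetson Thor-class robot computer."""
    print("policy deployed to robot computer")

twin = DigitalTwin()
best_policy, best_score = None, -1.0
for seed in range(5):                       # train several candidates on the training computer
    policy = train_policy(seed)
    score = twin.rollout(policy)            # validate each one against the digital twin
    if score > best_score:
        best_policy, best_score = policy, score

print(f"best simulated success rate: {best_score:.2f}")
deploy_to_robot(best_policy)                # only the sim-validated policy ships
```

The point of the structure is that nothing reaches the physical robot until it has been exercised against the digital twin.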
Case Study: Figure AI and Disney Research
- Figure AI: A company founded three and a half years ago, now valued at nearly $40 billion, is working with NVIDIA on training AI, simulating robots, and utilizing the robotic computer for their humanoid robots. Humanoid robots are predicted to be a massive consumer electronics and industrial equipment market.
- Disney Research: NVIDIA is collaborating with Disney Research on a new simulation platform called Newton, enabling robots to learn in physically aware environments. This technology is expected to drive the development of digital twins for factories, warehouses, and surgical rooms, leading to the "largest consumer electronics product line in the world."
Robo Taxis and NVIDIA DRIVE Hyperion
Robo taxis are presented as AI chauffeurs, robots on wheels, now at an inflection point. NVIDIA is announcing NVIDIA DRIVE Hyperion, an architecture that allows all car companies to build "robo taxi ready" vehicles.
- Sensor Suite: Includes surround cameras, radars, and LiDAR for high-level perception and redundancy.
- Industry Adoption: Hyperion is designed into Lucid, Mercedes-Benz, and Stellantis vehicles, with more to come.
- Developer Ecosystem: A standard chassis allows AV system developers (e.g., Waymo, Aurora, Motional) to deploy their AI.
- Market Potential: Humans drive a trillion miles per year, roughly 100 million cars are made annually, and there are 50 million taxis worldwide, all of which will be augmented by robo taxis.
- Uber Partnership: NVIDIA is partnering with Uber to connect Hyperion cars into a global network, creating a new computing platform for Uber.
Conclusion: Two Platform Transitions Driving Growth
The presentation concludes by reiterating the core drivers of NVIDIA's growth:
- Platform Transition 1: General Purpose Computing to Accelerated Computing: Enabled by NVIDIA CUDA and CUDA X libraries, addressing virtually every industry.
- Platform Transition 2: Classical Software to Artificial Intelligence: The shift towards AI development.
These two simultaneous transitions are creating an inflection point and driving incredible growth. The presentation also briefly recaps advancements in quantum computing, open models, enterprise solutions (CrowdStrike, Palantir), robotics, 6G (NVIDIA ARC), and autonomous vehicles (Hyperion).