
DeepSeek Just Started a Global AI War And Exposed GPT-5.6

By AI Revolution



Key Concepts

DeepSeek V4: A powerful, open-source Chinese Large Language Model (LLM) known for high reasoning capabilities and aggressive cost-efficiency.
Token Maxing: The trend of enterprises rapidly increasing AI usage, leading to massive consumption of API tokens.
Jevons Paradox: An economic principle where increased efficiency in resource use leads to higher overall consumption.
Reference Gap: A limitation in multimodal AI where models lose track of specific objects within an image during reasoning.
Visual Primitives: A methodology using coordinates and bounding boxes to anchor AI reasoning to specific visual elements.
Stochasticity: The probabilistic nature of LLMs, meaning single-instance performance is not a reliable benchmark.

1. The Rise of DeepSeek V4 and Market Disruption

DeepSeek has introduced V4, a model that has fundamentally altered the AI landscape by forcing a "pricing war."

Cost Efficiency: DeepSeek slashed API prices by up to 90%. For example, V4 Pro input costs dropped from ~14.5 cents to 3.6 cents per million tokens, a cut of roughly 75%.
Hardware Agnostic: Unlike models heavily reliant on Nvidia’s CUDA ecosystem, V4 is validated on both Nvidia and Huawei Ascend processors, signaling a shift toward a self-sustaining Chinese AI ecosystem.
Strategic Positioning: The model is open-source, allowing for modification and integration into local Chinese infrastructure, which appeals to companies looking for alternatives to US-based closed systems (OpenAI, Anthropic, Google).
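The price figures quoted above can be checked with a few lines of arithmetic. The sketch below computes the percentage reduction from the two quoted prices and applies both prices to a workload. The one-billion-token monthly workload and the function names are assumptions made for illustration, not figures from the video.

```python
def percent_reduction(old_price: float, new_price: float) -> float:
    """Return the percentage drop from old_price to new_price."""
    if old_price <= 0:
        raise ValueError("old_price must be positive")
    return (old_price - new_price) / old_price * 100.0


def input_cost_usd(tokens: int, cents_per_million: float) -> float:
    """Cost in US dollars for `tokens` input tokens at a per-million price in cents."""
    return tokens / 1_000_000 * cents_per_million / 100.0


OLD_CENTS = 14.5  # approximate pre-cut price quoted in the summary
NEW_CENTS = 3.6   # post-cut price quoted in the summary

reduction = percent_reduction(OLD_CENTS, NEW_CENTS)

# Hypothetical workload: one billion input tokens per month.
monthly_tokens = 1_000_000_000
old_bill = input_cost_usd(monthly_tokens, OLD_CENTS)
new_bill = input_cost_usd(monthly_tokens, NEW_CENTS)

print(f"reduction: {reduction:.1f}%")
print(f"monthly bill: ${old_bill:.2f} -> ${new_bill:.2f}")
```

On these numbers the quoted example is a cut of about 75%, which sits inside the "up to 90%" headline range rather than at its top.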

2. Multimodal Innovation: "Thinking with Visual Primitives"

DeepSeek, in collaboration with Peking University and Tsinghua University, released research addressing the "reference gap" in multimodal AI.

The Problem: Traditional models focus on the "perception gap" (seeing more pixels). However, they often fail at tasks requiring stable object tracking, such as counting crowds or navigating mazes, because they lose track of specific references.
The Solution: The model uses "visual primitives" (points and bounding boxes) as active reasoning tools rather than just final outputs. By anchoring reasoning to coordinates, the model maintains focus on specific objects.
Efficiency: This approach is reported to need far less visual memory, approx. 90 entries compared with 870 for Claude and 1,100 for Gemini, or roughly a tenth as many. The video credits this with faster, more accurate performance in real-time applications like robotics.
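To make the idea of anchoring to coordinates concrete, here is a minimal sketch of a counting task in which every object is tied to a bounding box. It is an illustration only, not DeepSeek's method: the `Box` format, the overlap threshold, and the sample detections are all assumptions. A candidate is counted only if it does not overlap a box that is already anchored, so the same object is never counted twice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box in pixel coordinates (hypothetical format)."""
    x1: float
    y1: float
    x2: float
    y2: float

    def area(self) -> float:
        # Degenerate boxes (x2 < x1 or y2 < y1) have zero area.
        return max(0.0, self.x2 - self.x1) * max(0.0, self.y2 - self.y1)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes."""
    inter = Box(max(a.x1, b.x1), max(a.y1, b.y1),
                min(a.x2, b.x2), min(a.y2, b.y2)).area()
    union = a.area() + b.area() - inter
    return inter / union if union > 0 else 0.0


def anchor_distinct(candidates: list[Box], threshold: float = 0.5) -> list[Box]:
    """Keep a candidate only if it does not overlap an already anchored box."""
    anchored: list[Box] = []
    for box in candidates:
        if all(iou(box, kept) < threshold for kept in anchored):
            anchored.append(box)
    return anchored


# Made-up detections: the second box re-describes the first object.
candidates = [
    Box(10, 10, 50, 50),
    Box(12, 11, 51, 49),
    Box(100, 20, 140, 60),
]
anchored = anchor_distinct(candidates)
print(len(anchored))
```

The list of anchored boxes plays the role of a stable reference set: later reasoning steps can refer to an object by its coordinates instead of re-describing it in words.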

3. OpenAI’s Current State: Chaos and Competition

OpenAI is currently navigating a period of both rapid expansion and technical oddities:

The "Goblin" Bug: GPT-5.5 exhibited a strange behavioral quirk, frequently referencing "goblins, gremlins, and trolls" in unrelated contexts. Internal system prompts revealed that OpenAI had to explicitly ban these terms, though the model continued to struggle with the constraint.
Codex Evolution: OpenAI is pivoting toward "super agents" with the Codex app, designed to automate workflows across email, calendars, and spreadsheets.
GPT-5.6 Speculation: Developers identified "GPT 5.6" in backend logs. While this is likely an early routing test or canary deployment, its appearance suggests OpenAI may be accelerating its development cycle in response to competitive pressure from DeepSeek.

4. Enterprise Adoption and "Token Maxing"

The shift toward cheaper models is changing corporate behavior:

Usage Statistics: Companies like Visa are consuming trillions of tokens monthly.
Behavioral Shift: As costs drop, companies move from experimental AI use to full-scale workflow automation. The "Jevons Paradox" is in effect: as AI becomes cheaper, the total volume of usage increases, making cost-efficiency the primary competitive advantage for model providers.
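The Jevons effect described above can be sketched with a constant-elasticity demand model. This is a textbook simplification chosen for illustration, and the baseline usage, prices, and elasticity values below are assumed, not measured. Token volume rises after any price cut when elasticity is positive, while total spend rises only when elasticity exceeds 1.

```python
def usage_after_price_change(base_usage: float, old_price: float,
                             new_price: float, elasticity: float) -> float:
    """Constant-elasticity demand: usage scales with (new/old price) ** -elasticity."""
    return base_usage * (new_price / old_price) ** (-elasticity)


def total_spend(usage_millions: float, price_per_million: float) -> float:
    """Spend is usage (in millions of tokens) times the per-million price."""
    return usage_millions * price_per_million


base_usage = 1_000.0  # hypothetical: millions of tokens per month
old_price = 1.00      # hypothetical price units per million tokens
new_price = 0.25      # a 75% price cut

baseline_spend = total_spend(base_usage, old_price)
results = {}
for elasticity in (0.5, 1.0, 1.5):  # assumed values, not measured
    usage = usage_after_price_change(base_usage, old_price, new_price, elasticity)
    spend = total_spend(usage, new_price)
    results[elasticity] = (usage, spend)
    print(f"elasticity={elasticity}: usage x{usage / base_usage:.1f}, "
          f"spend {spend:.0f} (baseline {baseline_spend:.0f})")
```

Under these assumptions a 75% price cut doubles usage at elasticity 0.5 and multiplies it by eight at elasticity 1.5. Only in the latter case does the provider's revenue from that customer grow despite the lower price.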

5. Leadership and Talent Retention

DeepSeek’s Growth: The research team grew by 27% (from 212 to 270 members) between December and the V4 launch.
Strategic Transparency: Senior researcher Chen Deli has become a public face for the company, emphasizing "long-termism" and the necessity of public discourse regarding AI-driven job displacement.
Corporate Structure: Founder Liang Wenfeng has significantly increased his stake and capital investment in the company, signaling strong internal confidence and state-level alignment.

Synthesis and Conclusion

The AI industry is splitting into two distinct camps: the US-led closed systems (prioritizing top-tier performance) and the Chinese-led open-source ecosystem (prioritizing cost, accessibility, and hardware independence). DeepSeek V4 has proven that a model does not need to win every benchmark to be disruptive; it only needs to be "good enough" and significantly cheaper to force a shift in enterprise behavior. OpenAI’s potential acceleration toward GPT 5.6 suggests that the pressure from this "bottom-up" competition is forcing the industry into a high-speed race where cost, speed, and agentic capabilities are the new primary battlegrounds.
