Deepseek V3.2 Exp: CHEAPEST Model Ever That Is Quite POWERFUL! Best Opensource Model?

Key Concepts

Gemini 3.0: Potentially leaked AI model with a possible launch in the coming week.
GLM 4.6: A new powerful open-source model.
Claude 4.5 Sonnet: Anthropic's latest model, touted as their best coding model.
DeepSeek V3: New model released by the DeepSeek team.
DeepSeek V3.2 Experimental: An experimental version of DeepSeek V3, focusing on cost reduction.
Sparse Attention: A technique used by DeepSeek to reduce API costs by attending to the most relevant tokens and skipping less important ones.
DeepSeek R2: The anticipated future iteration of the DeepSeek model.
Kilo Code: An AI provider offering free credits that can be used with the DeepSeek V3.2 Experimental model.
OpenRouter: Another AI provider.

DeepSeek V3.2 Experimental: Overview and Efficiency

The DeepSeek team has released DeepSeek V3.2 Experimental, an iteration of their DeepSeek V3 model. The key focus of this release is cost reduction through sparse attention, which skips less important tokens and attends only to the ones that matter, making the model much faster and cheaper to run without losing much quality. In practice, V3.2 Experimental is nearly the same as the preceding V3.1 Terminus release, with only slight improvements in generation; the trade-off is that sparse attention can sometimes struggle with very long contexts. The stated aim is to make long-context processing more efficient and to cut compute costs drastically compared with V3.1 Terminus while maintaining performance.

Sparse Attention: This technique is the core innovation, enabling significant cost savings.
Efficiency Gains: With DSA (DeepSeek Sparse Attention, a fine-grained sparse attention mechanism) enabled, the model achieves efficiency gains with minimal impact on output quality.
Benchmark Performance: Benchmarks show that DeepSeek V3.2 Experimental is on par with the V3.1 Terminus model in performance while being significantly cheaper to use.
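
The idea of attending only to the tokens that matter can be illustrated with a toy top-k sparse attention routine. This is a minimal sketch for a single query vector, not DeepSeek's actual DSA implementation: the function names and the `top_k` parameter are illustrative assumptions, and the toy version still scores every key, so it shows the selection idea rather than the real compute savings.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def dense_attention(query, keys, values):
    """Standard scaled dot-product attention for one query vector."""
    scale = 1.0 / math.sqrt(len(query))
    weights = softmax([dot(query, k) * scale for k in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def topk_sparse_attention(query, keys, values, top_k):
    """Attend only to the top_k highest-scoring keys (hypothetical toy version).

    Returns the output vector and the sorted indices of the keys that were kept.
    """
    scale = 1.0 / math.sqrt(len(query))
    scores = [dot(query, k) * scale for k in keys]
    # Keep the indices of the top_k best-scoring keys; all others are skipped.
    kept = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)[:top_k]
    kept.sort()
    weights = softmax([scores[i] for i in kept])
    dim = len(values[0])
    output = [sum(w * values[i][d] for w, i in zip(weights, kept)) for d in range(dim)]
    return output, kept
```

With `top_k` equal to the number of keys, the routine reduces to dense attention; with a smaller `top_k`, low-scoring tokens contribute nothing to the output, which is where a quality trade-off can appear.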

Pricing Structure and Accessibility

The DeepSeek V3.2 Experimental model has a highly competitive pricing structure, making it one of the most affordable options among models with performance comparable to proprietary ones.

Pricing: Input tokens cost $0.028 per 1 million tokens on a cache hit and $0.28 per 1 million tokens on a cache miss. Output tokens cost $0.42 per 1 million tokens.
Accessibility: The model can be accessed via the DeepSeek chatbot (free) or through the DeepSeek API platform.
Local Installation: Users can also run the model locally using quantized versions, hosted with tools like Ollama or LM Studio.
AI Providers: Kilo Code and OpenRouter are recommended AI providers for accessing the model. Kilo Code offers $25 worth of free credits.
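
To get a feel for what these rates mean per request, the arithmetic can be sketched as a small cost estimator. It assumes the rates are $0.028 (input, cache hit), $0.28 (input, cache miss) and $0.42 (output) per 1 million tokens; the function name and the token counts in the usage note are hypothetical, and actual billing may differ.

```python
# Assumed prices in USD per 1 million tokens.
PRICE_INPUT_CACHE_HIT = 0.028
PRICE_INPUT_CACHE_MISS = 0.28
PRICE_OUTPUT = 0.42

TOKENS_PER_UNIT = 1_000_000


def estimate_cost_usd(cached_input_tokens, uncached_input_tokens, output_tokens):
    """Estimate the cost of one request in USD from its token counts."""
    total = (
        cached_input_tokens * PRICE_INPUT_CACHE_HIT
        + uncached_input_tokens * PRICE_INPUT_CACHE_MISS
        + output_tokens * PRICE_OUTPUT
    )
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # Hypothetical request: 10,000 uncached input tokens, 100,000 output tokens.
    print(f"${estimate_cost_usd(0, 10_000, 100_000):.4f}")
```

Under these assumptions, a hypothetical request with 10,000 uncached input tokens and 100,000 output tokens comes to roughly 4.5 cents, with output tokens dominating the bill.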

Code Generation and Creative Applications

The video showcases the DeepSeek V3.2 Experimental model's capabilities in code generation and creative applications through several examples.

SVG Code Generation: The model was tasked with creating a butterfly in SVG code. The generated code produced a symmetrical butterfly that looked better than the one from the Terminus model.
SaaS Landing Page: The model was instructed to create a SaaS landing page with as many features as possible. The resulting page included animations, pricing tiers, testimonials, and an FAQ section. The generation cost was only 5 cents.
Browser-Based OS: The model successfully generated a browser-based OS mimicking the macOS style, including apps like Finder, Safari, and a terminal. The only downside was that it failed to generate the icons.

Logical Reasoning and Problem Solving

The video demonstrates the model's ability to perform logical reasoning and problem-solving through a classic water puzzle.

Water Puzzle: The model was given a prompt involving three containers (8L, 5L, and 3L) and asked to achieve exactly 4L in the 8L container.
Reasoning Process: The model followed a multi-step logical reasoning process, carefully evaluating different outcomes to arrive at the correct solution.
Result: After 77 seconds of thinking, the model successfully provided the correct steps to leave exactly 4L in the 8L container.
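
For reference, the puzzle itself can be solved mechanically with a breadth-first search over container states. The sketch below assumes the usual starting position, which the summary does not state: the 8L container is full and the other two are empty. The function name and state layout are illustrative choices.

```python
from collections import deque

# Capacities of the three containers in litres: (8L, 5L, 3L).
CAPACITIES = (8, 5, 3)


def solve_water_puzzle(start=(8, 0, 0), target_in_first=4):
    """Return the shortest list of states leading to target_in_first litres
    in the first container, or None if no sequence of pours achieves it."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        state = queue.popleft()
        if state[0] == target_in_first:
            # Walk back through the parent links to rebuild the path.
            path = []
            while state is not None:
                path.append(state)
                state = parents[state]
            return path[::-1]
        for src in range(3):
            for dst in range(3):
                if src == dst:
                    continue
                # Pour until the source is empty or the destination is full.
                amount = min(state[src], CAPACITIES[dst] - state[dst])
                if amount == 0:
                    continue
                nxt = list(state)
                nxt[src] -= amount
                nxt[dst] += amount
                nxt = tuple(nxt)
                if nxt not in parents:
                    parents[nxt] = state
                    queue.append(nxt)
    return None


if __name__ == "__main__":
    for step, state in enumerate(solve_water_puzzle()):
        print(step, state)
```

Because breadth-first search explores states in order of the number of pours, the first solution it finds is a shortest one; from the assumed starting position that takes seven pours.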

Conclusion

The DeepSeek V3.2 Experimental model represents a significant advancement in open-source AI, particularly in cost-effectiveness. Its sparse attention mechanism allows for efficient resource utilization with little loss in output quality. In the video's tests, the model performed well in code generation, creative applications, and logical reasoning, making it a versatile tool for various tasks. The affordable pricing and ease of access further enhance its appeal. The success of this experimental model raises expectations for the upcoming DeepSeek R2, which is anticipated to be a remarkable release.