Mistral's Magistral: POWERFUL Opensource Reasoning Model Beats Deepseek R1/V3! (Fully Tested)

Key Concepts:

Mistral AI, Magistral (Small & Medium), Reasoning Models, Domain Expertise, Transparency, Multilingual Capabilities, Chain of Thought Reasoning, Benchmarks, Coding Performance, Inference Speed, Open Source, Commercial Use, Token Throughput, SVG Generation, Absolute Risk Reduction (ARR), Relative Risk Reduction (RRR), Number Needed to Treat (NNT).

1. Introduction of Magistral:

The Mistral team released Magistral, their first reasoning-focused model.
Magistral is designed for domain-specific, transparent, and multilingual reasoning.
Two variants: Magistral Small (24B parameters, fully open-source) and Magistral Medium (enterprise-grade, for commercial use).
Aims to replicate human-like complex thinking by blending logic, uncertainty, and insight.
Enables precision step-by-step problem-solving across professional domains.

2. Performance Benchmarks:

3. Coding Capabilities and Example:

The model was tasked with creating a SaaS landing page, including the front-end UI design.
It quickly output 1,800 lines of code.
The result was a decent base structure for a typical SaaS landing page.

4. Multilingual Chain of Thought Reasoning:

Both models excel in multilingual chain of thought reasoning.
Ideal for structured logic and decision-making tasks.
Magistral Small is open source and can be used for a wide range of use cases.
Magistral Medium scores 73.6% on AIME 2024, rising to 90% with majority voting over 64 samples (maj@64).
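
Majority voting over 64 samples means the model answers the same problem 64 times and the most frequent final answer is the one that gets scored. A minimal sketch of that idea, assuming the final answers have already been extracted as strings (the sample values and the function name majority_vote are hypothetical, not taken from Mistral's evaluation code):

```python
from collections import Counter

def majority_vote(answers):
    """Return the most frequent answer among the sampled completions."""
    counts = Counter(answers)
    # most_common(1) gives a one-element list of (answer, count)
    answer, _ = counts.most_common(1)[0]
    return answer

# Hypothetical final answers from five sampled completions (a real run would use 64)
samples = ["204", "204", "197", "204", "210"]
print(majority_vote(samples))  # prints 204
```

A single sample can be wrong while the vote is still right, which is why the voted score is higher than the single-attempt score.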

5. Inference Speed and Language Support:

The new "Think mode" and "Flash Answers" in Mistral's chatbot (Le Chat) deliver responses at up to 10x the speed of competitors.
Supports multiple languages: English, French, Spanish, German, Italian, Arabic, Russian, and Simplified Chinese.
Enables real-time reasoning and scalable user feedback.

6. Accessing and Using Magistral:

Magistral Small is under the Apache 2.0 license, allowing commercial and non-commercial use.
Can be deployed locally on a single RTX 4090 or a MacBook with 32 GB of RAM (once quantized).
Access methods: Ollama, LM Studio, Mistral's Le Chat (cloud-based chatbot), Kilo Code (with $20 free credits), and OpenRouter.
Pricing: Magistral Small is $0.50 per 1 million input tokens and $1.50 per 1 million output tokens; Magistral Medium is $2 per 1 million input tokens and $5 per 1 million output tokens.
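
To see what these prices mean per request, here is a rough cost estimate in Python. It is only a sketch: the dictionary keys are illustrative labels rather than official API model IDs, and the token counts in the example are made up. Reasoning models tend to produce long outputs, so the output price usually dominates.

```python
# Prices in USD per 1 million tokens, as quoted above.
# The keys are illustrative labels, not official API model identifiers.
PRICES = {
    "magistral-small": {"input": 0.50, "output": 1.50},
    "magistral-medium": {"input": 2.00, "output": 5.00},
}

def estimate_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from its token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000

# Hypothetical request: 2,000 prompt tokens and 8,000 reasoning/answer tokens
for model in PRICES:
    print(f"{model}: ${estimate_cost(model, 2_000, 8_000):.3f}")
```

For this made-up request, Magistral Small comes to about $0.013 and Magistral Medium to about $0.044.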

7. Testing Reasoning Capabilities:

8. Reasoning Prompt Example (Pharmaceutical Study):

A pharmaceutical company is testing a new drug in a double-blind study.
1000 patients, 500 receive the drug, 500 receive a placebo.
After six months, 60 in the drug group and 90 in the placebo group experience a recurrence.
The model was asked to calculate the absolute risk reduction (ARR) and relative risk reduction (RRR) of the drug.
The model correctly calculated ARR (6%), RRR (33.3%), and NNT (approximately 17).
It also provided an explanation for doctors considering prescribing the drug, including potential biases.
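
The figures can be checked by hand with the standard definitions: absolute risk reduction is the control risk minus the treated risk, relative risk reduction is that difference divided by the control risk, and number needed to treat is the reciprocal of the absolute risk reduction, rounded up. A short Python sketch using the study numbers above (the function name risk_metrics is just for illustration):

```python
import math

def risk_metrics(events_treated, n_treated, events_control, n_control):
    """Compute ARR, RRR and NNT from event counts in two study arms."""
    risk_treated = events_treated / n_treated   # 60 / 500 = 0.12
    risk_control = events_control / n_control   # 90 / 500 = 0.18
    arr = risk_control - risk_treated           # absolute risk reduction
    rrr = arr / risk_control                    # relative risk reduction
    nnt = math.ceil(1 / arr)                    # round up to whole patients
    return arr, rrr, nnt

arr, rrr, nnt = risk_metrics(60, 500, 90, 500)
print(f"ARR: {arr:.1%}")  # ARR: 6.0%
print(f"RRR: {rrr:.1%}")  # RRR: 33.3%
print(f"NNT: {nnt}")      # NNT: 17
```

In plain terms, about 17 patients would need to take the drug for six months to prevent one recurrence.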

9. Conclusion:

Magistral is an impressive open-source reasoning model that can be hosted locally.
It addresses issues with typical reasoning models and excels in structured calculation, decision trees, and programmatic logic.
Benchmark scores and inference speeds are exceptional.
Its native chain of thought reasoning works across multiple languages.
The Mistral team is commended for making this model open source.