Gemini 3 Pro: My Independent Benchmark Results Revealed!

Key Concepts

Gemini 3 Pro: A new AI model from Google, touted as a significant advancement in intelligence.
Claude Sonnet 4.5: A competing AI model from Anthropic.
GPT 5.1: Another AI model mentioned for comparison.
Multimodality: The ability of an AI model to process and understand multiple types of data, including text, image, video, and audio.
Terrain Mode: A specific feature or capability of Gemini 3 Pro for generating terrain simulations.
3.js: A JavaScript library for creating and displaying animated 3D computer graphics in a web browser.
Sparse Mixture of Experts (SMoE): An architectural approach for large language models that uses multiple "expert" sub-models, activating only a subset for any given task.
Context Window: The amount of text an AI model can consider at once when processing input and generating output.
Function Calling: The ability of an AI model to identify and call external functions or tools.
Structured Output: The ability of an AI model to generate output in a predefined format (e.g., JSON).
ChatLM (Abacus AI): A platform that provides access to multiple state-of-the-art AI models, including Gemini 3 Pro.
Google AI Studio: A platform for developers to experiment with and integrate Google's generative AI models.
Gen AI SDK: Software Development Kits provided by Google for integrating generative AI models into applications.
Google Anti-gravity: A new AI code editor, presented as a competitor to Cursor.

Gemini 3 Pro vs. Claude Sonnet 4.5: A Comparative Analysis

This video provides a side-by-side comparison of Gemini 3 Pro and Claude Sonnet 4.5, focusing on their capabilities in generating interactive 3D graphics and simulations. The presenter highlights Gemini 3 Pro as a "new era of intelligence" and claims it surpasses Claude 4.5 and GPT 5.1 in various aspects.

Gemini 3 Pro: Features and Capabilities

Gemini 3 Pro is presented as a highly advanced AI model with the following key features:

Multimodal Input: Supports text, image, video, audio, and PDF inputs.
Extensive Context Window: Boasts a 1 million input token context window and a 64,000 output token context window.
Recent Knowledge Cutoff: Knowledge cutoff date is January 25th, indicating more up-to-date information compared to some other models.
Built-in Function Calling: Can identify and execute external functions.
Structured Output: Capable of generating output in a structured format.
Tool Integration: Includes search as a tool and supports code execution.
Architecture: Described as a "sparse mixture of expert model," not a modified or fine-tuned version of a prior model.
Deep Thinking Reasoning Model: Gemini 3 Deep Thinking, a reasoning model with multimodal capabilities, was also released.

Gemini 3 Pro is available through the Gemini app and other platforms. It is also integrated into Google Search AI mode for Google AI Pro and Ultra subscribers in the US.

Performance Benchmarking and Comparisons

The presenter conducted a benchmark test using 3.js applications to compare Gemini 3 Pro Preview, Claude Sonnet 4.5, and GPT 5.1.

Error and Warning Analysis:
- Claude Sonnet: 93 warnings.
- Gemini 3 Pro Preview: 48 warnings.
Generation Speed:
- Average generation time for Gemini 3 Pro Preview: 50 seconds. This is slightly faster than Claude Sonnet 4.5 and a little slower than GPT 5.1.

Interactive Simulations and Applications

The core of the comparison involves testing both models' ability to generate various interactive 3D simulations and applications using 3.js.

1. Rotating Cube Simulation

Claude Sonnet 4.5: Generated a rotating cube with added layers of shadows.
Gemini 3 Pro Preview: Produced a relatively similar rotating cube.

2. Data-Driven Dashboard

Claude Sonnet 4.5: Offered views like top, side, and isometric, with zoom and data export capabilities.
Gemini 3 Pro Preview: Also provided top, side, isometric views, zoom, and theme toggling. The presenter found Gemini 3 Pro Preview to be better in this instance.

3. Molecule Explorer

Claude Sonnet 4.5: Displayed molecules, but some were slightly hidden. It could switch between molecule views, DNA sequences, and methane.
Gemini 3 Pro Preview: Showcased ethanol and DNA with zoom capabilities. It also demonstrated glucose with options for exploding, space-filling, and animating. The presenter expressed strong liking for Gemini 3 Pro's rendering of glucose.

4. Product Configuration

Claude Sonnet 4.5: Allowed changing object color, finish (matte, high gloss), stripe color, and offered front, rear, side, and top views.
Gemini 3 Pro Preview: Provided customization options including front, side, isometric views, and finishes like black matte and chrome. Both models were considered equally good for this application.

5. Solar System Simulation

Claude Sonnet 4.5: Generated a solar system simulation.
Gemini 3 Pro Preview: Produced a solar system simulation that was perceived as slightly smoother than Claude Sonnet 4.5. Features included speed control and reset view.

6. Terrain Flight Mode

Claude Sonnet 4.5: The terrain simulation was described as more natural compared to other models tested.
Gemini 3 Pro Preview: The terrain simulation looked more like a game, was slightly pixelated but smoother. The presenter felt Claude Sonnet 4.5's visual quality was better for terrain.

7. City Generator

Claude Sonnet 4.5: The city generation was described as much clearer.
Gemini 3 Pro Preview: Slowly generated the city, showing sunset and sunrise. However, the presenter preferred the clarity of Claude Sonnet 4.5's city generation.

8. Particle System Simulation

Claude Sonnet 4.5: Offered particle systems with explosion, spiral, flow, and swarm effects.
Gemini 3 Pro Preview: Generated a "magical particle stream." The presenter found both to be good and stated that Gemini 3 Pro Preview was better than GPT 5.1 based on their testing.

Integration and Accessibility

Gemini 3 Pro is accessible through:

Google AI Studio: Allows direct testing and integration.
Google's Gen AI SDKs: Available for Python, JavaScript, and REST for easy integration into applications.
Thinking Level Control: Users can set the thinking level to low, medium, or high.

Related Google AI Developments

The video also briefly mentions other recent AI releases from Google:

Google Anti-gravity: An AI code editor positioned as a competitor to Cursor. A detailed guide on testing Anti-gravity is promised.

Conclusion and Key Takeaways

The presenter concludes that Gemini 3 Pro Preview demonstrates superior performance in many of the tested benchmarks and interactive simulations, particularly in areas like molecule exploration, data dashboards, and particle systems, when compared to Claude Sonnet 4.5 and GPT 5.1. While Claude Sonnet 4.5 showed strengths in natural terrain and city generation clarity, Gemini 3 Pro's overall capabilities, including its multimodal input, extensive context window, and advanced features, position it as a significant advancement in AI. The availability of Gemini 3 Pro through platforms like ChatLM and Google AI Studio, along with its SDKs, makes it readily accessible for developers and users.