OpenAI Just Dropped A New Coding Model for Developers

Key Concepts:

GPT-5-Codex: A version of GPT-5 further optimized and trained specifically for coding tasks.
Long Horizon Coding Tasks: Complex coding tasks that can run independently for extended periods (e.g., over 7 hours).
Dynamic Reasoning Effort: The model's ability to dynamically adjust the amount of "thinking time" or computational effort based on the complexity of the task.
Code Reviews: The process of reviewing code for flaws and correctness, a key focus area for GPT-5-Codex.
SWE-bench Verified: A benchmark of 500 human-validated software engineering tasks used to evaluate the performance of coding models.
Token Generation: The process of generating tokens, which are the basic units of text used by language models.
Codex CLI: A command-line interface for accessing and using GPT-5-Codex.
Local to Web Handoff: The ability to seamlessly transition tasks between a local environment (e.g., CLI) and a web-based environment.

1. Introduction of GPT-5-Codex

OpenAI has released GPT-5-Codex, a specialized version of GPT-5 trained for coding, accessible wherever Codex runs (terminal, IDE, web).
It excels at code reviews and handles long-horizon coding tasks, working independently for more than 7 hours.
The speaker notes the confusing naming convention.

2. Availability and Access

GPT-5-Codex is available in a cloud version within ChatGPT, including a code review version.
Access is provided through CLI or IDE extensions (VS Code and Cursor).
It's included in ChatGPT Plus, Pro, Business, Education, and Enterprise plans with rate limits.

3. Benchmarks and Performance

SWE-bench Verified: GPT-5-Codex shows a slight improvement over GPT-5 at its high reasoning setting.
- The speaker points out that the benchmark is now tested on all 500 questions, unlike previous tests on only 477 questions.
- GPT-5's score decreases when tested on all 500 questions, and GPT-5-Codex doesn't surpass the original GPT-5 score reported on the 477-question set.
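To see why the size of the question set matters, here is a minimal sketch of the arithmetic. The solved count is a hypothetical number chosen only for illustration; it is not a reported result, and the video does not say how the model performs on the 23 questions that earlier runs left out.

```python
def resolved_rate(solved: int, total: int) -> float:
    """Return the percentage of benchmark questions solved."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * solved / total


SUBSET_SIZE = 477   # questions used in the earlier tests
FULL_SIZE = 500     # questions in the full benchmark
EXTRA = FULL_SIZE - SUBSET_SIZE  # the 23 questions that were left out

# Hypothetical solved count, for illustration only.
solved_on_subset = 358

rate_on_subset = resolved_rate(solved_on_subset, SUBSET_SIZE)
# Lower bound on the full set: none of the 23 extra questions are solved.
rate_worst_case = resolved_rate(solved_on_subset, FULL_SIZE)
# Upper bound on the full set: all 23 extra questions are solved.
rate_best_case = resolved_rate(solved_on_subset + EXTRA, FULL_SIZE)

print(f"477-question set:     {rate_on_subset:.1f}%")
print(f"500 questions (low):  {rate_worst_case:.1f}%")
print(f"500 questions (high): {rate_best_case:.1f}%")
```

The same number of solved questions yields a lower percentage once the denominator grows, which is why scores measured on 477 and on 500 questions are not directly comparable.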
Code Refactoring: GPT-5-Codex is significantly better (roughly a 50% improvement) than GPT-5 at its high reasoning setting on code refactoring tasks.
Token Usage:
- For the bottom 10% of user turns (low complexity), GPT-5-Codex uses 94% fewer tokens than GPT-5.
- For the top 10% (high complexity), GPT-5-Codex spends twice as long and generates twice the number of tokens, indicating more "thinking time."
- This demonstrates the model's ability to dynamically select reasoning effort based on task complexity.
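The token figures above are easier to picture with concrete numbers. The sketch below applies the two ratios from the summary (94% fewer tokens on the simplest turns, twice as many on the hardest) to baseline token counts that are purely hypothetical.

```python
def tokens_after_reduction(baseline_tokens: int, percent_fewer: int) -> float:
    """Tokens used after cutting `percent_fewer` percent from the baseline."""
    if not 0 <= percent_fewer <= 100:
        raise ValueError("percent_fewer must be between 0 and 100")
    return baseline_tokens * (100 - percent_fewer) / 100


# Hypothetical GPT-5 token counts per turn, for illustration only.
baseline_easy = 1_000    # a simple turn (bottom 10% by complexity)
baseline_hard = 20_000   # a complex turn (top 10% by complexity)

# Ratios taken from the summary above.
codex_easy = tokens_after_reduction(baseline_easy, 94)
codex_hard = baseline_hard * 2

print(f"easy turn: {baseline_easy} -> {codex_easy:.0f} tokens")
print(f"hard turn: {baseline_hard} -> {codex_hard} tokens")
```

Under these assumed baselines, the easy turn shrinks to a small fraction of its original cost while the hard turn doubles, which is the pattern the summary attributes to dynamic reasoning effort.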

4. Dynamic Reasoning and Task Handling

GPT-5-Codex dynamically adjusts its "thinking time" based on task complexity.
It combines interactive sessions with developers and independent task execution for longer tasks.
Users can start tasks locally (CLI) and offload them to the cloud version.

5. Code Review Capabilities

GPT-5-Codex is specifically trained for code reviews and finding critical flaws.
It navigates codebases, reasons through dependencies, and runs code/tests for validation.
Internal benchmarks show a substantial reduction in incorrect comments (from 13% to 4%) and an increase in high-impact comments (by 12-13%).
The average number of comments per pull request (PR) has also decreased.
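As a rough illustration of what the drop in incorrect comments means in practice, the sketch below converts the two percentages from the summary into counts and a relative reduction. The total of 200 review comments is a hypothetical figure, not something reported in the video.

```python
def expected_incorrect(total_comments: int, incorrect_percent: float) -> float:
    """Expected number of incorrect comments at a given error rate."""
    return total_comments * incorrect_percent / 100


def relative_reduction(before_percent: float, after_percent: float) -> float:
    """Relative reduction, in percent, when a rate falls from before to after."""
    if before_percent <= 0:
        raise ValueError("before_percent must be positive")
    return (before_percent - after_percent) / before_percent * 100


TOTAL_COMMENTS = 200  # hypothetical number of review comments

before = expected_incorrect(TOTAL_COMMENTS, 13)  # 13% incorrect, per the summary
after = expected_incorrect(TOTAL_COMMENTS, 4)    # 4% incorrect, per the summary
reduction = relative_reduction(13, 4)

print(f"incorrect comments: {before:.0f} -> {after:.0f} out of {TOTAL_COMMENTS}")
print(f"relative reduction: {reduction:.1f}%")
```

A fall from 13% to 4% is 9 percentage points, but relative to the starting rate it removes more than two thirds of the incorrect comments.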

6. Integration and User Experience

7. Key Insights from Dan (Every CEO/Co-founder)

Dan highlights two important aspects:
- Dynamic Thinking Time: The model can decide how much effort to spend based on task complexity.
- Local to Web Handoff: Seamless transition between local and web environments.

8. Security and Safety

9. Pricing and Availability

Included in ChatGPT Plus, Pro, Business, Education, and Enterprise plans.
Generous rate limits are provided, with Pro plans supporting a full work week.
GPT-5-Codex is not yet available through the API, but API access is planned for a future release.

10. Conclusion

GPT-5-Codex represents a significant advancement in AI-powered coding assistance, particularly in code reviews and handling complex, long-horizon tasks. Its ability to dynamically adjust reasoning effort and seamlessly transition between local and web environments offers a powerful and flexible user experience. The speaker plans to test GPT-5-Codex in the Codex CLI and IDEs and report the findings in a future video.