Testing ChatGPT Image 1.5's crazy capabilities (full review)
By Greg Isenberg
OpenAI’s New Image Model: First Impressions & Deep Dive
Key Concepts:
- OpenAI’s New Image Model: The recently launched image generation capability within ChatGPT.
- Nano Banana Pro: A competing image generation model previously used as the speaker’s daily driver.
- Prompt Optimization: The process of refining text prompts to achieve desired outputs from AI image models.
- CPG Brands: Consumer Packaged Goods brands – products frequently purchased by consumers.
- LLMs: Large Language Models – AI models trained on massive datasets of text.
- Instruction Following: The ability of an AI model to accurately interpret and execute user instructions.
- Text Rendering: The quality and accuracy of text generated within an image.
I. Initial Exploration & Plushie Creation
The video begins with a hands-on exploration of OpenAI’s new image model, accessible through the “plus” button and image option within ChatGPT. The initial test involves transforming a photo of Sam Altman into a “plushie” style image. The model automatically optimizes the prompt for better results, a feature reminiscent of apps like Glyph, which focus on prompt engineering for AI image generation. The resulting image exceeded expectations, demonstrating impressive detail, particularly in the hair rendering. The speaker highlights the potential for creating CPG brands and toys using this technology, suggesting a lucrative opportunity for entrepreneurs. He notes the quality is high enough to potentially create marketable products.
II. Sketch Generation & Prompt Refinement
The exploration continues with a “sketch” style transformation of a photo of the speaker enjoying a martini. The model generates a detailed graphite pencil sketch on textured notebook paper. The speaker questions whether the improvement comes from the model itself or from the optimized prompts, and concludes that the distinction matters less than the quality of the output. He emphasizes the importance of high-quality images for advertising, content creation (Instagram, TikTok slideshows), and ultimately, business success.
The speaker observes that the initial sketch has an “AI feel,” particularly in how the hand is rendered. He asks the model to remove the hand and notebook for a more natural look, testing how well it iterates on real-time feedback, an area where he previously found Google’s Nano Banana Pro stronger.
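The feedback loop described here, an initial request followed by a correction, can be sketched as a small prompt-building helper. This is only an illustration under stated assumptions: `ImagePrompt`, `revise`, and `render` are hypothetical names invented for this sketch, the prompt wording is paraphrased from the summary, and nothing here calls ChatGPT or reflects how the model handles revisions internally.

```python
from dataclasses import dataclass, field


@dataclass
class ImagePrompt:
    """Hypothetical helper: a base request plus an ordered list of revision notes."""

    base: str
    revisions: list[str] = field(default_factory=list)

    def revise(self, note: str) -> "ImagePrompt":
        # Return a new prompt so earlier versions stay available for comparison.
        return ImagePrompt(self.base, [*self.revisions, note.strip()])

    def render(self) -> str:
        # The original request comes first; each revision follows as a bullet.
        if not self.revisions:
            return self.base
        notes = "\n".join(f"- {r}" for r in self.revisions)
        return f"{self.base}\nRevisions:\n{notes}"


# Assumed wording, paraphrasing the sketch request and the follow-up feedback.
sketch = ImagePrompt("Graphite pencil sketch of the attached photo on textured notebook paper")
revised = sketch.revise("Remove the hand and the notebook for a more natural look")
print(revised.render())
```

Keeping each revision as a separate note, rather than rewriting the base prompt, makes it easy to see which piece of feedback changed the result when comparing outputs across models.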
III. Diagram Recreation & Viral Potential
The speaker then attempts to recreate a diagram he previously posted on X (formerly Twitter). The original hand-drawn diagram received 621 likes, while similar AI-generated versions received fewer. He prompts the model for a hand-drawn version in a more casual style and specifically asks it to remove a “weird pencil sharpening thing” that appeared in a previous image.
The speaker describes the revised output as significantly improved, calling it “beautiful” and judging that it would perform well on Instagram. He notes the model is “reaching for ideas online” during the creation process, a phrasing he finds amusing. This section underscores the model’s potential for generating content that can go viral.
IV. Style Exploration: Bobblehead & Model Capabilities
Further experimentation involves generating a bobblehead image, specifically requesting a style appropriate for a tech YouTuber rather than a baseball player. The model accurately captures the speaker’s attire, including a long-sleeve sweater and camera. The speaker acknowledges potential bias but expresses overall satisfaction with the result.
The video then references a blog post from OpenAI detailing the new image model’s capabilities. The post highlights improvements in editing (adding, subtracting, combining, blending, transposing) and creative transformations, emphasizing the preservation of important details. The model’s improved instruction following is also noted, allowing for more precise edits and complex compositions.
V. Technical Improvements & Comparison to Nano Banana Pro
The OpenAI blog post also mentions improvements in text rendering, addressing a common issue where AI-generated text contains errors. The speaker confirms this improvement, noting the previous prevalence of misspellings.
Regarding a direct comparison, the speaker concludes that the new ChatGPT image model is “as good, if not better than Nano Banana Pro.” He expresses optimism about the advancements in AI image generation and encourages viewers to explore the new tool.
VI. Future Applications & Call to Action
The speaker briefly touches upon the next steps for entrepreneurs – moving from image creation to product manufacturing and online sales (e.g., Shopify). He offers to create a dedicated video on this topic if there is sufficient interest. He concludes with a call to action, encouraging viewers to like, subscribe, and share their creations using the new image model. He reiterates his commitment to transparency and sharing valuable resources.
Notable Quote:
“All I care about is the output. Ultimately all I care about is the output. If I can get great outputs, that means, you know, maybe better ads for some of my business. That makes sometimes better content that actually can go viral.” – Speaker, emphasizing the practical value of AI image generation.
Data & Statistics:
- The original hand-drawn diagram on X received 621 likes.
- The speaker suggests hand-drawn diagrams typically receive around 2,000 likes.
Synthesis/Conclusion:
OpenAI’s new image model represents a significant leap forward in AI-powered image generation. The model’s ability to generate high-quality images, respond to feedback, and follow complex instructions positions it as a powerful tool for content creators, entrepreneurs, and anyone seeking to leverage visual media. The optimized prompting system and improvements in text rendering further enhance its usability and output quality. The speaker’s initial impressions are overwhelmingly positive, suggesting this new capability could rival and potentially surpass existing solutions like Nano Banana Pro. The potential for creating marketable products and viral content is a key takeaway, highlighting the practical applications of this technology.