OpenAI Unveils Major ChatGPT Image Generation Upgrade with GPT-4o

OpenAI expands ChatGPT’s capabilities with native image generation and editing using the GPT-4o model. (Image Source: OpenAI)

OpenAI has introduced a major update to ChatGPT’s image capabilities, the platform’s first significant advance in visual content generation in over a year. CEO Sam Altman announced during a livestream that ChatGPT can now natively generate and edit images using the company’s GPT-4o model.

Although GPT-4o has been the core of ChatGPT’s text-based features for some time, this is the first time the model can handle visual tasks directly, expanding the chatbot’s functionality beyond text generation.

Available for Pro Users

The new image-generation capabilities are available immediately to subscribers of the $200/month Pro plan, in both ChatGPT and OpenAI’s video platform Sora. OpenAI also confirmed that the feature will soon reach Plus and free-tier ChatGPT users, as well as developers via the API.

Compared with the outgoing DALL·E 3, GPT-4o takes a little longer to generate images but produces results that OpenAI says are more detailed and accurate. Users can now also edit existing images, including those featuring people, modifying backgrounds, foreground elements, and fine details through inpainting, a technique that regenerates only a selected region of an image while leaving the rest intact.

Data Sources and Artist Protections

Brad Lightcap, OpenAI’s COO, told the Wall Street Journal that the company has strict policies in place to avoid imitating the styles of living artists, and creators can request to opt out of training datasets. OpenAI also honors requests to block its web crawlers from collecting website image data.

A Competitive Race in Generative AI

OpenAI’s visual upgrade follows recent moves by competitors such as Google, which launched native image output for Gemini 2.0 Flash, one of its flagship models. While Google’s feature gained viral attention, it also drew criticism for lax safeguards that allowed users to remove watermarks and generate copyrighted characters.
