OpenAI is equipping its chatbot ChatGPT with new image processing functions and expanding its ability to create visualizations with readable text elements. The innovations are intended to increase the attractiveness of the AI tool for both business and private users.
As the company demonstrated in a livestream presentation, users will be able to refine images step by step through conversation with the chatbot. For example, once an image has been created, the background can be modified or elements such as headgear can be added.
Another focus is improved text integration in generated images. This should make it easier to create diagrams, infographics and logos for professional use. The AI can also follow more complex instructions for image composition.
The new features are available from today via OpenAI’s GPT-4o model to both free and paying users. The company announced that they will also be made available to software developers via the API in the coming weeks.
However, as with other AI applications, there are technical limitations. According to OpenAI, errors can occur; for example, the model may generate incorrect text such as non-existent country names. A company blog post points out that such problems are particularly likely with non-specific queries. Other challenges include rendering small font sizes and non-Latin writing systems.
Image creation with the extended functions takes up to one minute of processing time, which CEO Sam Altman attributes to the higher level of detail in the output.