OpenAI on May 13 showcased its latest AI model, GPT-4o, with a demo featuring voice interaction across text and images. This could keep the company "ahead of the race" in the global artificial intelligence landscape, Reuters reported. GPT-4o has advanced audio capabilities that let users hold real-time conversations without delays and even interrupt the AI while it is speaking, a significant milestone in replicating natural human interaction, the report said.
OpenAI researchers showcased these features during a livestream event, likening the experience to dialogue straight from the movies. OpenAI CEO Sam Altman expressed his enthusiasm in a blog post, saying that conversing with a computer now feels natural, something previously considered hard to achieve. "It feels like AI from the movies ... Talking to a computer has never felt really natural for me; now it does," Altman wrote.
Backed by Microsoft, OpenAI faces mounting competition and pressure to broaden the user base of its popular chatbot, ChatGPT, the report noted. During the livestream, researchers demonstrated ChatGPT's enhanced voice assistant capabilities.
In one demonstration, the AI guided a researcher through solving a mathematical problem, using its vision and voice capabilities. Another showcased the model's real-time language translation. The demonstrations bordered on "science fiction," the report added.