Gemini 1.5 Pro model boasts performance levels akin to the larger Gemini 1.0 Ultra model, according to Google. The release follows OpenAI's success with ChatGPT in late 2022, prompting Google to position itself as a major player in cutting-edge generative AI technology capable of creating text, images, and videos based on user prompts. Gemini, initially released by Google in December, offered three versions tailored for various tasks and compatible with a range of devices, from mobile to large-scale data centers.
In response to the advancements made by Microsoft Corp. and OpenAI, Google seeks to attract users with even more powerful tools. Gemini 1.5 Pro distinguishes itself by enabling faster and more efficient training, coupled with the ability to process substantial amounts of information in response to user prompts.
Google claims that Gemini 1.5 Pro can handle up to an hour's worth of video, 11 hours of audio, or over 700,000 words in a document — a feat described as the "longest context window" among large-scale AI models. This surpasses the data processing capabilities of the latest AI models from OpenAI and Anthropic, according to Google. During a pre-recorded video demonstration, Google showcased Gemini 1.5 Pro's capabilities.
In one instance, the AI model ingested a 402-page PDF transcript of the Apollo 11 moon landing and successfully located quotes highlighting "three funny moments." Another demonstration involved finding a specific scene in a 44-minute Buster Keaton film based on a rough sketch provided to the model. Despite its advanced features, Google acknowledges that Gemini 1.5 Pro, like all generative models, is not infallible. The AI model may exhibit imperfections such as hallucinations, occasional
. Read more on livemint.com