Meta AI recently unveiled a “breakthrough” text-to-speech (TTS) generator that it claims produces results up to 20 times faster than state-of-the-art artificial intelligence models while delivering comparable performance.
The new system, dubbed Voicebox, eschews traditional TTS architecture in favor of a model more akin to OpenAI’s ChatGPT or Google’s Bard.
One of the main differences between Voicebox and similar TTS models, such as ElevenLabs’ Prime Voice AI, is that Meta’s offering can generalize through in-context learning.
Much like ChatGPT and other transformer models, Voicebox uses large-scale training datasets. Previous efforts to train on massive troves of audio data have resulted in severely degraded audio outputs; for this reason, most TTS systems rely on small, highly curated, labeled datasets.
Meta overcomes this limitation through a novel training scheme that ditches labels and curation for an architecture capable of “in-filling” audio information.
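Meta’s paper frames this as text-guided speech infilling: mask out part of an utterance and train the model to predict the missing audio from the surrounding context and a transcript. The sketch below illustrates only the masking step, with hypothetical names and PyTorch assumed; Meta has not released the Voicebox code, so this is not its implementation.

```python
import torch

def mask_span(features: torch.Tensor, mask_frac: float = 0.3):
    """Zero out one contiguous span of acoustic frames.

    `features` is a (frames, dims) tensor; during training, the model
    is asked to reconstruct the masked span from the surrounding audio
    and the text transcript.
    """
    n_frames = features.size(0)
    span = max(1, int(n_frames * mask_frac))
    start = int(torch.randint(0, n_frames - span + 1, (1,)).item())
    mask = torch.zeros(n_frames, dtype=torch.bool)
    mask[start:start + span] = True
    masked = features.clone()
    masked[mask] = 0.0
    return masked, mask

# Toy usage: 200 frames of 80-dim mel-spectrogram features.
mels = torch.randn(200, 80)
context, mask = mask_span(mels)
# A training step would then minimize the error between the model's
# prediction for the masked frames and mels[mask].
```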
As Meta AI put it in a June 16 blog post, Voicebox is the “first model that can generalize to speech-generation tasks it was not specifically trained to accomplish with state-of-the-art performance.”
This makes it possible for Voicebox to convert text to speech, remove unwanted noise by synthesizing replacement speech, and even apply a speaker’s voice to output in a different language.
According to an accompanying research paper published by Meta, its pre-trained Voicebox system can accomplish all of this using only the desired output text and a three-second audio clip.
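Because Meta has not released Voicebox itself, the snippet below is purely illustrative of that interface: a hypothetical zero-shot call that takes only the target text and a roughly three-second reference clip. The `model` object, its `generate` method, and the file paths are all invented for this sketch.

```python
import soundfile as sf  # pip install soundfile

def speak_like(model, text: str, reference_wav: str, out_path: str) -> None:
    """Hypothetical zero-shot TTS: render `text` in the reference voice.

    `model` stands in for a Voicebox-style network; Meta has not
    published a real API, so `generate` is an invented method name.
    """
    prompt, sample_rate = sf.read(reference_wav)  # ~3 s of the target speaker
    audio = model.generate(text=text, audio_prompt=prompt)  # invented call
    sf.write(out_path, audio, sample_rate)

# speak_like(model, "Hello from Voicebox.", "speaker_3s.wav", "out.wav")
```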
The arrival of robust speech generation comes at a particularly sensitive time, as social media companies continue to struggle with moderation and, in the U.S., a looming presidential election threatens to once again test the limits of those moderation systems.