Imran Khan addressed a virtual rally on Sunday night through a voice clone, a first-of-its-kind use of the technology.
His Pakistan Tehreek-e-Insaf party used the artificial intelligence (AI) voice cloning platform ElevenLabs to clone the jailed cricketer-turned-politician’s voice by feeding the AI model audio of his previous speeches, The Guardian first reported.
He reportedly sent the speech as a shorthand script through his lawyers, and his party’s social media team used AI to turn it into a four-minute audio message.
What is a voice clone and how can it be created?
An audio deepfake, or simply a voice clone, is synthetic audio created using generative AI tools that are trained on sample audio of a person. These tools can mimic an individual’s real voice with up to 95% accuracy in 29 languages and more than 50 accents.
There are also professional voice cloning models which can mirror every intonation, rhythm and nuance, generating a clone that’s virtually indistinguishable from the real voice.
The tech can be used to produce synthetic audio for applications such as videos, audiobooks, podcasts and video games.
While instant services can create a voice clone from as little as 10 seconds of audio, professional ones require 5-10 minutes of training data to capture the style, different moods and pauses of a person’s speech.