Chinese Tech Giant Baidu Develops A.I. That Can Clone Any Voice Within Seconds


By Fattima Mahdi

Researchers at Baidu have created an A.I. that they claim can clone your voice in under a minute. Deep Voice is a text-to-speech synthesis system that Baidu trained using 800 hours of audio from 2,400 speakers. Though the system typically needs 100 five-second samples of a voice to mimic it, ten five-second samples were enough to trick a voice-recognition system more than 95 percent of the time.

Deep Voice can generate new speech in different accents, tones and styles. For example, it can change a female voice to a male one and a British accent to an American one. “From a technical perspective, this is an important breakthrough showing that a complicated generative modeling problem, namely speech synthesis, can be adapted to new cases by efficiently learning only from a few examples,” said Leo Zou, a member of Baidu’s communications team. “Previously, it would take numerous examples for a model to learn. Now, it takes a fraction of what it used to.”

“We see many great use cases or applications for this technology,” Zou said. “For example, voice cloning could help patients who lost their voices. This is also an important breakthrough in the direction of personalized human-machine interfaces. For example, a mom can easily configure an audiobook reader with her own voice. The method [additionally] allows creation of original digital content. Hundreds of characters in a video game would be able to have unique voices because of this technology. Another interesting application is speech-to-speech language translation, as the synthesizer can learn to mimic the speaker identity in another language.”

Technologies like Deep Voice reflect the rapid advance of machine learning. However, some are concerned that the system could be abused to fabricate interviews, news segments and press conferences. With concerns that this new cloning technology could lead to even more fake news, let’s hope it does not get into the wrong hands.

Read more: Neural Voice Cloning With A Few Samples (White Paper)

Image credit: sifotography / 123RF Stock Photo

