Microsoft is developing AI that can imitate human voices in just three seconds » Leadersnet

By Alexander Schöpf


To prevent manipulation, Microsoft is working on separate software that can recognize audio clips created with “VALL-E”.

VALL-E is the name of a new artificial intelligence (AI) developed by Microsoft that can imitate human voices deceptively well. As the Austrian daily newspaper Der Standard reports, a sound snippet just three seconds long is enough to imitate a voice, including the speaker's emotional tone and the acoustics of the room in which the sample was recorded.

Wide field of application and fear of manipulation

The technology company sees a wide range of applications for the groundbreaking technology. For one, it could enable high-quality text-to-speech functions: a text message could be read aloud in the sender’s voice, for example. For another, slips of the tongue in a recording could be corrected after the fact.

Of course, this also opens the door to manipulation. People's statements could be altered after the fact, or fabricated entirely, without anyone noticing. To prevent this, Microsoft wants to develop software that recognizes when an audio clip was created with VALL-E.

First sound clips released

However, the AI will not be available to the general public for the time being, as it is still a research project. To illustrate the potential of VALL-E, the research team has released some sound clips that show the artificial intelligence in action.
