Artificial intelligence has been taught to determine a person's appearance from their voice


Scientists keep teaching artificial intelligence to perform more and more tasks. For example, it can now generate human faces from a voice, reflecting the speaker's gender, ethnicity, and age rather than individual characteristics.




The scientists named this method Speech2Face. The neural network, which "thinks" in a way similar to the human brain, was shown a million educational video clips from the internet featuring over 100,000 speakers.


From this dataset, Speech2Face learned associations between vocal cues and specific physical features of the human face. The AI then used an audio recording to model a photorealistic face that would match the voice. The results were published online on May 23.

Fortunately, artificial intelligence cannot (so far) determine precisely what a particular person looks like based on their voice alone. The authors of the study reported that the neural network recognizes markers in speech that indicate gender, age, and ethnicity, features that are shared by many people.


The faces generated by Speech2Face, all shown facing forward with neutral expressions, did not precisely match the people whose voices the AI analyzed. However, the images usually reflected the correct age, ethnicity, and gender of the speakers.

The AI also made errors when it encountered language variations. For example, when the neural network "listened" to an audio recording of an Asian person speaking Chinese, the program generated an image of an Asian face. However, when the same person spoke English in another audio clip, the AI generated the face of a white person.


The algorithm also demonstrated gender bias, linking low-pitched voices with male faces and high-pitched voices with female faces.