OK folks, Siri has a new voice, backed by enhanced artificial intelligence. With iOS 11, Apple has changed Siri’s voice on the iPhone. Recent articles tout the new Siri as “more human” and “genuine”.
In the voiceover world, voice actors heed the call on many auditions to sound “real”. The combination of old and new voice synthesis techniques aims to make her sound as life-like as possible. So now Siri has a new voice. Listen to the changes.
Apple made the changes with a different voice actress, and also dove deep into speech and voice synthesis. To me, the new voice sounds younger, with a higher vocal pitch.
Apple says the approach it took to give Siri a vocal makeover wasn’t easy. To build a sentence out of recorded speech, you must select the appropriate “phone” segments and join them together. The acoustic characteristics of each “phone” depend on its neighboring “phones” and on the pattern and rhythm of speech, which often makes the speech units incompatible with each other.
It’s why previous versions of Siri sound a little robotic at times.
To solve the problem, Apple turned to deep learning and created a system that can “accurately predict both target and concatenation” elements in the database of half-phones that it has access to.
“The benefit of this approach becomes more clear when we consider the nature of speech. Sometimes the speech features, such as formants, are rather stable and evolve slowly, such as in the case of vowels. Elsewhere, speech can change quite rapidly, such as in transitions between voiced and unvoiced speech sounds. To take this variability into account, the model needs to be able to adjust its parameters according to the aforementioned variability,” the Apple team explained.
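For the curious, here is a rough idea of what "select the segments and join them together" can look like in code. This is only a toy sketch and not Apple's system: the `Unit` class, the single `pitch` feature, and the two hand-written cost functions are all made up for illustration. In the real thing, as described above, a deep learning model predicts the target and concatenation costs; here they are just simple differences in pitch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """One recorded snippet in a toy database (hypothetical structure)."""
    phone: str     # which speech sound this snippet contains
    pitch: float   # a single made-up acoustic feature, in Hz
    unit_id: int


def target_cost(unit, wanted_pitch):
    # How far the snippet is from what we want at this position (assumed metric).
    return abs(unit.pitch - wanted_pitch)


def concat_cost(prev_unit, next_unit):
    # How audible the seam between two neighbouring snippets would be (assumed metric).
    return abs(prev_unit.pitch - next_unit.pitch)


def select_units(targets, database, join_weight=1.0):
    """Pick one snippet per target so that total target + join cost is lowest.

    targets: list of (phone, wanted_pitch) pairs.
    Returns (chosen_units, total_cost). Uses a Viterbi-style search.
    """
    # Candidate snippets for each position in the sentence.
    lattice = [[u for u in database if u.phone == phone] for phone, _ in targets]
    if any(not candidates for candidates in lattice):
        raise ValueError("no recording available for at least one phone")

    # best[i][j] = (cheapest cost of reaching candidate j at position i, backpointer)
    best = [[(target_cost(u, targets[0][1]), None) for u in lattice[0]]]
    for i in range(1, len(targets)):
        row = []
        for u in lattice[i]:
            cost, back = min(
                (best[i - 1][k][0] + join_weight * concat_cost(p, u), k)
                for k, p in enumerate(lattice[i - 1])
            )
            row.append((cost + target_cost(u, targets[i][1]), back))
        best.append(row)

    # Walk the backpointers from the cheapest final candidate.
    j = min(range(len(best[-1])), key=lambda k: best[-1][k][0])
    total = best[-1][j][0]
    path = []
    for i in range(len(targets) - 1, -1, -1):
        path.append(lattice[i][j])
        j = best[i][j][1]
    path.reverse()
    return path, total


if __name__ == "__main__":
    db = [Unit("h", 120.0, 0), Unit("h", 200.0, 1),
          Unit("ay", 125.0, 2), Unit("ay", 210.0, 3)]
    chosen, cost = select_units([("h", 130.0), ("ay", 130.0)], db)
    print([u.unit_id for u in chosen], cost)
```

The point of the search is that the best snippet for one sound is not always the best choice once you account for how well it joins to its neighbours, which is exactly the incompatibility problem described above.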
Will Voice Actors be replaced by AI?
So AI (artificial intelligence) has come a long way, no doubt. If you want to see all the geeky technical stuff, read the paper on Siri that the Apple team published. I guarantee it’ll make your head spin!
After all that, I strongly believe that there’s one thing humans have that artificial intelligence will never have: EMOTIONAL INTELLIGENCE!
This is all very well with a single sentence (which is usually all Siri has to provide), but humanising over long passages is, I think, much harder.
Very insightful, Tim. And completely agreed. I think AI is a long way off from sounding human and conveying any point of view, which speaks to your point.