In the latest version of Android, Google has redeveloped its voice recognition software and how it interprets and processes what you say. The new software is based on a neural network, a computerized learning system that functions similarly to the human brain.
The results of the new system were dramatic: Vincent Vanhoucke, a researcher at Google, states that the voice error rate in the latest version of Android is 25% lower than in previous versions, which makes people more comfortable using voice commands instead of doing everything manually. He also notes that users now talk in a more natural way when using voice commands, rather than speaking like a robot, which has helped make the feature feel more accessible and natural.
When you use Android’s voice recognition, your speech is now converted into a spectrogram, which is split up and sent to eight different Google servers for processing; thanks to advancements made by Google, breaking up and processing your voice commands is now much faster. This data is then processed using the neural network algorithms developed by Vanhoucke and his team.
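The article doesn’t spell out how Google represents or partitions the audio internally, but the two-stage idea it describes — turn the waveform into a spectrogram, then split the result into chunks for parallel processing — can be sketched in plain Python. Everything here is illustrative: the frame size, hop length, and naive DFT are assumptions, and only the eight-server figure comes from the article.

```python
import cmath
import math

def spectrogram(samples, frame_size=64, hop=32):
    """Split a waveform into overlapping frames and take the magnitude
    of each frame's discrete Fourier transform (a naive O(n^2) DFT)."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        spectrum = []
        # Only the first half of the bins is kept: real input is symmetric.
        for k in range(frame_size // 2):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n, x in enumerate(frame))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

def split_for_servers(frames, n_servers=8):
    """Partition the spectrogram frames into contiguous chunks,
    one chunk per server (8 servers, per the article)."""
    chunk = math.ceil(len(frames) / n_servers)
    return [frames[i:i + chunk] for i in range(0, len(frames), chunk)]

# Toy input: a 440 Hz sine wave sampled at 8 kHz.
signal = [math.sin(2 * math.pi * 440 * t / 8000) for t in range(2048)]
spec = spectrogram(signal)
chunks = split_for_servers(spec)
print(len(spec), "frames split into", len(chunks), "chunks")
```

A production system would use an FFT and stream the chunks over the network rather than build them in memory, but the shape of the pipeline is the same.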
The way the data is processed in Jelly Bean is revolutionary: the interpretation of the message is broken into steps, starting from the basics of working out which vowels and consonants were used, which are then assembled into words and then into a sentence. All the while, the computers make sophisticated guesses about what the user is saying, comparing them to real-life data and logical speech patterns. This process is strikingly similar to the way a human mind discerns words and meaning in a conversation.
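The step-by-step assembly described above — sounds first, then words, then a sentence — can be illustrated with a toy decoder. This is not Google’s algorithm: the per-frame phoneme probabilities, the greedy best-pick-and-collapse step, and the tiny lexicon are all hypothetical stand-ins for what a real acoustic model and language model would do.

```python
# Toy lexicon mapping phoneme sequences to words (hypothetical entries).
LEXICON = {
    ("HH", "AH", "L", "OW"): "hello",
    ("W", "ER", "L", "D"): "world",
}

def best_phonemes(frame_scores):
    """Step 1: pick the most likely phoneme in each audio frame, then
    collapse consecutive repeats (a greedy stand-in for real decoding)."""
    picks = [max(scores, key=scores.get) for scores in frame_scores]
    return [p for i, p in enumerate(picks) if i == 0 or p != picks[i - 1]]

def phonemes_to_words(phonemes):
    """Step 2: greedily match the longest phoneme run against the lexicon
    to assemble words, which together form the sentence."""
    words, i = [], 0
    while i < len(phonemes):
        for j in range(len(phonemes), i, -1):
            word = LEXICON.get(tuple(phonemes[i:j]))
            if word:
                words.append(word)
                i = j
                break
        else:
            i += 1  # no word matched: skip this phoneme
    return words

# Per-frame phoneme probabilities, as a neural network might emit them.
frames = [
    {"HH": 0.9, "AH": 0.1}, {"AH": 0.8, "L": 0.2}, {"L": 0.7, "OW": 0.3},
    {"OW": 0.9, "L": 0.1}, {"OW": 0.6, "W": 0.4},
    {"W": 0.9, "ER": 0.1}, {"ER": 0.8, "L": 0.2}, {"L": 0.9, "D": 0.1},
    {"D": 0.7, "ER": 0.3},
]
print(" ".join(phonemes_to_words(best_phonemes(frames))))  # prints "hello world"
```

The “sophisticated guesses” the article mentions correspond to the scores in each frame: a real system keeps many candidate phonemes and words alive at once and ranks whole sentences against speech patterns seen in training data, rather than committing greedily at each step as this sketch does.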
Google is also using neural networks for image recognition, breaking up the process of identifying an image in a way that mirrors how the human brain works; we now find this in applications such as Google Goggles and Google Now. While the Jelly Bean voice recognition system is reaching maturity, we probably won’t see the full potential of the image recognition software until the release of Google Glass, where Google’s entire database will be used to augment your perception of the world.