
Google Explains How ‘Translate’ Achieves Real-Time Translations

July 29, 2015 - Written By Justin Diaz

Earlier this morning Google announced some pretty big changes coming to Google Translate in the latest version of the app, which should roll out to users over the next couple of weeks. In this update, Google has added 18 new languages for which Translate can perform two-way instant visual translation of printed text, plus two more languages that support instant visual translation from English only, for a total of 27 supported languages for this feature. Google demonstrated Translate’s capabilities in a video in its initial blog post about the new features (you can see it below), and now it is also explaining to users how everything actually works. Hint: it uses deep learning.

The app starts with the camera image, filtering out background objects so that it only has to deal with the letters in whatever words you’re trying to translate. That is just the first step, though. Next, the app has to recognize the letters. Using a “convolutional neural network,” Google trained the Translate app to tell the difference between letters and non-letters. Google also mentions, however, that it couldn’t simply train the app on letters that look perfect. It had to train it to recognize imperfect letters as well, because that’s how text often ends up looking in the real world. To simulate this, Google had its letter generator cook up letters with all the kinds of imperfections one might find in reality. The generator actually creates letters with fake dirt, reflections, and smudges as part of the process.
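To make the idea of a letter generator a bit more concrete, here is a minimal sketch in Python of how clean letter images could be roughed up with fake dirt, a reflection, and a smudge before being used as training data. This is not Google's generator; the 5x5 bitmap, the function names, and the specific noise choices are all assumptions made purely for illustration.

```python
import random

# Hypothetical 5x5 bitmap of the letter "T": 1.0 is ink, 0.0 is background.
CLEAN_T = [
    [1.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0, 0.0],
]


def add_dirt(img, rng, prob=0.1):
    """Darken random pixels to imitate specks of dirt (assumed noise model)."""
    return [
        [max(px, rng.uniform(0.5, 1.0)) if rng.random() < prob else px for px in row]
        for row in img
    ]


def add_reflection(img, rng, strength=0.5):
    """Fade one randomly chosen column, roughly as a strip of glare would."""
    col = rng.randrange(len(img[0]))
    return [
        [px * (1.0 - strength) if c == col else px for c, px in enumerate(row)]
        for row in img
    ]


def smudge(img):
    """3x3 box blur: each pixel becomes the mean of its in-bounds neighbours."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            vals = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out


def make_training_example(glyph, seed):
    """Apply dirt, then a reflection, then a smudge to a clean glyph."""
    rng = random.Random(seed)  # seeded so the same example can be regenerated
    img = add_dirt(glyph, rng)
    img = add_reflection(img, rng)
    return smudge(img)
```

Calling `make_training_example(CLEAN_T, seed)` with different seeds yields many slightly different, imperfect versions of the same letter, which is the kind of variety a classifier needs to see if it is going to cope with real-world signs and menus.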

Next, the app takes all of the letters it has recognized, assembles them into words, looks those words up in a dictionary to retrieve their translations, and then places each translation on top of the original word. All of this sounds pretty complex, and it is, which makes shrinking everything down to run on a phone, especially for users who may not have a data connection, an even more impressive feat. Google basically had to create a tiny neural net, with limits on how much it could try to teach it. Now that you have a broad idea of how it works, go out and try things for yourself.
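As a rough illustration of the lookup step described above, the Python sketch below groups recognized letters into words, looks each word up in a dictionary, and records where the translation should be drawn. The input format (a character plus x and y positions), the pixel gap used to split words, and the tiny English-to-Spanish dictionary are all hypothetical, and the sketch assumes a single line of text; the real app's internals aren't described in that much detail.

```python
# Toy dictionary standing in for the app's real one (assumed for illustration).
EN_TO_ES = {"exit": "salida", "only": "solo"}


def group_into_words(letters, gap=12):
    """Sort letters left to right and start a new word at each large gap."""
    words = []
    current = []
    for ch, x, y in sorted(letters, key=lambda item: item[1]):
        if current and x - current[-1][1] > gap:
            words.append(current)
            current = []
        current.append((ch, x, y))
    if current:
        words.append(current)
    return words


def translate_words(letters, dictionary, gap=12):
    """Return (translated text, x, y) overlays, one per recognized word."""
    overlays = []
    for word in group_into_words(letters, gap):
        text = "".join(ch for ch, _, _ in word).lower()
        # Fall back to the original text when the dictionary has no entry.
        translated = dictionary.get(text, text)
        _, x, y = word[0]  # draw the translation where the word starts
        overlays.append((translated, x, y))
    return overlays


# Hypothetical recognizer output for a sign reading "EXIT ONLY".
letters = [
    ("E", 0, 5), ("X", 10, 5), ("I", 20, 5), ("T", 30, 5),
    ("O", 60, 5), ("N", 70, 5), ("L", 80, 5), ("Y", 90, 5),
]
overlays = translate_words(letters, EN_TO_ES)
# overlays == [("salida", 0, 5), ("solo", 60, 5)]
```

Because everything here is a plain dictionary lookup, it needs no network access, which fits the article's point about the feature working for users without a data connection.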
