Google Outs Cloud Speech For All Cloud Platform Users

April 18, 2017 - Written By Daniel Fuller

Some of Google Cloud Platform’s coolest perks are the tools that Google puts together for customers to use on the platform with little to no effort. The newest is a pre-trained machine learning model, Google's very own Cloud Speech API. The same well-trained and powerful speech recognition technology that powers Google services people use every day, such as Search and Assistant, is now generally available for all Google Cloud Platform customers to use in their projects on the platform. It isn’t offered for free or sold on its own, but non-customers can net themselves a free trial.

Not only is Google’s own speech recognition API available to all Cloud Platform customers, but Google has also tweaked it to be more powerful and versatile, and better suited to run with Google Cloud Platform as a backend. The API already recognizes over 80 languages with a fairly high success rate, thanks to years of training on input from billions of users. Google has sweetened the deal by fine-tuning transcription to help with long audio input, optimizing the API for faster batch processing of audio samples, and adding support for WAV, Opus, and Speex files. The feature set on offer should make it easier to train the model further and more specifically, build it into just about any client-facing application, and even use it in mission-critical or specialized scenarios, such as dictating a novel.
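For readers curious what using the API looks like in practice, here is a minimal sketch of how a request body for a synchronous transcription call might be assembled. It assumes the v1 REST interface as publicly documented (the `speech:recognize` method with `config` and `audio` fields); the helper name `build_recognize_request` and its defaults are hypothetical, and the sketch deliberately stops short of sending anything, since authentication and quotas depend on your own project setup.

```python
import base64
import json

# Assumed endpoint of the v1 REST API's synchronous recognize method.
RECOGNIZE_URL = "https://speech.googleapis.com/v1/speech:recognize"


def build_recognize_request(audio_bytes, encoding="LINEAR16",
                            sample_rate_hertz=16000, language_code="en-US"):
    """Build the JSON-serializable body for a speech:recognize call.

    Hypothetical helper for illustration only. The `encoding` value must
    match the audio data; the v1 API documents values such as LINEAR16
    (uncompressed PCM, as found in WAV files), OGG_OPUS, and
    SPEEX_WITH_HEADER_BYTE.
    """
    return {
        "config": {
            "encoding": encoding,
            "sampleRateHertz": sample_rate_hertz,
            "languageCode": language_code,
        },
        "audio": {
            # Inline audio is sent as base64-encoded bytes.
            "content": base64.b64encode(audio_bytes).decode("ascii"),
        },
    }


if __name__ == "__main__":
    # Placeholder bytes standing in for real 16-bit PCM samples.
    placeholder_audio = b"\x00\x00" * 160
    body = build_recognize_request(placeholder_audio)
    print("POST", RECOGNIZE_URL)
    print(json.dumps(body["config"], indent=2))
```

The resulting dictionary would be serialized to JSON and posted to the endpoint with valid credentials. Check the current API reference before relying on any field name or encoding value shown here.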

Early adopters and testers have already made a few compelling cases for the use of this technology. Google gives two such examples in its blog post announcing general availability. The first is InteractiveTel, a telephone marketing analytics outfit that has been using the API to help monitor and optimize dealer and customer interactions on the phone. According to CTO Gary Graves, the API has been performing quite accurately and has been outputting transcriptions in close to real time. The second is Clarion, a Japan-based car infotainment specialist, which has been using the API to add voice capabilities to its products. Much like InteractiveTel’s CTO, Clarion exec Hirohisa Miyazawa had high praise for the API’s accuracy and speed.