Google has reportedly been seeking help from Reddit users to improve the quality of its voice recognition software. According to The Verge, third-party contractors hired by Appen, a company working with Google on the project, have been posting to various subreddits to let users know that Google was looking to recruit people for a brief period to record voice samples. The end goal is to help Google’s voice recognition software better understand the accents it currently has more trouble with.
The report states that the posts were first spotted in the /r/Edinburgh subreddit, where they were meant to entice people to let Google record voice samples with a Scottish accent. Posts have appeared elsewhere as well, with Google reportedly seeking people with Indian accents, Chinese accents, and various American accents. Google is said to have mostly recorded people speaking the well-known catchphrase used to initiate a Google request by voice, although the “Ok Google” command was not the only sample gathered from Redditors. Samples of people saying “Hey Google” were also recorded, as were people’s responses to questions asking them to name popular TV shows, toys, and video games.
Appen, the company that hired the third-party contractors to create the subreddit posts, collects the recorded voice samples and then has them annotated by linguists it employs. The linguists may also have to break longer sentences down into smaller, more manageable grammatical chunks so they’re easier to process. According to the subreddit posts, Google set up an offer that would pay a total of $35 for 2,000 phrases, while those under the age of 17 would be offered $26 for 500 phrases. It’s not stated whether those threads are still live, and Google has apparently not confirmed any involvement in the process. While Google might do a decent enough job of improving its speech software with the machine learning tools it has at its disposal, it’s apparent that those tools can only go so far without the data needed to help its machine learning systems understand different accents. Luckily, this appears to be a rather simple way for Google to get the data it needs.