Microsoft's Speech Recognition Almost Matches Human Hearing


Voice dictation is an easy and rapidly improving way to interact with electronics. Virtually all modern smartphones have built-in voice assistants, as well as the ability to type by voice. The accuracy of most speech recognition software still needs quite a bit of work, however, as it is prone to getting words wrong, particularly in noisy environments.

Microsoft has now made a breakthrough that will significantly increase the accuracy of speech recognition software. Last month, the company's speech recognition research and development team reported a word error rate (WER) of 6.3%, which means that roughly 93.7% of the words dictated were transcribed accurately. In the latest test, the speech recognition system achieved a WER of only 5.9%, the lowest ever recorded for such a system. According to Xuedong Huang, chief speech scientist, this level of accuracy matches that of human speech recognition: the system is as accurate as, or more accurate than, professional transcriptionists at recognizing words.

With such accuracy, people will no longer have to make a special effort to pronounce words clearly in order to be understood. They can simply speak naturally, and in most cases the software will still understand what is being said. As for practical applications, Microsoft may have plans to include the speech recognition system in game consoles, mobile devices, and computers, and is likely to use the technology to improve the accuracy of its own smart assistant, Cortana.

Although the latest testing has shown very good results, the system is still not completely accurate and will miss a few words from time to time. It will do so less often overall than a human listener would, however, which is quite impressive.
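To make the word error rate figures above more concrete, here is a minimal sketch of how WER is commonly computed: the number of word substitutions, deletions, and insertions needed to turn the system's output into the reference transcript, divided by the number of words in the reference. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not Microsoft's evaluation code, and real evaluations normalize text more carefully before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Return WER = (substitutions + deletions + insertions) / reference words.

    Illustrative sketch only: words are lowercased and split on whitespace,
    which is a simplifying assumption about text normalization.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words (word-level Levenshtein).
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # i deletions needed to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,         # deletion: reference word missing
                curr[j - 1] + 1,     # insertion: extra hypothesis word
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcript pair: one wrong word out of six.
    wer = word_error_rate("the cat sat on the mat", "the cat sat on a mat")
    print(f"WER: {wer:.1%}")
```

Because inserted words also count as errors, WER is not strictly the complement of the share of words transcribed correctly, so a figure such as "93.7% accurate" is best read as an approximation.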

As technology companies try to make their hardware easier to use, speech recognition is quickly becoming one of the most common ways that consumers interact with their smart devices. In fact, some devices, such as Google’s new connected speaker, Google Home, rely entirely on voice recognition. While most modern solutions are fairly accurate, these latest improvements will likely make devices much easier to communicate with by voice and may open up new ways for users to interact with them.