DeepMind Learns Lip Reading From TV

It's not easy to defend watching hours of television as anything more than entertainment. Sure, some programs offer educational value, but even those can be overwhelming in marathon doses. Most people would have a hard time justifying a binge of over 5,000 hours of TV, which works out to more than half a year of nonstop viewing. Google's DeepMind AI, on the other hand, has a defense that most humans simply can't match: it watched all those hours of TV in order to learn lip reading, and it ended up performing on par with, and even better than, trained human professionals.

The DeepMind AI is based on deep learning, a branch of machine learning that trains many-layered neural networks on large volumes of example data. Rather than being programmed with explicit rules for a task, a deep learning system is fed examples and learns very specific things from them. Such was the case with DeepMind's recent TV binge; the AI went in knowing very little about lip reading and plunged into 5,000 hours' worth of TV content, which took it through about 118,000 complete sentences. The training, conducted in conjunction with the University of Oxford, drew on six different TV shows, all from the BBC.

Some of the raw TV programming that DeepMind viewed had a timing mismatch between the sound of speech and the movement of the speaker's lips. To combat this, DeepMind was fed a number of basic human speech sounds and their associated mouth shapes, and from there it was able to realign the content on its own. Matched against a professional lip reader, the AI pulled out a clear victory: the pro correctly deciphered some 12.4% of the content they were shown, while DeepMind correctly determined what was being said about 46.8% of the time. That also puts it far ahead of any other automated lip reader out there. Use cases for a lip reading AI may not seem numerous at first thought, but they include some rather convenient applications, such as dictating silently to your phone's virtual assistant, improving hearing aids, and helping AI programs with speech recognition.
