Chorus.ai is a business sales tool that records, transcribes, and analyzes conversations, forming a voice fingerprint for each person on a given call and figuring out who is speaking and why, all through highly specialized natural language processing systems. Everything runs on a custom hardware and software stack that Chorus.ai built explicitly for this purpose, resulting in a product that's wildly different from any other speech recognition and natural language processing AI program on the market. AndroidHeadlines had a chat with Dr. Micha Breakstone, the company's chief scientist, to get a breakdown of how it all works and why it's so special.
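Chorus.ai hasn't published its model internals, but the core idea of a voice fingerprint can be illustrated with a common technique from speaker identification: represent each known speaker as a fixed-length embedding vector and match new speech segments against those vectors by cosine similarity. The function names, threshold, and vectors below are illustrative assumptions, not Chorus.ai's actual implementation:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def identify_speaker(segment_embedding, fingerprints, threshold=0.75):
    """Match one speech segment's embedding against known voice fingerprints.

    fingerprints maps speaker names to stored embedding vectors. Returns the
    best-matching name, or None if nobody clears the similarity threshold
    (i.e. the segment likely belongs to a new, unknown speaker).
    """
    best_name, best_score = None, threshold
    for name, fingerprint in fingerprints.items():
        score = cosine_similarity(segment_embedding, fingerprint)
        if score > best_score:
            best_name, best_score = name, score
    return best_name
```

In a real system the embeddings would come from a trained neural network rather than being hand-specified, which is where the deep learning described below comes in.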
For starters, the system was built around one main goal – business. Recognizing speakers, spotting pain points, and keeping an accurate log of conversations were the biggest components of that goal. To that end, Chorus.ai ordered specially modified GPUs made for the sole purpose of running intensive AI training operations. Deep learning underpins most of what Chorus.ai does. The company's natural language processing systems work in concert with a language model and, for video calls, optical character recognition, giving the system multiple ways to figure out who's speaking, what they're saying, and most importantly, what it means in context. Once the product lands on a client's servers, it automatically starts building up a database of that client's conversations, related jargon, and common repetitions.
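A minimal sketch of that per-client database of jargon and common repetitions, assuming nothing about Chorus.ai's actual storage format: fold each new transcript into a running phrase counter, so client-specific terms surface simply by appearing again and again across calls. The function name and n-gram approach are illustrative choices:

```python
from collections import Counter

def update_phrase_counts(counts, transcript, n=2):
    """Fold one call transcript into a running database of common phrases.

    Counts every n-gram (default: word pairs) in the transcript, so jargon
    that a client's teams repeat often rises to the top of the counter as
    more conversations are processed.
    """
    words = transcript.lower().split()
    for i in range(len(words) - n + 1):
        counts[" ".join(words[i:i + n])] += 1
    return counts
```

Calling `counts.most_common(20)` after many transcripts would then yield that client's most characteristic phrases.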
This is how the system gets the bulk of its training, with three million conversations and counting, and it means there is no such thing as a failed or faulty voice fingerprint. A fingerprint may take longer than usual to create in some cases, but it will eventually form, and once it does, it keeps adapting and training as it's exposed to the same voice more often. The end result is speech recognition and natural language processing that, according to Dr. Breakstone, is up to 25 percent more accurate than that of bigger players whose systems are built for broader use cases. On top of that, it can be customized by companies or even by individual users within a larger firm. They can define things like risk factors, key phrases, and common upsell opportunities, then have the system alert them whenever those are found.
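The user-configured alerts described above amount to scanning each transcript against per-category watchlists. The sketch below is a deliberately simple substring matcher under assumed names; a production system would presumably match semantically rather than literally:

```python
def scan_for_alerts(transcript, watchlists):
    """Flag configured phrases in a call transcript.

    watchlists maps a category name (e.g. "risk factors", "upsell") to a
    list of phrases a user cares about. Returns a dict of only the
    categories that matched, each with the phrases that were found.
    """
    text = transcript.lower()
    hits = {}
    for category, phrases in watchlists.items():
        found = [p for p in phrases if p.lower() in text]
        if found:
            hits[category] = found
    return hits
```

Each user within a firm could supply their own watchlists, mirroring the per-user customization the article describes.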
Dr. Breakstone stated that the company is open to licensing its technology in the form of APIs that can be integrated into an application's stack, but licensing the core technology for direct integration into somebody else's speech recognition stack is out of the question. This means that your Google Home or Amazon Echo won't be making pinpoint-accurate voice fingerprints of individual speakers or identifying key phrases and actionable tips in conversational contexts anytime soon. Chorus.ai is still in its startup phase, but with six PhD holders on staff and 15 industry publications to its name, it likely won't stay that way for long. It has already raised $20 million from Emergence Capital and Redpoint Ventures and, its management believes, is poised for long-term growth.