Speech recognition was everywhere at CES

You see reports from CES like "Clearly, voice control/voice recognition is the number one trend of this year", "Voice-controlled assistants steal the show" and "The most talked about tech at CES wasn’t on display."

Voice recognition was applied to everything - from controlling cars, showers, mirrors, and even toilets, to medicine. The Digital Health Summit had talks on "How voice-assisted technology is creating opportunities to improve medication and therapy adherence..." and "How to leverage Alexa and Google Home for guiding patients via voice-enabled technology to improve study success".

Voice understanding

We demonstrated an AI system with a difference. Other voice recognition systems lump everyone together - they are designed to recognize anyone, regardless of where they're from, and do their best to ignore the differences between speakers. Our machine learning demo took the opposite approach - it looks for those differences. This could have many applications - inferring the demographics of the speaker, reading their mood, or discerning their context.

To demonstrate this ability to discern differences, the demo was trained to differentiate between British and American accents. The user speaks a phrase and the demo shows the estimated likelihood of each of the two accents. It is impressive, even though only a few hundred speakers were used for its training. It could be trained to differentiate anything else of interest - age, gender, mood, interest, etc. What does your voice say about you?
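As a rough illustration of what "showing the likelihood of the two accents" can mean, here is a minimal sketch that turns two raw classifier scores into percentages with a softmax. The function name and the example scores are made up for illustration; this is not the demo's actual code, and the demo may compute its likelihoods differently.

```python
import math

def accent_likelihoods(scores):
    """Convert raw per-accent scores into likelihoods that sum to 1 (softmax)."""
    peak = max(scores.values())  # subtract the peak for numerical stability
    exps = {accent: math.exp(s - peak) for accent, s in scores.items()}
    total = sum(exps.values())
    return {accent: e / total for accent, e in exps.items()}

# Hypothetical raw scores for one spoken phrase (assumed values, not real data).
likelihoods = accent_likelihoods({"British": 1.2, "American": 0.3})
for accent, p in likelihoods.items():
    print(f"{accent}: {p:.0%}")
```

The higher-scoring accent always gets the larger share, and the two shares always add up to 100%.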

Another application of this technology could be to solve the problem of "watch people with accents confuse the hell out of AI assistants", by first determining the accent, then applying machine learning tailored for that accent.
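The two-step idea above (determine the accent first, then hand the audio to a recognizer tuned for it) could be wired together roughly like this. Everything here is a placeholder: `detect_accent`, the two recognizer functions, and the byte-prefix trick are hypothetical stand-ins for real trained models, shown only to make the routing explicit.

```python
from typing import Callable, Dict

def detect_accent(audio: bytes) -> str:
    """Placeholder accent detector; a real one would analyze the audio itself."""
    return "British" if audio.startswith(b"uk") else "American"

def recognize_british(audio: bytes) -> str:
    """Placeholder for a speech recognizer tuned on British speakers."""
    return "transcript from the British-tuned model"

def recognize_american(audio: bytes) -> str:
    """Placeholder for a speech recognizer tuned on American speakers."""
    return "transcript from the American-tuned model"

# Map each detected accent to the recognizer tailored for it.
RECOGNIZERS: Dict[str, Callable[[bytes], str]] = {
    "British": recognize_british,
    "American": recognize_american,
}

def transcribe(audio: bytes) -> str:
    """Step 1: determine the accent. Step 2: use the matching recognizer."""
    accent = detect_accent(audio)
    return RECOGNIZERS[accent](audio)

print(transcribe(b"uk-sample"))
print(transcribe(b"us-sample"))
```

Keeping the accent-to-recognizer mapping in one table means supporting another accent is a matter of adding one entry, without touching the routing logic.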

Explainable AI

I now drive anywhere Waze tells me to drive - I trust the AI. But how is it making its decisions? As AI becomes more and more ubiquitous, how will we know if we can trust it? Is it making its decisions on our behalf, or on behalf of the company that created it? How do we test and regulate it? How does the FDA decide whether a device using AI is safe? That's where Explainable AI comes in.

Explainable AI (XAI) is becoming an active area of research, and has gained new impetus with the EU's General Data Protection Regulation, which becomes enforceable from 25 May 2018. For decisions made on a solely algorithmic basis, it provides a "right to explanation." This is causing controversy, such as: "EU's Right to Explanation: A Harmful Restriction on Artificial Intelligence". The impact will be spreading to many domains: "What do we need to build explainable AI systems for the medical domain?" But clearly, the area will continue to develop: "Explain Yourself Machine".

The accent demo actually consists of two AIs: one to distinguish the accent, and the other to explain how the first made its decision. The latter uses a state-of-the-art approach to probe the first AI and 'understand its thinking'. It can probe any AI system to gain insights into what contributes to its decisions. In our demonstration of the explainable AI, the speaker's text is shown on the screen, color coded to show which words influenced the decision towards American and which towards British, and how strongly. This is just one way of showing how it's working. Explainable AI is useful - it gives insights into how the AI functions, it provides information to help debug the system when something goes wrong, and it will eventually become a tool used by regulators and testers in evaluating the AI.
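To give a feel for how one AI can probe another, here is a minimal sketch of one simple probing technique: remove each word in turn and see how much the classifier's answer changes. This is not the method used in the demo, which is not described in detail here. The toy word weights, the `british_probability` classifier, and the example phrase are all invented for illustration.

```python
import math

# Toy word weights standing in for a trained accent classifier (assumption).
# Positive values lean British, negative values lean American.
TOY_WEIGHTS = {"lorry": 2.0, "flat": 1.0, "truck": -2.0, "apartment": -1.0}

def british_probability(words):
    """Toy classifier: logistic function of the summed word weights."""
    score = sum(TOY_WEIGHTS.get(w.lower(), 0.0) for w in words)
    return 1.0 / (1.0 + math.exp(-score))

def word_influences(words, predict=british_probability):
    """Leave-one-word-out probing: drop each word and measure the change.

    A positive influence means the word pushed the decision towards British,
    a negative one towards American. Only `predict` is called, so the probe
    needs no knowledge of the classifier's internals.
    """
    baseline = predict(words)
    influences = []
    for i, word in enumerate(words):
        without = words[:i] + words[i + 1:]
        influences.append((word, baseline - predict(without)))
    return influences

phrase = "I parked the lorry outside my apartment".split()
for word, influence in word_influences(phrase):
    label = "British" if influence > 0 else "American" if influence < 0 else "neutral"
    print(f"{word:10s} {influence:+.3f}  {label}")
```

The sign and size of each influence are exactly what a color-coded display needs: the sign picks the color and the magnitude picks its intensity. Because the probe treats the classifier as a black box, the same loop could be pointed at any other model that returns a probability.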

Can you fake the accent?

People enjoyed trying - can you beat the machine? Test your fake British or American accent with this AI. You can try for yourself: Can we decode your ACCENT?

David Ritscher
Connected System Architect
With a focus on connected devices, consumer products, wearables, and implantable medical devices; design of sensor systems, algorithms, DSP, machine learning; experienced in bringing new concepts from ideation to research and development through successful product launch