This AI could help speech-impaired people talk to Siri and Google

Tuesday, January 25, 2022

Originally published by Wired Middle East on January 18, 2022

MBZUAI student Karima Kadaoui is developing algorithms to help speech-impaired people communicate and navigate their way through society.

Growing up, Karima Kadaoui was acutely aware of the difficulties that her brother faced because of his speech impairment. Now a master’s student at the Mohamed bin Zayed University of Artificial Intelligence (MBZUAI), the young Moroccan engineer is on a high-tech mission to tackle the issues she observed in her childhood. “A person with speech impairments has more trouble communicating with their environment to be understood,” says Kadaoui. “This is especially crucial, because these people usually also have other models [of] disabilities that make their daily life harder, and they need other people to help them.”


Kadaoui, a master’s student at MBZUAI, is drawing on her machine-learning expertise to build an app for the speech-impaired to communicate with others. “The idea behind the project is to have a person with speech impediments talk and have an application to translate what they said in a way that is understandable by other people,” she explains.

It’s a project designed to help people with all sorts of conditions, including strokes and cerebral palsy. People with speech impairments sometimes struggle to control the muscles used to talk, making their speech difficult to understand. The AI that Kadaoui is developing could help improve communication and allow the speech-impaired to participate in society more smoothly. As voice-enabled technologies grow increasingly important in our daily lives, Kadaoui also hopes that such a solution can ultimately be plugged into speech recognition systems like Siri and Google Assistant.

Kadaoui isn’t the first to recognize the gaps in speech recognition. Since 2019, Google has been developing algorithms that adapt speech recognition to people who have had a stroke or have other conditions that can lead to speech impairment. Amazon, meanwhile, integrated the Israeli startup Voiceitt’s app into Alexa last June, building personalized AI-powered models that can understand specific requests from speech-impaired users.

Still, it’s a problem that’s more easily identified than solved. Algorithms are only as good as the data they are trained on, and the phrases they come across most frequently become the patterns they learn from. This can be a problem when it comes to refining algorithms for the speech-impaired, who can often struggle to speak for long periods of time, according to Kadaoui. So researchers need lots of audio samples (and manual transcriptions of the speech) to make associations between sounds and words.

Google has tackled the problem by prioritizing scripted speech. It sends its volunteers some 1,500 phrases to read and record for its database, a mix of phrases that appear once and phrases that are repeated to better train the algorithms. So far, the company says it has gathered some 1,400 hours of data from more than a thousand volunteers, allowing researchers to refine their algorithms to understand different types of speech.

Kadaoui and her team are contemplating a similar fix. “We thought of making an app where a person can read a sentence that’s given to them and on their own time, record the sentences,” she says, noting that such an approach will also make submitting the data easier for the speech-impaired volunteers giving the team audio samples. “Eventually, we’ll get this continuously growing data set for speech.”
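To make the scripted approach concrete, here is a minimal sketch of how a prompt script and its matching records might be organized. It is not the method used by Kadaoui's team or by Google; the names `Recording`, `build_prompt_script`, and `record_session`, the example phrases, and the `audio/...` file layout are all assumptions made for illustration, and no audio is actually captured.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Recording:
    speaker_id: str
    prompt: str      # sentence shown to the volunteer; reused as the transcript
    audio_path: str  # hypothetical storage location of the audio file


def build_prompt_script(unique_phrases, repeated_phrases, repeats=3, seed=0):
    """Mix one-off phrases with phrases that recur `repeats` times, then shuffle."""
    script = list(unique_phrases) + list(repeated_phrases) * repeats
    random.Random(seed).shuffle(script)
    return script


def record_session(speaker_id, script):
    """Pair each prompt with a placeholder audio path; no audio is captured here."""
    return [
        Recording(speaker_id, prompt, f"audio/{speaker_id}/{i:04d}.wav")
        for i, prompt in enumerate(script)
    ]


script = build_prompt_script(
    unique_phrases=["Turn on the kitchen light", "What is the weather today"],
    repeated_phrases=["Call my brother"],
)
dataset = record_session("volunteer_001", script)
print(len(dataset))  # 5 prompts: 2 unique + 1 phrase repeated 3 times
```

Because each prompt is known in advance, it can serve as the transcript for its recording, which is one reason scripted collection reduces the need for manual transcription.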
