Project Euphonia’s new step: 1,000 hours of speech recordings
Muratcan Cicek, a PhD candidate at UC Santa Cruz, worked as a summer intern on Google’s Project Euphonia, which aims to improve computers’ abilities to understand impaired speech. This work was especially relevant and important for Muratcan, who was born with cerebral palsy and has a severe speech impairment.
Before his internship, Muratcan recorded 2,000 phrases for Project Euphonia. These phrases, expressions like “Turn the lights on” and “Turn up thermostat to 74 degrees,” were used to build a personalized speech recognition model that could better recognize the unique sound of his voice and transcribe his speech. The prototype allowed Muratcan to share the transcription in a video call so others could better understand him. He used the prototype to converse with co-workers, give status updates during team meetings and connect with people in ways that were previously impossible. Muratcan says, “Euphonia transformed my communication skills in a way that I can leverage in my career as an engineer without feeling insecure about my condition.”
1,000 hours of speech samples
The phrases that Muratcan recorded were key to training custom machine learning models that could help him be more easily understood. To help other people that have impaired speech caused by ALS, Parkinson’s disease or Down Syndrome, we need to gather samples of their speech patterns. So we’ve worked with partners like CDSS, ALS TDI, ALSA, LSVT Global, Team Gleason and CureDuchenne to encourage people with speech impairments to record their voices and contribute to this research.
Since 2018, nearly 1,000 participants have recorded over 1,000 hours of speech samples. For many, it’s been a source of pride and purpose to shape the future of speech recognition, not only for themselves but also for others who struggle to be understood.
I contribute to this research so that I can help not only myself, but also a larger group of people with communication challenges that are often left out.
While the technology is still under development, the speech samples we’ve collected helped us create personalized speech recognition models for individuals with speech impairments, like Muratcan. For more technical details about how these models work, see the Euphonia and Parrotron blog posts. We’re evaluating these personalized models with a group of early testers. The next phase of our research aims to improve speech recognition systems for many more people, but it requires many more speech samples from a broad range of speakers.
How you can contribute
To continue our research, we hope to collect speech samples from an additional 5,000 participants. If you have difficulty being understood by others and want to contribute to meaningful research to improve speech recognition technologies, learn more and consider signing up to record phrases. We look forward to hearing from more participants and experts— and together, helping everyone be understood.