Many people lose the ability to speak as a result of illness, even though their cognitive functions remain intact. With advances in AI, many researchers are therefore working on synthesizing natural speech (vocalization) by combining brain implants with neural networks. If successful, this technology could also be extended to help people who have difficulty vocalizing due to conditions such as cerebral palsy or autism.
Image source: unsplash.com
For a long time, most of the funding and attention went to implants that let people with severe disabilities use a keyboard, control robotic arms, or partially regain the use of paralyzed limbs. In parallel, a number of researchers have been developing vocalization technologies that convert thought patterns into speech.
“We’re making great progress. Our main goal is to make the transfer from brain signals to a synthetic voice as seamless as a conversation between two people,” said Edward Chang, a neurosurgeon at the University of California, San Francisco. “The AI algorithms we use are getting faster, and we’re learning with every new participant in our studies.”
In March 2025, Chang and colleagues published a paper in Nature Neuroscience describing their work with a paralyzed woman who had been unable to speak for 18 years after a stroke. With the scientists’ help, she trained a neural network by silently attempting to produce sentences drawn from a vocabulary of 1,024 words. Her voice was then synthesized by streaming her neural data into a joint speech synthesis and text decoding model.
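The streaming idea can be made more concrete with a short, hypothetical PyTorch sketch. This is not the published model: the channel count, chunk length, vocabulary size and layer choices are illustrative assumptions. What it shows is the core principle described above: neural features are decoded chunk by chunk, with hidden state carried forward, so text tokens and acoustic frames start to appear within about a second rather than after the whole sentence has been attempted.

```python
# Minimal, hypothetical sketch of a streaming joint text-and-speech decoder.
# All dimensions (253 channels, 8-frame chunks, 1,024-word vocabulary, 80 mel
# bins) are illustrative assumptions, not parameters from the published study.
import torch
import torch.nn as nn

class StreamingBrainToVoiceDecoder(nn.Module):
    def __init__(self, n_channels=253, hidden=256, vocab=1024, n_mels=80):
        super().__init__()
        # Shared recurrent encoder over incoming neural-feature chunks.
        self.encoder = nn.GRU(n_channels, hidden, batch_first=True)
        # Two output heads: one for text tokens, one for acoustic (mel) frames.
        self.text_head = nn.Linear(hidden, vocab)
        self.audio_head = nn.Linear(hidden, n_mels)

    def forward(self, chunk, state=None):
        # chunk: (batch, time, n_channels) neural features for a short window.
        out, state = self.encoder(chunk, state)
        return self.text_head(out), self.audio_head(out), state

# Toy streaming loop: decode each chunk as it "arrives", carrying hidden state
# forward so output begins long before the sentence is finished.
model = StreamingBrainToVoiceDecoder()
state = None
for _ in range(10):                      # 10 simulated chunks of neural data
    chunk = torch.randn(1, 8, 253)       # fake features: 8 frames x 253 channels
    text_logits, mel_frames, state = model(chunk, state)
    tokens = text_logits.argmax(dim=-1)  # greedy per-frame word/sub-word guesses
    # mel_frames would be fed to a vocoder to produce audible speech.
```

In a real system the greedy per-frame argmax would be replaced by a proper sequence decoder and the mel frames by a trained vocoder, but the chunk-by-chunk loop is what turns an eight-second wait into roughly a one-second delay.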
Image source: New England Journal of Medicine
The technology reduced the delay between the patient’s brain signals and the resulting sound from the original eight seconds to about one second, much closer to the natural 100–200 millisecond interval of ordinary speech. The system’s median decoding speed reached 47.5 words per minute, roughly a third of the pace of normal conversation.
Similar research has been carried out by Precision Neuroscience, with CEO Michael Mager claiming their approach can capture brain signals at higher resolution due to “densely packed electrodes.”
So far, Precision Neuroscience has successfully tested its sensors on 31 patients and has even received regulatory approval to leave them implanted for up to 30 days. Mager says this will allow the neural network to be trained within a year on “the largest repository of high-resolution neural data that exists on planet Earth.” The next step, Mager says, is to “miniaturize the components and put them in hermetically sealed, biocompatible packages so they can be permanently implanted in the body.”
Image source: UC Davis Health
The biggest hurdle to developing and using brain-to-voice technology is the time it takes for patients to learn how to use the system. A key unresolved issue is how much response patterns in the motor cortex—the part of the brain that controls voluntary actions, including speech—vary among individuals. If they are similar, pre-trained models could be used with new patients, speeding up the individual training process, which takes tens or even hundreds of hours.
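One way such pre-trained models could cut per-patient training time is classic transfer learning: keep a decoder trained on earlier participants frozen and fit only a small, patient-specific input layer on the new person’s data. The sketch below illustrates that idea; the class names, dimensions and training step are illustrative assumptions, not details from the published studies.

```python
# Hypothetical transfer-learning sketch: a shared decoder pre-trained on
# earlier participants is frozen, and only a small patient-specific adapter
# is trained on the new patient's (much smaller) dataset.
import torch
import torch.nn as nn

class SharedDecoder(nn.Module):
    """Stand-in for a text decoder pre-trained on earlier participants."""
    def __init__(self, in_dim=256, hidden=256, vocab=1024):
        super().__init__()
        self.rnn = nn.GRU(in_dim, hidden, batch_first=True)
        self.text_head = nn.Linear(hidden, vocab)

    def forward(self, x):
        out, _ = self.rnn(x)
        return self.text_head(out)

class PatientAdapter(nn.Module):
    """Small patient-specific layer mapping this patient's electrodes into the
    shared feature space the pre-trained decoder expects."""
    def __init__(self, n_channels=253, shared_dim=256):
        super().__init__()
        self.proj = nn.Linear(n_channels, shared_dim)

    def forward(self, x):
        return self.proj(x)

# Freeze the shared decoder; train only the adapter on the new patient's data.
decoder = SharedDecoder()
for p in decoder.parameters():
    p.requires_grad = False

adapter = PatientAdapter()
optimizer = torch.optim.Adam(adapter.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# One illustrative training step on fake labelled recordings.
neural = torch.randn(4, 50, 253)            # 4 attempts x 50 frames x 253 channels
labels = torch.randint(0, 1024, (4, 50))    # aligned word/sub-word targets
logits = decoder(adapter(neural))
loss = loss_fn(logits.reshape(-1, 1024), labels.reshape(-1))
loss.backward()
optimizer.step()
```

If motor-cortex patterns do turn out to be broadly shared, only the adapter would need the patient’s own recordings, which is where the hoped-for savings over tens or hundreds of hours of individual training would come from.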
Researchers working on vocalization agree that decoding “inner thoughts”, that is, what a person does not want to express, is off-limits. As one scientist put it, “there are many things I do not say out loud because they will not benefit me or may harm others.”
To date, scientists remain far from vocalization comparable to ordinary conversation. Although decoding accuracy has reached 98%, voice output is not yet instantaneous and cannot convey important speech features such as tone and mood. Researchers hope eventually to create a vocal neuroprosthesis with the full expressive range of the human voice, so that patients can control the tone and rhythm of their speech and even sing.