Neuroengineers from the Zuckerman Institute at Columbia University have created a system that directly translates brain signals into speech.
The technology monitors a subject’s brain activity to reconstruct the words the subject hears with unprecedented clarity. Speech synthesizers and artificial intelligence are integral components of the system.
This breakthrough is expected to lead to new ways for computers to communicate directly with the brain. It could also help people who are otherwise incapable of speech, such as stroke victims and people suffering from amyotrophic lateral sclerosis, regain their ability to communicate with the outside world.
These findings were published in Scientific Reports on January 29.
“Our voices help connect us to our friends, family and the world around us, which is why losing the power of one’s voice due to injury or disease is so devastating,” said Nima Mesgarani, the paper’s senior author and a principal investigator at the Zuckerman Institute. “With today’s study, we have a potential way to restore that power. We’ve shown that, with the right technology, these people’s thoughts could be decoded and understood by any listener.”
A significant body of research shows that when people speak, or even imagine speaking, distinct patterns of activity appear in the brain. Experts now foresee a future in which these patterns will no longer remain hidden, but will be translated into audio at will.
However, earlier attempts to make this a reality fell short. Dr. Mesgarani’s team therefore opted to use a vocoder, a computer algorithm that can synthesize speech after being trained on recordings of people talking.
“This is the same technology used by Amazon Echo and Apple Siri to give verbal responses to our questions,” said Dr. Mesgarani.
To train the vocoder, Dr. Mesgarani turned to Dr. Ashesh Dinesh Mehta, a neurosurgeon at Northwell Health Physician Partners Neuroscience Institute, for help.
“Working with Dr. Mehta, we asked epilepsy patients already undergoing brain surgery to listen to sentences spoken by different people, while we measured patterns of brain activity,” said Dr. Mesgarani. “These neural patterns trained the vocoder.”
Next, the researchers asked those same patients to listen to speakers reciting digits from 0 to 9, while recording brain signals that could then be run through the vocoder. The sound produced by the vocoder in response to those signals was analyzed and cleaned up by neural networks, a type of AI that mimics the structure of neurons in the biological brain.
The end result was a robotic-sounding voice reciting a sequence of numbers. To test the accuracy of the recording, Dr. Mesgarani and his team asked individuals to listen to it and report what they heard.
“We found that people could understand and repeat the sounds about 75% of the time, which is well above and beyond any previous attempts,” said Dr. Mesgarani. The improvement in intelligibility was especially evident when comparing the new recordings to the earlier, spectrogram-based attempts. “The sensitive vocoder and powerful neural networks represented the sounds the patients had originally listened to with surprising accuracy.”
Dr. Mesgarani and his team plan to test more complicated words and sentences next, and they want to run the same tests on brain signals emitted when a person speaks or imagines speaking.
“In this scenario, if the wearer thinks ‘I need a glass of water,’ our system could take the brain signals generated by that thought, and turn them into synthesized, verbal speech,” said Dr. Mesgarani. “This would be a game changer. It would give anyone who has lost their ability to speak, whether through injury or disease, the renewed chance to connect to the world around them.”