This is something that has been in the works for quite some time now. Engineers have been trying for decades to create a program that accurately translates thoughts into clear, intelligible speech. At the moment, these programs can only reconstruct the words a person has heard by monitoring their brain activity. However, this is a huge step toward AI that can read the thoughts of people who can’t communicate verbally. An AI of that magnitude would have many other potential uses as well; the possibilities are vast. Researchers at Columbia University used a vocoder to accomplish this breakthrough, so first, a bit of background on what a vocoder is.


With a name that blends the words “voice” and “encoder,” the vocoder is a device with quite a long history, dating back to the 1930s. It was designed to analyze speech and synthesize it for communication purposes. Both of these functions allowed for more secure means of encrypting sound for broadcast over the radio.

A vocoder operates on principles similar to those behind the production of the human voice. In humans, sound is created by the opening and closing of the glottis in the vocal cords, then filtered through the nose and throat. The vocoder works by producing a sound at a base pitch, known as the fundamental frequency. This sound is then filtered in a way that brings out its subtle nuances, ensuring the pitch and tone produced are clear and distinct.
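The carrier-plus-envelope idea described above can be sketched in a few lines of Python. This is a deliberately simplified, single-band toy (real vocoders split the signal into many frequency bands): it extracts the amplitude envelope of an input signal and imposes it on a sawtooth carrier at a chosen fundamental frequency. The sample rate, window size, and test signal are all illustrative assumptions, not values from any real device.

```python
import math

SAMPLE_RATE = 8000  # Hz, assumed for this toy example

def envelope(signal, window=160):
    """Rectify the signal and smooth it with a moving average
    to recover its amplitude envelope."""
    out, buf, acc = [], [], 0.0
    for s in signal:
        buf.append(abs(s))
        acc += abs(s)
        if len(buf) > window:
            acc -= buf.pop(0)
        out.append(acc / len(buf))
    return out

def vocode(modulator, f0=110.0):
    """Impose the modulator's envelope on a sawtooth carrier at
    fundamental frequency f0 (the 'base pitch' of the output)."""
    env = envelope(modulator)
    carrier = [2.0 * ((f0 * n / SAMPLE_RATE) % 1.0) - 1.0
               for n in range(len(modulator))]
    return [e * c for e, c in zip(env, carrier)]

# A stand-in "voice": a 440 Hz tone whose loudness rises then falls.
voice = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE)
         * (n / 4000 if n < 4000 else (8000 - n) / 4000)
         for n in range(8000)]
out = vocode(voice)
```

The output rises and falls in loudness like the input, but its pitch is that of the carrier, which is the essence of the technique.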

Over the years the vocoder has been tweaked a bit so that it can be used to create music, and it remains in use today as an integral part of the auto-tuning process. More recently, however, it has been used to help program speech-synthesizing AIs in various ways.


An old vocoder.


Neuroengineers from the Zuckerman Institute at Columbia University were responsible for this recent breakthrough. Simply put, they used artificial intelligence to recognize the activity that appears in a person’s brain when that person listens to someone speak, combining a novel algorithm with a vocoder. To train the algorithm, they recorded from epilepsy patients who were already undergoing brain surgery.

Due to the invasive nature of the procedure, only 30 minutes of activity was recorded. The researchers asked the patients to listen to sentences spoken by different people while their brain activity was measured; that activity is what trained the algorithm.

After that, the researchers programmed the vocoder to synthesize speech from the brain patterns, using robotic voices similar to Amazon’s Alexa and Apple’s Siri. These two voices made the most sense at the time, seeing as they are so widely familiar. Future developments of this technology could lead to programs that can directly read thoughts and translate them into speech, though this is still decades away.
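The core decoding step, learning a mapping from measured brain activity to a sound representation that a vocoder can then voice, can be illustrated with a deliberately oversimplified sketch. This is not the study’s actual algorithm (which works on rich multi-electrode recordings); here a single scalar “activity” feature is mapped to a target envelope value by least squares, and the synthetic data is invented purely for illustration.

```python
# Toy illustration: fit a linear map from a scalar "brain activity"
# feature to a target sound-envelope value, then decode unseen activity.

def fit_slope(xs, ys):
    """Least-squares slope through the origin: argmin_w sum((y - w*x)^2)."""
    num = sum(x * y for x, y in zip(xs, ys))
    den = sum(x * x for x in xs)
    return num / den

# Synthetic "recorded" training pairs: in this made-up world the
# envelope happens to be exactly half the activity level.
activity = [0.2, 0.5, 0.9, 1.3, 1.7]
env_targets = [0.5 * a for a in activity]

w = fit_slope(activity, env_targets)

# "Decode" envelopes for brain activity the model has never seen;
# a vocoder would then turn these envelopes back into audible speech.
reconstructed = [w * a for a in [0.4, 1.0]]
```

The real system learns a far richer, nonlinear mapping, but the shape of the pipeline is the same: paired recordings train a model, and the model’s output drives the vocoder.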


The senior author of this study, Nima Mesgarani, says it will be at least a decade before they are able to actually read people’s minds. However, this is a huge step in the right direction. At the moment we can’t simply implant an array of electrodes into a person’s brain to record the signals needed to create such a program. Still, the fact that the researchers achieved such striking results with a relatively small data set suggests that much more could be achieved with larger ones.

Mesgarani says the next step for his team is to refine the algorithms so that more complex words and sentences can be decoded from the same data. After that, the goal would be to create a program that allows people who have lost the ability to speak to communicate with others, giving them a chance to reconnect with the world around them.
