MIT scientists, led by an Indian-origin student, have developed a computer system that can transcribe words that users say in their heads. The system consists of a wearable device and an associated computing system.
Electrodes in the device pick up neuromuscular signals in the jaw and face that are triggered by internal verbalisations (saying words 'in your head') but are undetectable to the human eye. The signals are fed to a machine-learning system that has been trained to correlate particular signals with particular words.
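The article does not say which machine-learning model the team used. As a rough illustration of what "correlating particular signals with particular words" can mean, here is a minimal sketch using a nearest-centroid classifier, a deliberately simple stand-in. The function names, the two-number feature vectors and the example words are all assumptions made up for illustration, not details of the MIT system.

```python
from statistics import fmean


def train_centroids(examples):
    """Average the feature vectors recorded for each word.

    examples: list of (feature_vector, word) pairs (hypothetical format).
    Returns a dict mapping each word to its mean feature vector.
    """
    by_word = {}
    for features, word in examples:
        by_word.setdefault(word, []).append(features)
    return {
        word: [fmean(column) for column in zip(*vectors)]
        for word, vectors in by_word.items()
    }


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def transcribe(centroids, features):
    """Return the word whose centroid lies closest to the new signal."""
    return min(centroids, key=lambda word: squared_distance(centroids[word], features))


# Made-up training data: each signal is reduced to two invented features.
training = [
    ([0.9, 0.1], "add"),
    ([1.1, 0.2], "add"),
    ([0.1, 1.0], "multiply"),
    ([0.2, 0.8], "multiply"),
]
centroids = train_centroids(training)
predicted = transcribe(centroids, [1.0, 0.0])
print(predicted)
```

A real system would work on multi-channel time series rather than two numbers per word, but the principle is the same: labelled examples go in during training, and a new signal is mapped to the most similar known word.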
The device also includes a pair of bone-conduction headphones, which transmit vibrations through the bones of the face to the inner ear. Since they do not obstruct the ear canal, the headphones enable the system to convey information to the user without interrupting conversation or otherwise interfering with the user's auditory experience.
The device is thus part of a complete silent-computing system that lets the user undetectably pose difficult computational problems and receive answers to them. In one of the researchers' experiments, for instance, subjects used the system to silently report opponents' moves in a chess game and just as silently receive computer-recommended responses.
"The motivation for this was to build an IA device - an intelligence-augmentation device," said Arnav Kapur, a graduate student at MIT, who led the development of the new system.
This would allow one to interact with computing devices without having to physically type into them, researchers said. The idea that internal verbalisations have physical correlates has been around since the 19th century, and it was seriously investigated in the 1950s.
However, subvocalisation as a computer interface is largely unexplored. The researchers' first step was to determine which locations on the face are the sources of the most reliable neuromuscular signals. They conducted experiments in which subjects were asked to subvocalise a series of words four times, with an array of 16 electrodes at different facial locations each time.
The researchers wrote code to analyse the resulting data and found that signals from seven particular electrode locations were consistently able to distinguish subvocalised words.
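The article does not describe how that analysis worked. One plausible way to rank electrode locations, sketched here purely as an assumption, is to score each site by how far apart its average readings for different words are, relative to how noisy the readings for any single word are. The site names, readings and scoring rule below are all invented for illustration.

```python
from statistics import fmean, pvariance


def separability(samples):
    """Score one electrode site; higher means words are easier to tell apart.

    samples: dict mapping each word to a list of readings at this site.
    The score is the variance between per-word means divided by the
    average variance within each word (a Fisher-style ratio).
    """
    word_means = [fmean(values) for values in samples.values()]
    between = pvariance(word_means)
    within = fmean([pvariance(values) for values in samples.values()])
    return between / within if within > 0 else float("inf")


def rank_electrodes(recordings, keep):
    """Return the names of the `keep` highest-scoring electrode sites."""
    scores = {name: separability(samples) for name, samples in recordings.items()}
    return sorted(scores, key=scores.get, reverse=True)[:keep]


# Made-up readings for three hypothetical sites and two words.
recordings = {
    "site_a": {"yes": [1.0, 1.2, 0.8], "no": [3.0, 3.1, 2.9]},
    "site_b": {"yes": [2.0, 1.0, 3.0], "no": [2.1, 3.2, 1.0]},
    "site_c": {"yes": [0.5, 0.6, 0.4], "no": [1.0, 1.3, 0.7]},
}
best = rank_electrodes(recordings, keep=2)
print(best)
```

With 16 candidate sites instead of three, calling the same function with `keep=7` would return seven locations, mirroring the shape of the researchers' result, though not necessarily their method.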
Researchers developed a prototype of a wearable silent-speech interface, which wraps around the back of the neck like a telephone headset and has tentacle-like curved appendages that touch the face at seven locations on either side of the mouth and along the jaws.
They collected data on a few computational tasks with limited vocabularies - about 20 words each. One was arithmetic, in which the user would subvocalise large addition or multiplication problems; another was the chess application, in which the user would report moves using the standard chess numbering system.
Using the prototype wearable interface, the researchers conducted a usability study in which 10 subjects spent about 15 minutes each customising the arithmetic application to their own neurophysiology, then spent another 90 minutes using it to execute computations.
In that study, the system had an average transcription accuracy of about 92 per cent. That figure should improve with more training data, which could be collected during the system's ordinary use.
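For readers unfamiliar with the metric, an average transcription accuracy is typically the share of words transcribed correctly, computed per subject and then averaged. The sketch below assumes that definition, which the article does not spell out. The subject names and word lists are invented and are not data from the study.

```python
from statistics import fmean


def transcription_accuracy(intended, transcribed):
    """Fraction of positions where the transcribed word matches the intended one."""
    if len(intended) != len(transcribed):
        raise ValueError("word lists must have the same length")
    correct = sum(a == b for a, b in zip(intended, transcribed))
    return correct / len(intended)


# Made-up sessions: (words the subject meant, words the system produced).
sessions = {
    "subject_1": (["three", "plus", "four", "times"],
                  ["three", "plus", "four", "times"]),
    "subject_2": (["nine", "times", "two", "plus"],
                  ["nine", "times", "ten", "plus"]),
}
per_subject = {
    name: transcription_accuracy(intended, transcribed)
    for name, (intended, transcribed) in sessions.items()
}
average = fmean(per_subject.values())
print(per_subject, average)
```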
In ongoing work, the researchers are collecting a wealth of data on more elaborate conversations, in the hope of building applications with much more expansive vocabularies.