Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network

Finer-grained decoding at a phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims at developing a novel syllable-level decoding method for continuous silent speech recognition (SSR) using spatio-temporal end-...

Bibliographic Details
Main Authors: Xi Chen (Author), Xu Zhang (Author), Xiang Chen (Author), Xun Chen (Author)
Format: Article
Published: IEEE, 2023-01-01T00:00:00Z.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_8c5342e327894227a2c94f9d6f40f17c
042 |a dc 
100 1 0 |a Xi Chen  |e author 
700 1 0 |a Xu Zhang  |e author 
700 1 0 |a Xiang Chen  |e author 
700 1 0 |a Xun Chen  |e author 
245 0 0 |a Decoding Silent Speech Based on High-Density Surface Electromyogram Using Spatiotemporal Neural Network 
260 |b IEEE,   |c 2023-01-01T00:00:00Z. 
500 |a 1558-0210 
500 |a 10.1109/TNSRE.2023.3266299 
520 |a Finer-grained decoding at a phoneme or syllable level is a key technology for continuous recognition of silent speech based on surface electromyogram (sEMG). This paper aims at developing a novel syllable-level decoding method for continuous silent speech recognition (SSR) using a spatio-temporal end-to-end neural network. In the proposed method, the high-density sEMG (HD-sEMG) was first converted into a series of feature images, and then a spatio-temporal end-to-end neural network was applied to extract discriminative feature representations and to achieve syllable-level decoding. The effectiveness of the proposed method was verified with HD-sEMG data recorded by four 64-channel electrode arrays placed over the facial and laryngeal muscles of fifteen subjects subvocalizing 33 Chinese phrases consisting of 82 syllables. The proposed method outperformed the benchmark methods by achieving the highest phrase classification accuracy (97.17 ± 1.53%, p < 0.05) and a lower character error rate (3.11 ± 1.46%, p < 0.05). This study provides a promising way of decoding sEMG towards SSR, which has great potential applications in instant communication and remote control. 
546 |a EN 
690 |a Silent speech recognition 
690 |a high-density surface electromyography 
690 |a spatiotemporal feature 
690 |a language model 
690 |a time sequence decoding 
690 |a Medical technology 
690 |a R855-855.5 
690 |a Therapeutics. Pharmacology 
690 |a RM1-950 
655 7 |a article  |2 local 
786 0 |n IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol 31, Pp 2069-2078 (2023) 
787 0 |n https://ieeexplore.ieee.org/document/10098814/ 
787 0 |n https://doaj.org/toc/1558-0210 
856 4 1 |u https://doaj.org/article/8c5342e327894227a2c94f9d6f40f17c  |z Connect to this object online.