Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition

Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals is increasingly attractive, and fusion strategies are crucial to its performance yet remain challenging. Currently, neural network-based fusion methods have achieved superior performance. Nevertheless,...


Bibliographic Details
Main Authors: Shengcai Duan (Author), Le Wu (Author), Aiping Liu (Author), Xun Chen (Author)
Format: Book
Published: IEEE, 2023-01-01T00:00:00Z.
Subjects:
Online Access: Connect to this object online.

MARC

LEADER 00000 am a22000003u 4500
001 doaj_a91095580e2e4ada89c9b8f0e64a9e7a
042 |a dc 
100 1 0 |a Shengcai Duan  |e author 
700 1 0 |a Le Wu  |e author 
700 1 0 |a Aiping Liu  |e author 
700 1 0 |a Xun Chen  |e author 
245 0 0 |a Alignment-Enhanced Interactive Fusion Model for Complete and Incomplete Multimodal Hand Gesture Recognition 
260 |b IEEE,   |c 2023-01-01T00:00:00Z. 
500 |a 1558-0210 
500 |a 10.1109/TNSRE.2023.3335101 
520 |a Hand gesture recognition (HGR) based on surface electromyogram (sEMG) and accelerometer (ACC) signals is increasingly attractive, and fusion strategies are crucial to its performance yet remain challenging. Currently, neural network-based fusion methods have achieved superior performance. Nevertheless, these methods typically fuse sEMG and ACC either in the early or the late stage, overlooking the integration of cross-modal hierarchical information within each hidden layer and thus inducing inefficient inter-modal fusion. To this end, we propose a novel Alignment-Enhanced Interactive Fusion (AiFusion) model, which achieves effective fusion via a progressive hierarchical fusion strategy. Notably, AiFusion can flexibly perform both complete and incomplete multimodal HGR. Specifically, AiFusion contains two unimodal branches and a cascaded transformer-based multimodal fusion branch. The fusion branch is first designed to adequately characterize modality-interactive knowledge by adaptively capturing inter-modal similarity and fusing hierarchical features from all branches layer by layer. Then, the modality-interactive knowledge is aligned with that of each unimodal branch using cross-modal supervised contrastive learning and online distillation in the embedding and probability spaces, respectively. These alignments further promote fusion quality and refine modality-specific representations. Finally, the recognition outcome is determined by the modalities available at inference time, which enables handling the incomplete multimodal HGR problem frequently encountered in real-world scenarios. Experimental results on five public datasets demonstrate that AiFusion outperforms most state-of-the-art benchmarks in complete multimodal HGR. Impressively, it also surpasses the unimodal baselines in the challenging incomplete multimodal HGR. The proposed AiFusion provides a promising solution for effective and robust multimodal HGR-based interfaces. 
546 |a EN 
690 |a Multimodal fusion 
690 |a hand gesture recognition 
690 |a myoelectric control 
690 |a accelerometer 
690 |a incomplete multimodal 
690 |a alignment 
690 |a Medical technology 
690 |a R855-855.5 
690 |a Therapeutics. Pharmacology 
690 |a RM1-950 
655 7 |a article  |2 local 
786 0 |n IEEE Transactions on Neural Systems and Rehabilitation Engineering, Vol 31, Pp 4661-4671 (2023) 
787 0 |n https://ieeexplore.ieee.org/document/10323506/ 
787 0 |n https://doaj.org/toc/1558-0210 
856 4 1 |u https://doaj.org/article/a91095580e2e4ada89c9b8f0e64a9e7a  |z Connect to this object online.
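The abstract above describes layer-by-layer fusion of sEMG and ACC features through a transformer-based branch, with unimodal fallbacks for incomplete inputs. The following minimal PyTorch sketch illustrates that general idea only; all module names, dimensions, the pooling scheme, and the missing-modality handling are assumptions for illustration, not the authors' AiFusion implementation (which additionally uses contrastive alignment and online distillation during training).

```python
# Illustrative sketch of hierarchical sEMG/ACC fusion with unimodal fallbacks.
# Hypothetical names and sizes; not the published AiFusion architecture.
import torch
import torch.nn as nn


class UnimodalBranch(nn.Module):
    """Per-modality encoder that exposes one hidden feature per layer."""

    def __init__(self, in_dim: int, hidden_dim: int, num_layers: int):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(in_dim if i == 0 else hidden_dim, hidden_dim)
             for i in range(num_layers)]
        )

    def forward(self, x):
        feats = []
        for layer in self.layers:
            x = torch.relu(layer(x))
            feats.append(x)
        return feats


class HierarchicalFusion(nn.Module):
    """Fuses sEMG and ACC hidden features layer by layer with transformer blocks,
    and predicts from whichever branches are available at inference time."""

    def __init__(self, emg_dim, acc_dim, hidden_dim=64, num_layers=3, num_classes=10):
        super().__init__()
        self.emg_branch = UnimodalBranch(emg_dim, hidden_dim, num_layers)
        self.acc_branch = UnimodalBranch(acc_dim, hidden_dim, num_layers)
        enc_layer = nn.TransformerEncoderLayer(d_model=hidden_dim, nhead=4,
                                               batch_first=True)
        self.fusion_layers = nn.ModuleList(
            [nn.TransformerEncoder(enc_layer, num_layers=1) for _ in range(num_layers)]
        )
        self.emg_head = nn.Linear(hidden_dim, num_classes)
        self.acc_head = nn.Linear(hidden_dim, num_classes)
        self.fusion_head = nn.Linear(hidden_dim, num_classes)

    def forward(self, emg=None, acc=None):
        logits = []
        emg_feats = self.emg_branch(emg) if emg is not None else None
        acc_feats = self.acc_branch(acc) if acc is not None else None

        if emg_feats is not None and acc_feats is not None:
            # Complete multimodal case: fuse the branches' features at every layer.
            fused = torch.zeros_like(emg_feats[0])
            for layer, e, a in zip(self.fusion_layers, emg_feats, acc_feats):
                tokens = torch.stack([fused, e, a], dim=1)   # (batch, 3 tokens, hidden)
                fused = layer(tokens).mean(dim=1)            # pool interactive tokens
            logits.append(self.fusion_head(fused))

        # Incomplete multimodal case: use whichever unimodal branches exist.
        if emg_feats is not None:
            logits.append(self.emg_head(emg_feats[-1]))
        if acc_feats is not None:
            logits.append(self.acc_head(acc_feats[-1]))
        return torch.stack(logits).mean(dim=0)


if __name__ == "__main__":
    model = HierarchicalFusion(emg_dim=128, acc_dim=36)
    emg, acc = torch.randn(8, 128), torch.randn(8, 36)
    print(model(emg, acc).shape)   # complete multimodal: torch.Size([8, 10])
    print(model(emg=emg).shape)    # incomplete (sEMG only): torch.Size([8, 10])
```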