Multimodal information enables brain-computer interface (BCI) systems to adapt to individual differences in neural characteristics and to overcome the limitations of any single modality. As a result, multimodal fusion technology that integrates non-invasive brain imaging techniques such as electroencephalography (EEG) and functional near-infrared spectroscopy (fNIRS) has gained widespread attention. In hybrid BCI research, however, challenges remain in effectively integrating the heterogeneous information of these two modalities and in improving decoding accuracy and generalization across task conditions. The core issue lies in the underutilization of each modality's signal characteristics and the incomplete capture of the latent homogeneity of higher-order hybrid features. We therefore propose a novel EEG-fNIRS multimodal spatial-temporal fusion decoding network (MSTFDN), which combines multi-scale temporal convolution over time-series differences with a spatial multi-head self-attention mechanism. MSTFDN consists of three core components: an EEG branch, an fNIRS branch, and an EEG-fNIRS fusion branch. A multi-dimensional loss function is constructed based on multi-head expression diversity in the independent and hybrid spaces, aiming at high-precision decoding on small-sample datasets under multi-task and multiple personalized experimental protocols. In experiments on four motor imagery (MI) and mental workload (MWL) tasks from two public datasets under three personalized experimental protocols, MSTFDN achieved state-of-the-art performance. These more comprehensive experimental protocols may establish a benchmark for model performance evaluation in future research, and MSTFDN is likewise expected to serve as a new baseline method for EEG-fNIRS hybrid BCI research.
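To make the fusion idea concrete, the sketch below shows a minimal scaled dot-product multi-head self-attention operating over two "modality tokens" (one feature vector per modality), which is one plausible reading of how an EEG branch and an fNIRS branch could be fused by spatial multi-head self-attention. All dimensions and the random projection weights are illustrative assumptions, not the parameters or exact architecture of MSTFDN.

```python
import numpy as np

def multi_head_self_attention(X, n_heads, rng):
    """Minimal multi-head self-attention over modality tokens.

    X: (n_tokens, d) feature matrix, e.g. one row per modality branch.
    Random projection weights stand in for learned parameters (assumption).
    """
    n_tokens, d = X.shape
    assert d % n_heads == 0, "model dim must divide evenly across heads"
    d_head = d // n_heads
    head_outputs = []
    for _ in range(n_heads):
        # Per-head query/key/value projections (randomly initialized here).
        Wq, Wk, Wv = (rng.standard_normal((d, d_head)) / np.sqrt(d)
                      for _ in range(3))
        Q, K, V = X @ Wq, X @ Wk, X @ Wv
        # Scaled dot-product attention with a numerically stable softmax.
        scores = Q @ K.T / np.sqrt(d_head)
        weights = np.exp(scores - scores.max(axis=1, keepdims=True))
        weights /= weights.sum(axis=1, keepdims=True)
        head_outputs.append(weights @ V)
    # Concatenate heads back to the model dimension: (n_tokens, d).
    return np.concatenate(head_outputs, axis=1)

# Hypothetical usage: fuse one EEG token and one fNIRS token of dim 8.
rng = np.random.default_rng(0)
eeg_feat = rng.standard_normal(8)    # placeholder EEG branch output
fnirs_feat = rng.standard_normal(8)  # placeholder fNIRS branch output
tokens = np.stack([eeg_feat, fnirs_feat])   # (2, 8)
fused = multi_head_self_attention(tokens, n_heads=2, rng=rng)
print(fused.shape)  # (2, 8): each token now attends to both modalities
```

Each fused token is a convex combination of the value projections of both modalities, so cross-modal information flows through the attention weights; in the actual network these projections would be learned end-to-end together with the branch encoders and the multi-dimensional loss.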
