A novel adaptive lightweight multimodal efficient feature inference network ALME-FIN for EEG emotion recognition

IF 3.1 | Zone 3 (Engineering & Technology) | JCR Q2, Neurosciences | Cognitive Neurodynamics | Pub Date: 2025-12-01 | Epub Date: 2025-01-13 | DOI: 10.1007/s11571-024-10186-x

Xiaoliang Guo, Shuo Zhai

Cognitive Neurodynamics, volume 19, issue 1, page 24. Full text (PMC): https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11729629/pdf/

Citations: 0

Abstract

Multimodal learning is a common way to improve the accuracy of emotion recognition models. However, insufficient learning of modality-specific features during multimodal inference and the scarcity of sample data remain obstacles. We therefore propose a novel adaptive lightweight multimodal efficient feature inference network (ALME-FIN). For multimodal feature learning, we introduce a time-domain lightweight adaptive network (TDLAN) and a two-dimensional dynamic focusing network (TDDFN). TDLAN makes denoising an integral part of network training: a trainable filtering threshold is optimized continuously, so that denoising adapts to each sample. It also incorporates an interactive convolutional sampling module that enables lightweight multi-scale feature extraction in the time domain. TDDFN extracts core image features while filtering out redundant ones. During training, the multi-network dynamic gradient adjustment framework (MDGAF) monitors how effectively each modality's features are being learned and adjusts the networks' training gradients accordingly, allocating additional optimization time to under-optimized modalities and thereby maximizing the use of multimodal feature information. In addition, a multi-class relationship interaction module placed before the classifier helps the model capture the relationships among samples of different categories, which enables relatively accurate emotion recognition even when few samples are available. Compared with existing multimodal learning techniques, ALME-FIN offers a more efficient multimodal feature inference method and achieves satisfactory emotion recognition performance even with a limited number of samples.
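
Two of the mechanisms named in the abstract can be illustrated with a rough sketch. The abstract gives no formulas, so everything below is an assumption for illustration and not the authors' implementation: `soft_threshold` uses standard soft-thresholding as a stand-in for threshold-based denoising, and `gradient_scales` uses a hypothetical tanh-based rule to damp the gradients of the better-optimized modality. The function names, the `alpha` parameter, and the modality labels `"eeg"` and `"image"` are all made up for this example. In a trainable setting the threshold `tau` would be a network parameter updated by backpropagation; here it is a fixed number.

```python
import math


def soft_threshold(signal, tau):
    """Shrink every sample toward zero by tau; samples with |x| <= tau become 0.

    Assumption: standard soft-thresholding, used here only as a stand-in for
    the trainable filtering threshold mentioned in the abstract.
    """
    denoised = []
    for x in signal:
        if x > tau:
            denoised.append(x - tau)
        elif x < -tau:
            denoised.append(x + tau)
        else:
            denoised.append(0.0)
    return denoised


def gradient_scales(modality_scores, alpha=1.0):
    """Return one gradient multiplier per modality.

    Assumption (hypothetical rule, not taken from the paper): the modality
    with the lowest score keeps a multiplier of 1.0, and modalities that are
    further ahead get a smaller multiplier, so the weaker one effectively
    receives more optimization.
    """
    weakest = min(modality_scores.values())
    return {
        name: 1.0 - math.tanh(alpha * (score - weakest))
        for name, score in modality_scores.items()
    }


if __name__ == "__main__":
    # Small amplitudes are zeroed, large ones are shrunk by tau.
    print(soft_threshold([0.05, -0.8, 0.02, 1.2, -0.03], tau=0.1))
    # Scores could be, for example, per-modality validation accuracy.
    print(gradient_scales({"eeg": 0.6, "image": 0.9}))
```

In a training loop, each multiplier would be applied to the gradients of the corresponding modality network before the optimizer step, and the scores would be recomputed periodically so that the scaling follows the current state of training.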

Source journal

Cognitive Neurodynamics (Medicine: Neurosciences)

CiteScore: 6.90

Self-citation rate: 18.90%

Publication volume: 140

Review time: 12 months

About the journal: Cognitive Neurodynamics provides a unique forum of communication and cooperation for scientists and engineers working in the field of cognitive neurodynamics, intelligent science and applications, bridging the gap between theory and application, without any preference for pure theoretical, experimental or computational models. The emphasis is on publishing original models of cognitive neurodynamics, novel computational theories and experimental results. In particular, intelligent science inspired by cognitive neuroscience and neurodynamics is also very welcome. The scope of Cognitive Neurodynamics covers cognitive neuroscience, neural computation based on dynamics, computer science, intelligent science as well as their interdisciplinary applications in the natural and engineering sciences. Papers that are appropriate for non-specialist readers are encouraged.

1. There is no page limit for manuscripts submitted to Cognitive Neurodynamics. Research papers should clearly represent an important advance of especially broad interest to researchers and technologists in neuroscience, biophysics, BCI, neural computer and intelligent robotics.

2. Cognitive Neurodynamics also welcomes brief communications: short papers reporting results that are of genuinely broad interest but that, for one reason or another, do not make a sufficiently complete story to justify a full article publication. Brief Communications should consist of approximately four manuscript pages.

3. Cognitive Neurodynamics publishes review articles in which a specific field is reviewed through an exhaustive literature survey. There are no restrictions on the number of pages. Review articles are usually invited, but submitted reviews will also be considered.