{"title":"一种新的自适应轻量级多模态高效特征推理网络ALME-FIN。","authors":"Xiaoliang Guo, Shuo Zhai","doi":"10.1007/s11571-024-10186-x","DOIUrl":null,"url":null,"abstract":"<p><p>Enhancing the accuracy of emotion recognition models through multimodal learning is a common approach. However, challenges such as insufficient modal feature learning in multimodal inference and scarcity of sample data continue to pose obstacles that need to be overcome. Therefore, we propose a novel adaptive lightweight multimodal efficient feature inference network (ALME-FIN). We introduce a time-domain lightweight adaptive network (TDLAN) and a two-dimensional dynamic focusing network (TDDFN) for multimodal feature learning. The TDLAN incorporates the denoising process as an integral part of network training, achieving adaptive denoising for each sample through the continuous optimization of the trainable filtering threshold. Simultaneously, it incorporates an interactive convolutional sampling module, enabling lightweight multi-scale feature extraction in the time domain. TDDFN effectively extracts core image features while filtering out redundancies. During the training process, the Multi-network dynamic gradient adjustment framework (MDGAF) dynamically monitors the feature learning efficacy across different modalities. It timely adjusts the training gradients of networks to allocate additional optimization time for under-optimized modalities, thereby maximizing the utilization of multimodal feature information. Moreover, the introduction of a Multi-class relationship interaction module prior to the classifier aids the model in clearly understanding the relationships among different category samples. This approach enables the model to achieve relatively accurate emotion recognition even in scenarios of limited sample availability. 
Compared to existing multimodal learning techniques, ALME-FIN exhibits a more efficient multimodal feature inference method that can achieve satisfactory emotional recognition performance even with a limited number of samples.</p>","PeriodicalId":10500,"journal":{"name":"Cognitive Neurodynamics","volume":"19 1","pages":"24"},"PeriodicalIF":3.1000,"publicationDate":"2025-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11729629/pdf/","citationCount":"0","resultStr":"{\"title\":\"A novel adaptive lightweight multimodal efficient feature inference network ALME-FIN for EEG emotion recognition.\",\"authors\":\"Xiaoliang Guo, Shuo Zhai\",\"doi\":\"10.1007/s11571-024-10186-x\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"<p><p>Enhancing the accuracy of emotion recognition models through multimodal learning is a common approach. However, challenges such as insufficient modal feature learning in multimodal inference and scarcity of sample data continue to pose obstacles that need to be overcome. Therefore, we propose a novel adaptive lightweight multimodal efficient feature inference network (ALME-FIN). We introduce a time-domain lightweight adaptive network (TDLAN) and a two-dimensional dynamic focusing network (TDDFN) for multimodal feature learning. The TDLAN incorporates the denoising process as an integral part of network training, achieving adaptive denoising for each sample through the continuous optimization of the trainable filtering threshold. Simultaneously, it incorporates an interactive convolutional sampling module, enabling lightweight multi-scale feature extraction in the time domain. TDDFN effectively extracts core image features while filtering out redundancies. During the training process, the Multi-network dynamic gradient adjustment framework (MDGAF) dynamically monitors the feature learning efficacy across different modalities. 
It timely adjusts the training gradients of networks to allocate additional optimization time for under-optimized modalities, thereby maximizing the utilization of multimodal feature information. Moreover, the introduction of a Multi-class relationship interaction module prior to the classifier aids the model in clearly understanding the relationships among different category samples. This approach enables the model to achieve relatively accurate emotion recognition even in scenarios of limited sample availability. Compared to existing multimodal learning techniques, ALME-FIN exhibits a more efficient multimodal feature inference method that can achieve satisfactory emotional recognition performance even with a limited number of samples.</p>\",\"PeriodicalId\":10500,\"journal\":{\"name\":\"Cognitive Neurodynamics\",\"volume\":\"19 1\",\"pages\":\"24\"},\"PeriodicalIF\":3.1000,\"publicationDate\":\"2025-12-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11729629/pdf/\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Cognitive Neurodynamics\",\"FirstCategoryId\":\"5\",\"ListUrlMain\":\"https://doi.org/10.1007/s11571-024-10186-x\",\"RegionNum\":3,\"RegionCategory\":\"工程技术\",\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"2025/1/13 0:00:00\",\"PubModel\":\"Epub\",\"JCR\":\"Q2\",\"JCRName\":\"NEUROSCIENCES\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Cognitive Neurodynamics","FirstCategoryId":"5","ListUrlMain":"https://doi.org/10.1007/s11571-024-10186-x","RegionNum":3,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"2025/1/13 0:00:00","PubModel":"Epub","JCR":"Q2","JCRName":"NEUROSCIENCES","Score":null,"Total":0}
A novel adaptive lightweight multimodal efficient feature inference network ALME-FIN for EEG emotion recognition.
Enhancing the accuracy of emotion recognition models through multimodal learning is a common approach. However, challenges such as insufficient modal feature learning in multimodal inference and scarcity of sample data continue to pose obstacles that need to be overcome. Therefore, we propose a novel adaptive lightweight multimodal efficient feature inference network (ALME-FIN). We introduce a time-domain lightweight adaptive network (TDLAN) and a two-dimensional dynamic focusing network (TDDFN) for multimodal feature learning. The TDLAN incorporates the denoising process as an integral part of network training, achieving adaptive denoising for each sample through the continuous optimization of the trainable filtering threshold. Simultaneously, it incorporates an interactive convolutional sampling module, enabling lightweight multi-scale feature extraction in the time domain. TDDFN effectively extracts core image features while filtering out redundancies. During the training process, the Multi-network dynamic gradient adjustment framework (MDGAF) dynamically monitors the feature learning efficacy across different modalities. It timely adjusts the training gradients of networks to allocate additional optimization time for under-optimized modalities, thereby maximizing the utilization of multimodal feature information. Moreover, the introduction of a Multi-class relationship interaction module prior to the classifier aids the model in clearly understanding the relationships among different category samples. This approach enables the model to achieve relatively accurate emotion recognition even in scenarios of limited sample availability. Compared to existing multimodal learning techniques, ALME-FIN exhibits a more efficient multimodal feature inference method that can achieve satisfactory emotional recognition performance even with a limited number of samples.
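The abstract names two concrete mechanisms: a trainable filtering threshold for per-sample denoising (TDLAN) and gradient adjustment that favors under-optimized modalities (MDGAF). The paper's exact formulations are not reproduced here; the sketch below substitutes standard, assumed analogues — soft-thresholding for the adaptive denoising, and loss-proportional gradient weighting for MDGAF — purely to illustrate the ideas.

```python
import numpy as np

# Hedged sketch only: soft-thresholding and loss-proportional weighting
# are standard stand-ins, not the paper's actual formulations.

def soft_threshold(x, tau):
    """Shrink coefficients toward zero by tau, zeroing small (noisy) ones.
    TDLAN is described as learning such a filtering threshold per sample
    during training; here tau is fixed for illustration."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def gradient_weights(modality_losses):
    """Assumed analogue of MDGAF: weight each modality's gradient in
    proportion to its current loss, so a modality that is still learning
    poorly receives a larger share of the optimization effort."""
    losses = np.asarray(modality_losses, dtype=float)
    return losses / losses.sum()

# Denoising illustration: a sparse signal buried in small Gaussian noise.
rng = np.random.default_rng(0)
signal = np.zeros(8)
signal[2] = 3.0                            # the one informative coefficient
noisy = signal + 0.1 * rng.standard_normal(8)
denoised = soft_threshold(noisy, tau=0.3)  # small noise entries shrink toward 0

# Gradient-weighting illustration: a lagging branch (loss 2.0) receives
# half of the total gradient weight across three modalities.
weights = gradient_weights([2.0, 1.0, 1.0])
```

In the paper the threshold is optimized jointly with the network rather than fixed, and MDGAF's adjustment rule operates on training gradients directly; this sketch only conveys the direction of both mechanisms.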
Journal introduction:
Cognitive Neurodynamics provides a unique forum for communication and cooperation among scientists and engineers working in cognitive neurodynamics, intelligent science, and their applications, bridging the gap between theory and application without any preference for purely theoretical, experimental, or computational models.
The emphasis is on publishing original models of cognitive neurodynamics, novel computational theories, and experimental results. Work on intelligent science inspired by cognitive neuroscience and neurodynamics is especially welcome.
The scope of Cognitive Neurodynamics covers cognitive neuroscience, neural computation based on dynamics, computer science, intelligent science as well as their interdisciplinary applications in the natural and engineering sciences. Papers that are appropriate for non-specialist readers are encouraged.
1. There is no page limit for manuscripts submitted to Cognitive Neurodynamics. Research papers should clearly represent an important advance of especially broad interest to researchers and technologists in neuroscience, biophysics, BCI, neural computing, and intelligent robotics.
2. Cognitive Neurodynamics also welcomes brief communications: short papers reporting results that are of genuinely broad interest but that, for one reason or another, do not make a sufficiently complete story to justify a full article. Brief communications should consist of approximately four manuscript pages.
3. Cognitive Neurodynamics publishes review articles in which a specific field is reviewed through an exhaustive literature survey. There are no restrictions on the number of pages. Review articles are usually invited, but submitted reviews will also be considered.