Pub Date: 2025-03-11 | DOI: 10.1016/j.inffus.2025.103073
Jakub Šmíd, Pavel Král
Aspect-based sentiment analysis (ABSA) is a fine-grained sentiment analysis task that focuses on understanding opinions at the aspect level, including sentiment towards specific aspect terms, categories, and opinions. While ABSA research has seen significant progress, much of the focus has been on monolingual settings. Cross-lingual ABSA, which aims to transfer knowledge from resource-rich languages (such as English) to low-resource languages, remains an under-explored area, with no systematic review of the field. This paper aims to fill that gap by providing a comprehensive survey of cross-lingual ABSA. We summarize key ABSA tasks, including aspect term extraction, aspect sentiment classification, and compound tasks involving multiple sentiment elements. Additionally, we review the datasets, modelling paradigms, and cross-lingual transfer methods used to solve these tasks. We also examine how existing work in monolingual and multilingual ABSA, as well as ABSA with large language models (LLMs), contributes to the development of cross-lingual ABSA. Finally, we highlight the main challenges and suggest directions for future research to advance cross-lingual ABSA systems.
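For readers unfamiliar with the task structure surveyed here, a minimal sketch of how ABSA sentiment elements and a source-to-target cross-lingual setup are commonly represented is given below; the dataclass fields and the example sentences are illustrative assumptions, not the survey's notation.

```python
from dataclasses import dataclass
from typing import List

# Illustrative only: one common way to represent ABSA sentiment elements
# (aspect term, aspect category, opinion term, sentiment polarity), as used
# in compound tasks such as quadruple extraction.
@dataclass
class AspectAnnotation:
    aspect_term: str      # e.g. "battery life"
    category: str         # e.g. "LAPTOP#BATTERY"
    opinion_term: str     # e.g. "excellent"
    polarity: str         # "positive" | "negative" | "neutral"

@dataclass
class ABSAExample:
    text: str
    language: str
    annotations: List[AspectAnnotation]

# Cross-lingual transfer: train on labelled source-language data (e.g. English),
# evaluate zero-shot on an unlabelled target language.
source = ABSAExample(
    text="The battery life is excellent.",
    language="en",
    annotations=[AspectAnnotation("battery life", "LAPTOP#BATTERY",
                                  "excellent", "positive")],
)
# Target-language example (Czech: "The battery life is great."), unlabelled.
target = ABSAExample(text="Výdrž baterie je skvělá.", language="cs", annotations=[])
print(source, target, sep="\n")
```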
Cross-lingual aspect-based sentiment analysis: A survey on tasks, approaches, and challenges. Information Fusion, Volume 120, Article 103073.
Pub Date: 2025-03-09 | DOI: 10.1016/j.inffus.2025.103105
Zekang Bian, Linbiao Yu, Jia Qu, Zhaohong Deng, Shitong Wang
Although ensemble clustering methods based on the co-association (CA) matrix have been widely and successfully employed, they still have a notable drawback: the clustering performance and the stability of the ensemble clustering results depend heavily on the CA matrix. To enhance clustering performance while maintaining the stability of ensemble clustering results, this study proposes an ensemble clustering method that learns the CA matrix with fuzzy neighbors (EC–CA–FN). First, EC–CA–FN constructs an accurate CA matrix by using both intra-cluster and inter-cluster relationships of pairwise samples from all base clustering results. Second, to improve the stability of ensemble clustering results, EC–CA–FN introduces a fuzzy index and rank constraints on the constructed CA matrix, yielding a new ensemble clustering framework that learns the optimal fuzzy CA (FCA) matrix by adaptively assigning fuzzy neighbors to samples and thus obtaining the optimal clustering structure. Third, an alternating optimization method and a weighting mechanism are adopted to obtain the optimal FCA matrix and to adaptively weight all base clustering results. Experimental results on all adopted datasets demonstrate the effectiveness of EC–CA–FN in terms of both clustering performance and the stability of ensemble clustering results.
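As a rough illustration of the starting point EC–CA–FN builds on, the sketch below computes a standard co-association matrix from several base clusterings and then applies a simple fuzzy-neighbor row normalization; the fuzzy_index value and the normalization itself are assumptions for illustration, not the paper's learned FCA matrix.

```python
import numpy as np

def co_association_matrix(base_labels: np.ndarray) -> np.ndarray:
    """Standard CA matrix: fraction of base clusterings in which two samples
    share a cluster. base_labels has shape (n_clusterings, n_samples)."""
    m, n = base_labels.shape
    ca = np.zeros((n, n))
    for labels in base_labels:
        ca += (labels[:, None] == labels[None, :]).astype(float)
    return ca / m

# Toy ensemble: 3 base clusterings of 6 samples.
base = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
])
ca = co_association_matrix(base)

# Simple fuzzy-neighbor weighting (an illustrative stand-in for the learned
# FCA matrix): each row becomes a probability distribution over neighbors.
fuzzy_index = 2.0                      # assumed fuzzifier, not the paper's value
weights = ca ** fuzzy_index
np.fill_diagonal(weights, 0.0)
fca = weights / weights.sum(axis=1, keepdims=True)
print(np.round(fca, 2))
```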
An ensemble clustering method via learning the CA matrix with fuzzy neighbors. Information Fusion, Volume 120, Article 103105.
Pub Date: 2025-03-09 | DOI: 10.1016/j.inffus.2025.103069
Bo Dai, Yijun Wang, Xinyu Mou, Xiaorong Gao
Reliable Brain-Computer Interface (BCI) systems are essential for practical applications. Current BCIs often suffer from performance degradation due to environmental noise and external interference. These environmental factors significantly compromise the quality of EEG data acquisition. This study presents a novel Mixture-of-Graphs-driven Information Fusion (MGIF) framework to enhance BCI system robustness through the integration of multi-graph knowledge for stable EEG representations. Initially, the framework constructs complementary graph architectures: electrode-based structures for capturing spatial relationships and signal-based structures for modeling inter-channel dependencies. Subsequently, the framework employs filter bank-driven multi-graph constructions to encode spectral information and incorporates a self-play-driven fusion strategy to optimize graph embedding combinations. Finally, an adaptive gating mechanism is implemented to monitor electrode states and enable selective information fusion, thereby minimizing the impact of unreliable electrodes and environmental disturbances. Extensive evaluations through offline datasets and online experiments validate the framework’s effectiveness. Results demonstrate that MGIF achieves significant improvements in BCI reliability across challenging real-world environments.
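A minimal numpy sketch of the two complementary graph types the abstract describes (an electrode-distance graph and a signal-correlation graph), together with a crude reliability gate, is shown below; the Gaussian kernel, the variance-based gate, and the equal-weight fusion are illustrative assumptions, not the MGIF architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 8, 512
eeg = rng.standard_normal((n_channels, n_samples))   # toy EEG segment
coords = rng.uniform(size=(n_channels, 3))           # toy electrode positions

# Electrode-based graph: spatial proximity of electrodes (Gaussian kernel on distance).
d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
a_spatial = np.exp(-d**2 / (2 * d.std()**2))

# Signal-based graph: absolute Pearson correlation between channels.
a_signal = np.abs(np.corrcoef(eeg))

# Toy reliability gate: down-weight channels whose variance is an outlier
# (a crude stand-in for the paper's adaptive electrode-state gating).
var = eeg.var(axis=1)
gate = np.exp(-np.abs(var - np.median(var)) / (var.std() + 1e-8))

# Fuse the two graphs, masked by the gate on both endpoints of each edge.
a_fused = 0.5 * (a_spatial + a_signal) * np.outer(gate, gate)
print(np.round(a_fused, 2))
```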
A reliability-enhanced Brain–Computer Interface via Mixture-of-Graphs-driven Information Fusion. Information Fusion, Volume 120, Article 103069.
Pub Date: 2025-03-07 | DOI: 10.1016/j.inffus.2025.103046
Pengfei Zhang, Xiang Fang, Zhikun Zhang, Xianjin Fang, Yining Liu, Ji Zhang
With the rapid proliferation of data collection and storage technologies, the growing demand for horizontal multi-party data publishing has created an urgent need for robust privacy-preserving mechanisms that can effectively handle sensitive distributed data across multiple organizations. While existing approaches attempt to address this challenge, they often fail to balance privacy protection with data utility, struggle to achieve effective information fusion across heterogeneous data distributions, and incur significant computational overhead. In this paper, we introduce the NATION approach, an innovative GAN-based framework that advances multi-party data publishing through sophisticated information fusion techniques while maintaining stringent differential privacy guarantees and computational efficiency. In NATION, we modify the traditional GAN architecture through a distributed design in which multiple discriminators are strategically allocated across parties while the generator is centralized at a semi-trusted server, enabling seamless fusion of distributed knowledge with minimal computational cost. Building on this foundation, we introduce two key technical innovations: an iterative-aware adaptive noise (IAN) method that dynamically optimizes noise injection based on training convergence, and a global-aware discriminator regularization (GDR) method that leverages Bregman divergence to enhance inter-discriminator information exchange while ensuring model stability. Through comprehensive theoretical analysis and extensive experimental evaluation on real-world datasets, we demonstrate that NATION consistently outperforms state-of-the-art approaches by up to 7% in accuracy while providing provable privacy guarantees, marking a significant advancement in secure GAN-based information fusion for privacy-sensitive applications.
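To make the differential-privacy machinery concrete, the sketch below shows a generic clip-and-add-Gaussian-noise gradient step whose noise multiplier shrinks as a loss-based convergence proxy flattens; the schedule in adaptive_noise_scale is a hedged stand-in for the idea behind IAN, not the paper's method.

```python
import numpy as np

def adaptive_noise_scale(base_sigma: float, loss_history: list) -> float:
    """Crude stand-in for an iteration-aware noise schedule: shrink the Gaussian
    noise multiplier as the recent loss change (a convergence proxy) gets small."""
    if len(loss_history) < 2:
        return base_sigma
    delta = abs(loss_history[-1] - loss_history[-2])
    return base_sigma * min(1.0, delta / (abs(loss_history[-1]) + 1e-8) + 0.1)

def dp_gradient(grad: np.ndarray, clip_norm: float, sigma: float,
                rng: np.random.Generator) -> np.ndarray:
    """Gaussian-mechanism step used in DP training: clip, then add noise."""
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / (norm + 1e-12))
    return clipped + rng.normal(0.0, sigma * clip_norm, size=grad.shape)

rng = np.random.default_rng(42)
losses = [1.00, 0.72, 0.65, 0.64]          # toy discriminator loss trace
grad = rng.standard_normal(10)             # toy discriminator gradient
sigma = adaptive_noise_scale(base_sigma=1.0, loss_history=losses)
print(sigma, dp_gradient(grad, clip_norm=1.0, sigma=sigma, rng=rng))
```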
Horizontal multi-party data publishing via discriminator regularization and adaptive noise under differential privacy. Information Fusion, Volume 120, Article 103046.
Pub Date: 2025-03-07 | DOI: 10.1016/j.inffus.2025.103061
Yushan Zhao, Kuan-Ching Li, Shunxiang Zhang, Tongzhou Ye
Technical Keyphrase Extraction (TKE) is crucial for summarizing the core content of scientific and technical texts. Existing keyphrase extraction models typically focus on calculating phrase and sentence correlations, which can limit their ability to understand long contexts and uncover hierarchical semantic information, leading to biased results. To address these limitations, a hyperbolic graph technical attention network is designed and applied to a novel unsupervised Technical KeyPhrase Extraction (TKPE) model. The model fuses complex hierarchical semantic representations with long-context information by constructing global embeddings of the technical text in hyperbolic space, yielding a high-fidelity representation with minimal dimensions. A technical attention score is calculated based on technical terminology degree and hierarchical relevance to guide the extraction process. Additionally, the network utilizes geodesic variations between embedded nodes to reveal meaningful hierarchical clustering relationships, enabling a semantic structural understanding of technical text data and efficient extraction of the most relevant technical keyphrases. This work exploits the long-context understanding capability of large language models to generate candidate phrases, guided by an effective prompt template that reduces information loss when importing candidate phrases into the hyperbolic graph attention network. Experiments on benchmark technical datasets demonstrate that the proposed model outperforms recent state-of-the-art keyphrase extraction baselines.
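The sketch below illustrates the geometric ingredient: the geodesic distance in the Poincaré ball and a toy score that combines terminology degree with hierarchical relevance to a document embedding; the embeddings, terminology scores, and the scoring formula are assumptions for illustration only, not the paper's attention mechanism.

```python
import numpy as np

def poincare_distance(u: np.ndarray, v: np.ndarray) -> float:
    """Geodesic distance between two points inside the Poincare unit ball."""
    uu, vv = np.dot(u, u), np.dot(v, v)
    duv = np.dot(u - v, u - v)
    return float(np.arccosh(1.0 + 2.0 * duv / ((1.0 - uu) * (1.0 - vv))))

# Toy embeddings in the 2-D Poincare ball (all norms < 1).
doc = np.array([0.05, 0.02])                       # document-level embedding
candidates = {
    "graph attention network": np.array([0.30, 0.10]),
    "hyperbolic space":        np.array([0.25, -0.20]),
    "the":                     np.array([0.80, 0.10]),
}
term_degree = {"graph attention network": 0.9,      # assumed terminology scores
               "hyperbolic space": 0.8, "the": 0.05}

# Toy "technical attention": terminology degree divided by (1 + hierarchical
# distance to the document); only an illustration of combining the two signals.
scores = {p: term_degree[p] / (1.0 + poincare_distance(e, doc))
          for p, e in candidates.items()}
print(sorted(scores.items(), key=lambda kv: -kv[1]))
```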
Hyperbolic graph attention network fusing long-context for technical keyphrase extraction. Information Fusion, Volume 120, Article 103061.
Pub Date: 2025-03-07 | DOI: 10.1016/j.inffus.2025.103070
Qiang Lai, Peng Chen
In traffic forecasting, a key challenge lies in capturing both long-term temporal dependencies and inter-node relationships. While recent work has addressed long-term dependencies using Transformer-based models, the handling of inter-node relationships remains limited. Most studies rely on predefined or adaptive adjacency matrices, which fail to capture rich, dynamic relationships such as traffic similarity and strength; these features are embedded in time-varying data and are challenging to model effectively. To comprehensively understand and leverage these inter-node relationships, we propose a unified framework comprising a Pretrained Graph Transformer (PreGT) and a Mix Graph Transformer (MixGT). PreGT, through self-supervised masking and reconstruction of nodes, learns latent representations of inter-node relationships from time-varying node features. MixGT integrates relationship-matrix construction and utilization modules, effectively leveraging the latent representations from PreGT through graph convolution and attention mechanisms to enhance the model's ability to capture dynamic inter-node relationship features. Experimental validation on real traffic flow datasets demonstrates the effectiveness of our framework in predicting traffic flow by accurately capturing inter-node relationships.
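A minimal sketch of the self-supervised mask-and-reconstruct objective that PreGT relies on is given below; the zero mask token, the mean-of-visible-nodes "decoder", and the 25% mask ratio are placeholders for what would be graph-transformer layers and learned parameters in the actual model.

```python
import numpy as np

rng = np.random.default_rng(1)
n_nodes, n_steps, n_feats = 20, 12, 2
x = rng.standard_normal((n_nodes, n_steps, n_feats))    # toy traffic readings

# Self-supervised masking: hide a random subset of nodes for the whole window.
mask_ratio = 0.25
masked_nodes = rng.choice(n_nodes, size=int(mask_ratio * n_nodes), replace=False)
x_masked = x.copy()
x_masked[masked_nodes] = 0.0                            # mask token (zeros here)

# Stand-in "encoder/decoder": reconstruct a masked node from the mean of the
# visible ones (a real model would use graph-transformer layers instead).
reconstruction = x_masked.copy()
visible = np.delete(np.arange(n_nodes), masked_nodes)
reconstruction[masked_nodes] = x_masked[visible].mean(axis=0)

# The reconstruction objective is evaluated only on the masked positions.
loss = np.mean((reconstruction[masked_nodes] - x[masked_nodes]) ** 2)
print(f"masked-node reconstruction MSE: {loss:.3f}")
```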
Unveiling node relationships for traffic forecasting: A self-supervised approach with MixGT. Information Fusion, Volume 120, Article 103070.
Pub Date: 2025-03-07 | DOI: 10.1016/j.inffus.2025.103072
Qingwei Jia, Tingquan Deng, Ming Yang, Yan Wang, Changzhong Wang
Label correlation learning is a challenging issue in multi-label classification and has been extensively studied recently. Typically, second-order label correlation is obtained by fusing information from pairwise labels, while high-order correlation arises from integrating global information of the entire label matrix with the help of regularization constraints. However, few studies focus on collaboratively learning label correlations through local and global label fusion. Unfortunately, when labels are missing, neither second-order nor high-order label correlations can be accurately measured and characterized. To address these two issues, a novel approach for incomplete multi-label classification called class label fusion guided correlation learning (CLFCL) is proposed. Pointwise fuzzy mutual information is introduced for the prior fusion of paired labels; specifically, the second-order label correlation is obtained by relaxing the pointwise mutual information. Simultaneously, an adaptive low-rank regularization technique is developed to fuse globally relevant labels so as to extract high-order correlations. By integrating second-order and high-order label correlations, the label distribution of instances is learned. To recover missing labels, a multi-label classifier is trained by regressing features to the label distribution space rather than the original logical label space. An efficient algorithm is designed to solve the resulting nonconvex optimization problem. Extensive experimental results validate the superior performance of the proposed model against state-of-the-art methods for multi-label classification with missing labels.
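As a concrete reference point for the second-order statistic, the sketch below estimates plain pointwise mutual information between label pairs from an incomplete 0/1 label matrix, using only instances where both labels are observed; the paper's pointwise fuzzy mutual information and its relaxation differ from this simplified version.

```python
import numpy as np

# Incomplete label matrix: 1 = relevant, 0 = irrelevant, nan = missing.
Y = np.array([
    [1, 0, 1, np.nan],
    [1, 1, np.nan, 0],
    [0, 1, 1, 1],
    [1, np.nan, 1, 1],
    [np.nan, 0, 0, 1],
], dtype=float)

def pairwise_pmi(Y: np.ndarray, eps: float = 1e-6) -> np.ndarray:
    """Plain pointwise mutual information between label pairs, estimated from
    instances where both labels are observed (the fuzzy variant differs)."""
    n_labels = Y.shape[1]
    pmi = np.zeros((n_labels, n_labels))
    for i in range(n_labels):
        for j in range(n_labels):
            obs = ~np.isnan(Y[:, i]) & ~np.isnan(Y[:, j])
            yi, yj = Y[obs, i], Y[obs, j]
            p_i, p_j = yi.mean(), yj.mean()     # marginal relevance frequencies
            p_ij = (yi * yj).mean()             # joint relevance frequency
            pmi[i, j] = np.log((p_ij + eps) / (p_i * p_j + eps))
    return pmi

print(np.round(pairwise_pmi(Y), 2))
```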
Class label fusion guided correlation learning for incomplete multi-label classification. Information Fusion, Volume 120, Article 103072.
Pub Date: 2025-03-07 | DOI: 10.1016/j.inffus.2025.103084
Jing Wang, Mohammad Tabrez Quasim, Bo Yi
The widespread availability of medical Internet of Things devices and smart healthcare monitoring systems has led to the generation of heterogeneous sensor data across decentralized healthcare institutions at an unprecedented scale. Although this data has significant potential to enhance patient care, handling multi-modal sensor data while maintaining patient privacy and complying with the necessary regulations is very difficult with traditional centralized processing. We propose PHMS-Fed, a novel privacy-preserving heterogeneous multi-modal sensor fusion framework based on federated learning for smart healthcare applications. Our framework enables healthcare institutions to train shared diagnostic models collaboratively without exchanging raw sensor data while effectively capturing complex interactions between different sensor modalities. To maintain privacy, PHMS-Fed automatically matches different combinations of sensor modalities across institutions through adaptive tensor decomposition and secure parameter aggregation. Extensive experiments on real-world healthcare datasets demonstrate the effectiveness of the proposed framework: PHMS-Fed surpasses selected state-of-the-art methods by 25.6% in privacy preservation and by 23.4% in cross-institutional monitoring accuracy. The results show that the framework handles multiple sensor modalities efficiently while delivering strong results in physiological monitoring (accuracy score: 0.9386 out of 1.0), privacy preservation (protection score: 0.9845 out of 1.0), and sensor fusion (fusion accuracy: 0.9591 out of 1.0).
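For intuition about the federated setting, the sketch below runs a generic FedAvg-style loop in which each institution updates a small linear model locally and the server only averages parameters; it omits PHMS-Fed's adaptive tensor decomposition and secure aggregation, and all data, model, and weighting choices are toy assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)

def local_update(w: np.ndarray, X: np.ndarray, y: np.ndarray,
                 lr: float = 0.1, steps: int = 20) -> np.ndarray:
    """One institution's local training (least squares via gradient steps);
    the raw patient data X, y never leaves the site, only the weights do."""
    w = w.copy()
    for _ in range(steps):
        grad = X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

# Three institutions with different local datasets (toy vitals -> risk score).
sites = [(rng.standard_normal((40, 5)), rng.standard_normal(40)) for _ in range(3)]
w_global = np.zeros(5)

for rnd in range(5):                                   # communication rounds
    local_ws = [local_update(w_global, X, y) for X, y in sites]
    sizes = np.array([len(y) for _, y in sites], dtype=float)
    # Server-side aggregation: size-weighted average of parameters (FedAvg-style).
    w_global = np.average(local_ws, axis=0, weights=sizes)

print(np.round(w_global, 3))
```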
Privacy-preserving heterogeneous multi-modal sensor data fusion via federated learning for smart healthcare. Information Fusion, Volume 120, Article 103084.
Pub Date: 2025-03-07 | DOI: 10.1016/j.inffus.2025.103058
Yiyu Wang, Haifang Jian, Jian Zhuang, Huimin Guo, Yan Leng
Multimodal Sentiment Analysis (MSA) integrates information from text, audio, and visuals to understand human emotions, but real-world applications face two challenges: (1) expensive annotation costs reduce the effectiveness of fully supervised methods, and (2) missing modalities severely impact model robustness. While there are studies addressing these issues separately, few focus on solving both within a single framework. In real-world scenarios, these challenges often occur together, necessitating an algorithm that can handle both. To address this, we propose a Semi-Supervised Learning with Missing Modalities (SSLMM) framework. SSLMM combines self-supervised learning, alternating interaction information, semi-supervised learning, and modality reconstruction to tackle label scarcity and missing modalities simultaneously. Firstly, SSLMM captures latent structural information through self-supervised pre-training. It then fine-tunes the model using semi-supervised learning and modality reconstruction to reduce dependence on labeled data and improve robustness to missing modalities. The framework uses a graph-based architecture with an iterative message propagation mechanism to alternately propagate intra-modal and inter-modal messages, capturing emotional associations within and across modalities. Experiments on CMU-MOSI, CMU-MOSEI, and CH-SIMS demonstrate that, when the proportion of labeled samples and the missing-modality rate are both 0.5, SSLMM achieves binary classification (negative vs. positive) accuracies of 80.2%, 81.7%, and 77.1%, respectively, surpassing existing methods.
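A minimal sketch of how the three objectives mentioned here (supervised loss, confidence-gated pseudo-labeling, and reconstruction of a modality dropped during training) can be composed into a single loss is shown below; the toy classifier, the loss weights, and the confidence threshold are assumptions, not SSLMM's design.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 4                                                   # toy feature dimension

def sslmm_style_loss(text, audio, visual, label=None,
                     w_unsup=0.3, w_recon=0.5, conf_thresh=0.7):
    """Illustrative composition of the objectives the abstract mentions:
    supervised loss on labeled samples, confidence-gated pseudo-label loss on
    unlabeled ones, and reconstruction of a modality dropped during training
    (simulated missingness). Weights and threshold are assumed values."""
    # Simulate a missing modality: drop `visual`, keep it only as recon target.
    fused = np.mean([text, audio], axis=0)
    logits = fused[:2]                                  # toy binary "classifier"
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    if label is not None:                               # supervised cross-entropy
        task = -np.log(probs[label] + 1e-8)
    else:                                               # confidence-gated pseudo-label
        conf = probs.max()
        task = w_unsup * (-np.log(conf + 1e-8)) if conf > conf_thresh else 0.0

    visual_hat = fused                                  # toy "decoder" output
    recon = w_recon * np.mean((visual_hat - visual) ** 2)
    return task + recon

t, a, v = (rng.standard_normal(d) for _ in range(3))
print(round(sslmm_style_loss(t, a, v, label=1), 3))     # labeled sample
t, a, v = (rng.standard_normal(d) for _ in range(3))
print(round(sslmm_style_loss(t, a, v, label=None), 3))  # unlabeled sample
```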
SSLMM: Semi-Supervised Learning with Missing Modalities for Multimodal Sentiment Analysis. Information Fusion, Volume 120, Article 103058.
Pub Date: 2025-03-06 | DOI: 10.1016/j.inffus.2025.103049
YaoChong Li, Yi Qu, Ri-Gui Zhou, Jing Zhang
Sentiment classification research is gaining prominence for enhancing user experience, facilitating targeted marketing, and supporting mental health assessments while driving technological innovation. To handle the complexity and diversity of emotional expression, this study proposes quantum multimodal learning for sentiment classification (QMLSC), a novel quantum–classical hybrid model that integrates text and speech data to capture emotional signals more effectively. To address the limitations of the noisy intermediate-scale quantum era, we designed advanced variational quantum circuit (VQC) architectures to efficiently process high-dimensional data, maximizing feature retention and minimizing information loss. Our approach employs a residual structure that fuses quantum and classical components, combining the benefits of quantum features and conventional machine learning attributes. By using randomized expressive circuits, we improve system flexibility, accuracy, and robustness in sentiment classification tasks. Integrating VQCs significantly reduces the number of parameters compared to fully connected layers, resulting in improved accuracy and computational efficiency. Empirical findings validate the superior performance of our fusion approach in mitigating the noise and error impacts associated with quantum computing and demonstrate strong potential for future applications in complex emotional information processing. This study provides new insights and methodologies for advancing sentiment classification technology and highlights the broad potential of quantum computing for information processing.
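To illustrate what a variational quantum circuit computes, the sketch below simulates a two-qubit circuit directly with numpy: classical features are angle-encoded with RY rotations, a trainable RY layer and a CNOT entangler follow, and the expectation of Pauli-Z on the first qubit is returned; the circuit layout and parameter values are illustrative, not QMLSC's architecture.

```python
import numpy as np

def ry(theta: float) -> np.ndarray:
    """Single-qubit RY rotation gate."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])

I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],      # control = qubit 0, target = qubit 1
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=float)
Z = np.diag([1.0, -1.0])

def vqc_expectation(features: np.ndarray, params: np.ndarray) -> float:
    """Two-qubit variational circuit simulated with plain linear algebra:
    angle-encode two classical features, apply trainable RY rotations and a
    CNOT entangler, then measure <Z> on qubit 0."""
    state = np.zeros(4)
    state[0] = 1.0                                               # |00>
    state = np.kron(ry(features[0]), ry(features[1])) @ state    # data encoding
    state = np.kron(ry(params[0]), ry(params[1])) @ state        # trainable layer
    state = CNOT @ state                                         # entanglement
    observable = np.kron(Z, I2)                                  # Z on qubit 0
    return float(state @ observable @ state)

features = np.array([0.4, -1.1])     # e.g. fused text/speech features, rescaled
params = np.array([0.7, 0.2])        # trainable circuit parameters
print(vqc_expectation(features, params))
```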
QMLSC: A quantum multimodal learning model for sentiment classification. Information Fusion, Volume 120, Article 103049.