Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102808
Shashank Sheshar Singh, Sumit Kumar, Sunil Kumar Meena, Kuldeep Singh, Shivansh Mishra, Albert Y. Zomaya
Quantum social network analysis (QSNA) is a recent advancement in the interdisciplinary field of quantum computing and social network analysis. This manuscript comprehensively reviews QSNA, emphasizing its methodologies, implementation strategies, challenges, and potential applications. It explores the conceptual foundation of key social network analysis research problems, including link prediction, influence maximization, and community detection. The research examines how quantum algorithms can revolutionize such social network tasks by leveraging principles from quantum mechanics and information theory and highlights the advantages of quantum algorithms in handling complex social network structures. The implementation section delves into the practical aspects of QSNA, such as frameworks, experimental setups, and evaluation methods. We assess the capabilities of existing quantum programming language tools and platforms. Various case studies illustrate the potential of quantum computing to enhance the performance of social network analysis. Additionally, we identify several crucial challenges and future research directions for QSNA, including the complexity of developing quantum algorithms, the need for interdisciplinary knowledge, and the challenges of integrating quantum and classical computing resources. This paper aims to serve as a foundational resource for researchers and practitioners, providing insights into the transformative potential of quantum computing in advancing the analysis of social networks and outlining future research directions in this emerging field.
Title: Quantum social network analysis: Methodology, implementation, challenges, and future directions. Journal: Information Fusion, vol. 117, Article 102808.
Most Bilingual Lexicon Induction (BLI) methods retrieve word translation pairs by finding the closest target word for a given source word based on cross-lingual word embeddings (WEs). However, we find that retrieving translations solely from the source-to-target perspective leads to some false positive translation pairs, which significantly harm the precision of BLI. To address this problem, we propose a novel and effective method to improve translation pair retrieval in cross-lingual WEs. Specifically, we apply a fusion of both source-side and target-side perspectives throughout the retrieval process to alleviate false positive word pairings that emanate from a single perspective. Moreover, in translation scenarios using Large Language Models (LLMs), we propose fusing the LLM perspective with the BLI model perspective to enhance the LLM's translation capability. On benchmark datasets of BLI, our proposed method achieves competitive performance compared to existing state-of-the-art (SOTA) methods. It demonstrates effectiveness and robustness across six experimental languages, including similar language pairs and distant language pairs, under both supervised and unsupervised settings.
Title: Dual-perspective fusion for word translation enhancement. Authors: Qiuyu Ding, Hailong Cao, Zhiqiang Cao, Tiejun Zhao. Journal: Information Fusion, vol. 117, Article 102815. Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102815
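The idea of fusing source-side and target-side perspectives during retrieval can be illustrated with a small sketch. This is not the authors' method: the function names, the softmax normalisation, the multiplicative fusion, and the `temperature` parameter are all assumptions chosen for illustration. The sketch scores each candidate pair by how strongly the source word prefers the target and how strongly the target prefers the source, so a "hub" target word that is close to many source words is penalised.

```python
import numpy as np


def cosine_similarity(src_emb, tgt_emb):
    """Cosine similarity between every source row and every target row."""
    src = src_emb / np.linalg.norm(src_emb, axis=1, keepdims=True)
    tgt = tgt_emb / np.linalg.norm(tgt_emb, axis=1, keepdims=True)
    return src @ tgt.T  # shape: (n_src, n_tgt)


def _softmax(x, axis):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def fuse_and_retrieve(sim, temperature=0.1):
    """Pick one target index per source word from a fused score.

    forward  : how much each source word prefers each target (rows sum to 1)
    backward : how much each target word prefers each source (columns sum to 1)
    The product is one simple, assumed way to fuse the two perspectives.
    """
    forward = _softmax(sim / temperature, axis=1)
    backward = _softmax(sim / temperature, axis=0)
    return (forward * backward).argmax(axis=1)


# Toy similarity matrix: target 0 is the nearest neighbour of both sources.
sim = np.array([[0.9, 0.1],
                [0.8, 0.75]])
print("source-to-target only:", sim.argmax(axis=1))
print("dual perspective     :", fuse_and_retrieve(sim))
```

In the toy matrix, plain nearest-neighbour retrieval maps both source words to the same target, while the fused score assigns the second source word to the target that prefers it in return.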
Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102822
Linqing Hu, Junqi Zhang, Jie Zhang, Shaoyin Cheng, Yuyi Wang, Weiming Zhang, Nenghai Yu
Multi-sensor Fusion (MSF) algorithms are critical components in modern autonomous driving systems, particularly in localization and AI-powered perception modules, which play a vital role in ensuring vehicle safety. The Error-State Kalman Filter (ESKF), specifically employed for localization fusion, is widely recognized for its robustness and accuracy in MSF implementations. While existing studies have demonstrated the vulnerability of ESKF to sensor spoofing attacks, these works have primarily focused on a black-box implementation, leading to an insufficient security analysis. Specifically, due to the lack of theoretical guidance in previous methods, these studies have consistently relied on exponential functions to fit attack sequences across all scenarios. As a result, the attacker had to explore an extensive parameter space to identify effective attack sequences, lacking the ability to adaptively generate optimal ones. This paper aims to fill this crucial gap by conducting a thorough security analysis of the ESKF model and presenting a simple approach for modeling injection errors in these systems. By utilizing this error modeling, we introduce a new attack strategy that employs constrained optimization to reduce the energy needed to reach the same deviation target, guaranteeing that the attack is both efficient and effective. This method increases the stealthiness of the attack, making it harder to detect. Unlike previous methods, our approach can dynamically produce nearly perfect injection signals without requiring multiple attempts to find the best parameter combination in different scenarios. Through extensive simulations and real-world experiments, we demonstrate the superiority of our method compared to state-of-the-art attack strategies. Our results indicate that our approach requires significantly less injection energy to achieve the same deviation target. Additionally, we validate the practical applicability and impact of our method through end-to-end testing on an AI-powered autonomous driving system.
Title: Security analysis and adaptive false data injection against multi-sensor fusion localization for autonomous driving. Journal: Information Fusion, vol. 117, Article 102822.
Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102817
Xianfeng Huang, Jianming Zhan, Weiping Ding
Kernel extreme learning machine (KELM), as a natural extension of ELM to kernel learning, has been successfully applied to solve various multivariate time series prediction (MTSP) tasks. Nevertheless, the high-dimensional and nonlinear properties of prediction information against the background of big data bring great challenges to the application of KELM. Recognizing these challenges, this paper develops a KELM-based hybrid MTSP system, aiming to address the effective mining of potential relationships among variables and sample significance. Our system begins with a feature evaluation mechanism that leverages transfer entropy and directed graph theory, effectively capturing the intricate interactions and intrinsic influences among variables. Next, we introduce a robust local relative density concept to gauge the significance level of different samples in KELM learning, and develop a more efficient KELM. Diverging from previous MTSP methodologies, the developed prediction system is capable of automatically discovering potential relationships between input features and modeling, and simultaneously realizes feature subset selection and model learning. Empirical evidence drawn from real-world datasets substantiates the effectiveness and practicality of our proposed system. The results not only validate our approach but also highlight its theoretical and practical superiority over existing state-of-the-art methods.
Title: Hybrid multivariate time series prediction system fusing transfer entropy and local relative density. Journal: Information Fusion, vol. 117, Article 102817.
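For readers unfamiliar with KELM, the sketch below shows the standard closed-form solution with an RBF kernel, plus an optional per-sample weight. The weighting follows the generic weighted-ELM form and is an assumption: the abstract does not specify how the local relative density enters the learner, so `sample_weight` here is only a stand-in for whatever significance score the system computes. The class name, `C`, and `gamma` are likewise illustrative.

```python
import numpy as np


def rbf_kernel(a, b, gamma=1.0):
    """RBF kernel matrix exp(-gamma * ||a_i - b_j||^2)."""
    sq = (a ** 2).sum(axis=1)[:, None] + (b ** 2).sum(axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-gamma * np.maximum(sq, 0.0))


class KELM:
    """Kernel extreme learning machine (regression form).

    Standard KELM solves (I / C + K) alpha = y.
    With per-sample weights W (diagonal), this sketch solves
    (I / C + W K) alpha = W y, which reduces to the standard
    form when all weights equal 1.
    """

    def __init__(self, C=100.0, gamma=1.0):
        self.C = C
        self.gamma = gamma

    def fit(self, X, y, sample_weight=None):
        X = np.asarray(X, dtype=float)
        n = X.shape[0]
        y = np.asarray(y, dtype=float).reshape(n, -1)
        w = np.ones(n) if sample_weight is None else np.asarray(sample_weight, dtype=float)
        K = rbf_kernel(X, X, self.gamma)
        A = np.eye(n) / self.C + w[:, None] * K
        self.alpha_ = np.linalg.solve(A, w[:, None] * y)
        self.X_ = X
        return self

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        return rbf_kernel(X, self.X_, self.gamma) @ self.alpha_


# Usage: fit a noiseless sine curve and report the training error.
X = np.linspace(0.0, 2.0 * np.pi, 50)[:, None]
y = np.sin(X[:, 0])
model = KELM(C=1000.0, gamma=1.0).fit(X, y)
print("mean abs error:", np.abs(model.predict(X)[:, 0] - y).mean())
```

Because training is a single linear solve, down-weighting a sample only rescales its row of the system, which is why sample significance can be added without changing the overall procedure.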
Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102819
Huan Rong, Zhongfeng Chen, Zhenyu Lu, Xiao-ke Xu, Kai Huang, Victor S. Sheng
In the field of information management, effective event intelligence management is crucial for its development. With the continuous evolution of events, predicting future events has become a key task in information management. Event Prediction aims to predict upcoming events based on given contextual information. This requires modeling events and their relationships in the context to infer the structure of future events. However, existing event prediction methods ignore the fact that an event graph schema based on core events can, through induction and deduction, provide more knowledge about the history and future of events and thereby support accurate event prediction. To address this issue, we turn to Event Schema Induction. Inspired by it, we propose the Pred-ID model, designed to build an event evolutionary pattern through Inductive Event Graph Generation, Deductive Event Graph Expansion, and Graph Fusion for Event Prediction. Specifically, in the Inductive Event Graph Generation phase, Pred-ID extracts the event core subgraph and event developmental trends from the instance event graph, learning the global structure and uncovering the main processes of event development. Then, in the Deductive Event Graph Expansion phase, by expanding future event nodes and stretching the main processes of event development into future directions, Pred-ID obtains deductive results and so constructs the event evolutionary pattern. Finally, in the Graph Fusion for Event Prediction phase, aligning and merging the event evolutionary pattern with the instance event graph enables collaborative prediction of future events. The experimental results indicate that our proposed Pred-ID achieves optimal performance in event evolutionary pattern generation and event prediction tasks.
Title: Pred-ID: Future event prediction based on event type schema mining by graph induction and deduction. Journal: Information Fusion, vol. 117, Article 102819.
Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102807
Nguyen Huu Quyen, Phan The Duy, Ngo Thao Nguyen, Nghi Hoang Khoa, Van-Hau Pham
In the realm of the Internet of Things (IoT), there has been a notable increase in the development and efficacy of Intrusion Detection Systems (IDS) that leverage machine learning (ML). Specifically, Federated Learning-based IDSs (FL-based IDS) have witnessed significant growth. These systems aim to mitigate data privacy breaches and minimize the communication overhead associated with dataset collection. Limited hardware resources also pose a significant constraint, preventing numerous IoT devices from actively engaging in FL. However, despite these advancements, certain challenges persist in the research domain. Issues such as elevated communication overhead, the potential for recovering private data, non-independent and identically distributed (Non-IID) data, and a scarcity of labeled data remain noteworthy concerns. Additionally, vulnerabilities exist in the server-client communication during the FL process, creating opportunities for attackers to execute poisoning attacks on the client side with relative ease. To address these challenges, our paper introduces a semi-supervised approach for FL-based IDS. Our approach, named FedKD-IDS, employs knowledge distillation with a voting mechanism in place of weighted parameter aggregation and incorporates an anti-poisoning method. We conducted experiments to evaluate the effectiveness of our approach across diverse scenarios, including scenarios with Non-IID and varying data distributions. Additionally, we investigated various rates of malicious collaboration to demonstrate their impact on the federated training process. The results obtained from the real-world N-BaIoT dataset indicate that our approach surpasses the performance of the state-of-the-art (SOTA) SSFL method. Notably, even under a poisoning attack in which 50% of all collaborators carried out label-flipping attacks, FedKD-IDS achieved an accuracy of 79%, surpassing SSFL, which achieved only 19.86%. Furthermore, the outcomes also validated that the FedKD-IDS method is able to exclude over 85% of malicious collaborators during the aggregation phase of the federated training process.
Title: FedKD-IDS: A robust intrusion detection system using knowledge distillation-based semi-supervised federated learning and anti-poisoning attack mechanism. Journal: Information Fusion, vol. 117, Article 102807.
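To make the contrast with weighted parameter aggregation concrete, the sketch below aggregates client predictions on shared unlabeled samples by voting instead of averaging model weights. It is a simplified illustration and not the FedKD-IDS algorithm: the function name, the agreement-based filter, and the `min_agreement` threshold are assumptions. Note that a plain majority vote of this kind cannot by itself resolve a 50/50 split between honest and malicious clients; the paper's anti-poisoning method is not described in the abstract and is not reproduced here.

```python
import numpy as np


def vote_pseudo_labels(client_preds, n_classes, min_agreement=0.5):
    """Aggregate client predictions on shared unlabeled data by voting.

    client_preds : int array, shape (n_clients, n_samples), predicted class ids
    Returns (labels, trusted) where `trusted` flags the clients kept after
    a simple agreement filter (an assumed stand-in for anti-poisoning).
    """
    client_preds = np.asarray(client_preds)

    def majority(preds):
        # counts[c, j] = number of clients voting class c for sample j
        counts = np.stack([(preds == c).sum(axis=0) for c in range(n_classes)])
        return counts.argmax(axis=0)

    consensus = majority(client_preds)
    # Fraction of samples on which each client matches the first-round vote.
    agreement = (client_preds == consensus).mean(axis=1)
    trusted = agreement >= min_agreement
    # Second-round vote using only the clients that passed the filter.
    labels = majority(client_preds[trusted])
    return labels, trusted


# Usage: three honest clients and two label-flipping clients (binary task).
preds = np.array([
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 0, 1],  # flipped labels
    [1, 0, 0, 1],  # flipped labels
])
labels, trusted = vote_pseudo_labels(preds, n_classes=2)
print(labels, trusted)
```

The voted labels could then serve as distillation targets for a global model, so only predictions, not model parameters, need to leave each client.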
Pub Date: 2024-11-26 DOI: 10.1016/j.inffus.2024.102810
Yakun Ju, Jun Xiao, Cong Zhang, Hao Xie, Anwei Luo, Huiyu Zhou, Junyu Dong, Alex C. Kot
Marine snow, caused by the aggregation of small organic and inorganic particles, creates a visual effect similar to drifting snowflakes. Traditional methods for removing marine snow often use median filtering, which can blur the entire image. Although deep learning approaches attempt to address this issue, they typically only work in the spatial domain and still struggle with blurring and residual marine snow artifacts. These challenges arise because the spatial domain alone cannot easily distinguish between real object structures and noise-like marine snow artifacts. To address this, we propose the Deep Fourier Marine Snow Removal Network (DF-MSRN), which integrates both spatial and Fourier domain information to effectively restore images affected by marine snow. DF-MSRN employs a two-stage approach that leverages both Fourier frequency and spatial information: it first estimates a restored map of the amplitude component to address particle removal, avoiding additional noise in the spatial domain. Then, a fusion module combines Fourier frequency global information with spatial local information to refine image details. Experimental results show that DF-MSRN significantly outperforms existing denoising techniques on various marine image datasets, enhancing image clarity and detail preservation.
Title: Towards marine snow removal with fusing Fourier information. Journal: Information Fusion, vol. 117, Article 102810.
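The amplitude component mentioned in the abstract comes from the usual polar decomposition of the 2D Fourier transform. The sketch below shows only that decomposition and its inverse using NumPy's FFT; it does not implement DF-MSRN. The function names are illustrative, and the "restored amplitude" is simulated by a plain rescaling, whereas in the paper a network estimates it.

```python
import numpy as np


def split_amplitude_phase(image):
    """Return the Fourier amplitude and phase of a 2D image."""
    spectrum = np.fft.fft2(image)
    return np.abs(spectrum), np.angle(spectrum)


def merge_amplitude_phase(amplitude, phase):
    """Rebuild a real-valued image from amplitude and phase."""
    spectrum = amplitude * np.exp(1j * phase)
    return np.fft.ifft2(spectrum).real


# Usage: decompose a random "image", modify only the amplitude, and rebuild.
rng = np.random.default_rng(0)
image = rng.random((8, 8))
amplitude, phase = split_amplitude_phase(image)

# Round trip with the untouched amplitude recovers the input.
roundtrip = merge_amplitude_phase(amplitude, phase)
print("round-trip ok:", np.allclose(roundtrip, image))

# Placeholder for a learned amplitude restoration: here just a rescaling.
restored = merge_amplitude_phase(0.5 * amplitude, phase)
print("rescaled ok  :", np.allclose(restored, 0.5 * image))
```

Each Fourier coefficient depends on every pixel, which is why an amplitude edit acts globally and pairs naturally with a spatial branch that handles local detail.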
Pub Date: 2024-11-24 DOI: 10.1016/j.inffus.2024.102818
Qiang Liu, Xiangchao Meng, Shenfu Zhang, Xuebin Li, Feng Shao
Spatio-temporal fusion has become a popular technology for generating remote sensing images with high spatial and high temporal resolutions, thus providing valuable data support for remote sensing monitoring applications, such as environmental monitoring and city planning. Currently, deep learning-based methods have garnered a significant amount of attention, and they mostly employ the fine image at the neighboring date as an auxiliary image. However, capturing usable neighboring fine images may be challenging due to the adverse effects of weather conditions on optical images. Moreover, the fusion performance drops sharply when the temporal interval is long (i.e., when there are significant differences between the images). In this paper, we propose a bidirectional pyramid fusion network with semantic prior regularization (BPFN-SPR), which exhibits remarkable flexibility and robustness to temporal intervals.
Specifically, the proposed BPFN-SPR contains dual-path operations (i.e., Semantic Extraction path and Image Reconstruction path). The semantic extraction path has two modes: parameter learning mode and parameter freezing mode. The parameter learning mode aims to learn the information representation of the auxiliary fine image, while the parameter freezing mode aims to perceive the accurate semantic information of the target fine image. The image reconstruction path progressively reconstructs spatial details of fine images from coarse images, which jointly optimizes the target fine image and the auxiliary fine image, reducing the temporal sensitivity of the reconstruction branch, and thereby improving its generalization ability. Experimental results show that the proposed method has competitive performance, especially for areas with land cover changes. In addition, extensive experiments using images at multi-temporal intervals as auxiliary images have also demonstrated the significant advantages of the proposed method. The mean PSNR value attains 31.0713, while the average spectral index SAM measures 0.1640 on the LGC test set. Meanwhile, for the CIA test set, the average PSNR is recorded at 29.5332, accompanied by an average spectral index SAM of 0.1865. Therefore, the proposed BPFN-SPR has considerable potential in monitoring Earth's surface dynamics.
Title: A temporally insensitive spatio-temporal fusion method for remote sensing imagery via semantic prior regularization. Authors: Qiang Liu, Xiangchao Meng, Shenfu Zhang, Xuebin Li, Feng Shao. Information Fusion, vol. 117, Article 102818. Publication date: 2024-11-24. DOI: 10.1016/j.inffus.2024.102818
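The PSNR and SAM figures quoted in the abstract above are standard image-quality metrics. The following is a minimal sketch of how such values are commonly computed, not the authors' evaluation code. It assumes that SAM denotes the spectral angle mapper, that pixel values are scaled to a known `data_range`, and that SAM is reported in radians; the abstract states none of these. The function names `psnr` and `sam` are illustrative.

```python
import math


def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences.

    data_range is the assumed maximum possible pixel value (an assumption here).
    """
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(data_range ** 2 / mse)


def sam(reference_pixels, estimate_pixels):
    """Mean spectral angle (radians) over pixels.

    Each pixel is a sequence of band values; pixels with an all-zero
    spectrum are skipped because the angle is undefined for them.
    """
    angles = []
    for ref, est in zip(reference_pixels, estimate_pixels):
        dot = sum(r * e for r, e in zip(ref, est))
        norm = math.sqrt(sum(r * r for r in ref)) * math.sqrt(sum(e * e for e in est))
        if norm == 0:
            continue
        cosine = max(-1.0, min(1.0, dot / norm))  # clamp against rounding error
        angles.append(math.acos(cosine))
    return sum(angles) / len(angles) if angles else 0.0
```

Higher PSNR indicates lower pixel-wise error, while lower SAM indicates better spectral fidelity, since a spectrum that is only rescaled in brightness has an angle of zero.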
Pub Date: 2024-11-23 DOI: 10.1016/j.inffus.2024.102806
Genji Yuan, Jintao Song, Jinjiang Li
Underwater salient object detection (USOD) has garnered increasing attention due to its superior performance in various underwater visual tasks. Despite the growing interest, research on USOD remains in its early stages, and existing methods often struggle to capture long-range contextual features of salient objects. These methods also frequently overlook the complementary nature of multimodal information. Multimodal information fusion can make previously indiscernible objects more detectable, because capturing complementary features from diverse source images enables a more accurate depiction of objects. In this work, we explore an approach that integrates RGB and depth information, coupled with interactive feature enhancement, to advance the detection of underwater salient objects. Our method first leverages the strengths of both transformer and convolutional neural network architectures to extract features from the source images, using a two-stage training strategy designed to optimize feature fusion. We then use self-attention and cross-attention mechanisms to model the correlations among the extracted features, thereby amplifying the relevant features. Finally, to fully exploit features across different network layers, we introduce a cross-scale learning strategy for multi-scale feature fusion, which improves the detection accuracy of underwater salient objects by generating both coarse and fine salient predictions. Extensive experimental evaluations demonstrate that the proposed method achieves state-of-the-art performance.
Title: IF-USOD: Multimodal information fusion interactive feature enhancement architecture for underwater salient object detection. Information Fusion, vol. 117, Article 102806. Publication date: 2024-11-23.
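The cross-attention step mentioned in the abstract above can be illustrated with plain scaled dot-product attention, in which tokens from one modality act as queries over the tokens of another. This is a simplified sketch, not the IF-USOD implementation: it omits the learned query, key, and value projections and multi-head structure that a real model would normally use, and the assignment of RGB tokens to queries and depth tokens to keys and values is an assumption made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys, values):
    """Scaled dot-product attention without learned projections.

    queries: tokens from one modality (e.g. RGB features, hypothetical here)
    keys, values: tokens from the other modality (e.g. depth features)
    Returns one fused vector per query token.
    """
    dim = len(keys[0])
    fused = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(dim).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # Weighted sum of the value vectors.
        fused.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return fused
```

Self-attention is the special case in which `queries`, `keys`, and `values` all come from the same feature set.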
Pub Date: 2024-11-23 DOI: 10.1016/j.inffus.2024.102805
Rui Liu, Zhiyuan Zhang, Yini Peng, Jiayi Ma, Xin Tian
Shape from polarization (SfP) is a powerful passive three-dimensional imaging technique that enables the reconstruction of surface normals with dense textural details. However, existing deep learning-based SfP methods focus only on the polarization prior, which makes it difficult to accurately reconstruct targets with rich texture details in complicated scenes. To improve reconstruction accuracy, we use the surface normals estimated from shading cues, together with a newly proposed specular confidence, as a shading prior that provides additional feature information. Furthermore, to combine the polarization and shading priors efficiently, we propose a novel deep fusion network named SfPSNet for information extraction and surface normal reconstruction. SfPSNet is built on a dual-branch architecture to handle the different physical priors. A feature correction module is designed to mutually rectify defects in the channel-wise and spatial-wise dimensions. In addition, a feature fusion module fuses the feature maps of the polarization and shading priors using an efficient cross-attention mechanism. Our experimental results show that fusing the polarization and shading priors significantly improves the quality of the reconstructed surface normals, especially for objects or scenes illuminated by complex lighting sources. As a result, SfPSNet achieves state-of-the-art performance compared with existing deep learning-based SfP methods, benefiting from its efficiency in extracting and fusing information from different priors.
Title: Physical prior-guided deep fusion network with shading cues for shape from polarization. Information Fusion, vol. 117, Article 102805. Publication date: 2024-11-23.
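The abstract above does not say how the polarization prior is encoded. As background, SfP pipelines commonly start from the degree and angle of linear polarization, derived from intensities captured behind a linear polarizer at 0°, 45°, 90°, and 135°. The sketch below shows that standard Stokes-parameter computation for a single pixel; it is an assumption about typical inputs, not a description of SfPSNet, and the function name `linear_polarization` is illustrative.

```python
import math


def linear_polarization(i0, i45, i90, i135):
    """Degree and angle of linear polarization for one pixel.

    Inputs are intensities measured behind a linear polarizer at
    0, 45, 90, and 135 degrees. Returns (dolp, aolp), where aolp is
    in radians in the range (-pi/2, pi/2].
    """
    s0 = 0.5 * (i0 + i45 + i90 + i135)  # total intensity
    s1 = i0 - i90                       # horizontal vs. vertical component
    s2 = i45 - i135                     # diagonal components
    dolp = math.hypot(s1, s2) / s0 if s0 > 0 else 0.0
    aolp = 0.5 * math.atan2(s2, s1)
    return dolp, aolp
```

The angle of linear polarization constrains the azimuth of the surface normal only up to an ambiguity, which is one reason an additional cue such as shading can help.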