Pub Date: 2026-06-01. Epub Date: 2025-12-18. DOI: 10.1016/j.inffus.2025.104071
Giuseppe De Simone , Luca Greco , Alessia Saggese , Mario Vento
Gender and emotion recognition are traditionally analyzed independently using audio and video modalities, which introduces challenges when fusing their outputs and often results in increased computational overhead and latency. To address these limitations, in this work we introduce MAGNET (Multimodal Architecture for GeNder and Emotion Tasks), a novel multimodal multitask learning framework that jointly performs gender and emotion recognition by simultaneously analyzing audio and visual inputs. MAGNET employs soft parameter sharing, guided by GradNorm to balance task-specific learning dynamics. This design not only enhances recognition accuracy through effective modality fusion but also reduces model complexity by leveraging multitask learning. As a result, our approach is particularly well-suited for deployment on embedded devices, where computational efficiency and responsiveness are critical. Evaluated on the CREMA-D dataset, MAGNET consistently outperforms unimodal baselines and current state-of-the-art methods, demonstrating its effectiveness for efficient and accurate soft biometric analysis.
{"title":"Integrating visual and audio cues for emotion and gender recognition: A multi modal and multi task approach","authors":"Giuseppe De Simone , Luca Greco , Alessia Saggese , Mario Vento","doi":"10.1016/j.inffus.2025.104071","DOIUrl":"10.1016/j.inffus.2025.104071","url":null,"abstract":"<div><div>Gender and emotion recognition are traditionally analyzed independently using audio and video modalities, which introduces challenges when fusing their outputs and often results in increased computational overhead and latency. To address these limitations, in this work we introduces MAGNET (Multimodal Architecture for GeNder and Emotion Tasks), a novel multimodal multitask learning framework that jointly performs gender and emotion recognition by simultaneously analyzing audio and visual inputs. MAGNET employs soft parameter sharing, guided by GradNorm to balance task-specific learning dynamics. This design not only enhances recognition accuracy through effective modality fusion but also reduces model complexity by leveraging multitask learning. As a result, our approach is particularly well-suited for deployment on embedded devices, where computational efficiency and responsiveness are critical. Evaluated on the CREMA-D dataset, MAGNET consistently outperforms unimodal baselines and current state-of-the-art methods, demonstrating its effectiveness for efficient and accurate soft biometric analysis.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104071"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145785020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01. Epub Date: 2025-12-18. DOI: 10.1016/j.inffus.2025.104075
Xingyu Shen , Jinshi Xiao , Xiang Zhang , Long Lan , Xinwang Liu
Video Moment Retrieval (VMR) aims to identify the temporal span in an untrimmed video that semantically corresponds to a natural language query. Existing methods often overlook temporal invariance, making them sensitive to variations in query span and limiting their performance, especially for retrieving short-span moments. To address this limitation, we propose a Span-aware Temporal Aggregation (STA) network that introduces span-aware features to capture temporally invariant patterns, thereby enhancing robustness to varying query spans. STA consists of two key components: (i) a Span-aware Feature Aggregation (SFA) module, which constructs span-specific visual representations aligned with the query to generate span-aware features that are then integrated into local candidate moments; and (ii) a Query-guided Moment Reasoning (QMR) module, which dynamically adapts the receptive fields of temporal convolutions based on query span semantics to achieve fine-grained reasoning. Extensive experiments on three challenging benchmark datasets demonstrate that STA consistently outperforms state-of-the-art methods, with particularly notable gains for short-span moments.
{"title":"Span-aware temporal aggregation network for video moment retrieval","authors":"Xingyu Shen , Jinshi Xiao , Xiang Zhang , Long Lan , Xinwang Liu","doi":"10.1016/j.inffus.2025.104075","DOIUrl":"10.1016/j.inffus.2025.104075","url":null,"abstract":"<div><div>Video Moment Retrieval (VMR) aims to identify the temporal span in an untrimmed video that semantically corresponds to a natural language query. Existing methods often overlook temporal invariance, making them sensitive to variations in query span and limiting their performance, especially for retrieving short-span moments. To address this limitation, we propose a Span-aware Temporal Aggregation (STA) network that introduces span-aware features to capture temporal invariant patterns, thereby enhancing robustness to varying query spans. STA consists of two key components: (i) A span-aware feature aggregation (SFA) module constructs span-specific visual representations that are aligned with the query to generate span-aware features, which are then integrated into local candidate moments; (ii) a Query-guided Moment Reasoning (QMR) module, which dynamically adapts the receptive fields of temporal convolutions based on query span semantics to achieve fine-grained reasoning. Extensive experiments on three challenging benchmark datasets demonstrate that STA consistently outperforms state-of-the-art methods, with particularly notable gains for short-span moments.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104075"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145785025","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To mitigate severe cloud interference in optical remote sensing imagery and address the challenges of deploying complex cloud removal models on satellite platforms, this study proposes a lightweight gated parallel attention network, GCEPANet. By integrating optical and SAR data, the network fully exploits the penetration capability of SAR imagery and combines a Gated Convolution Module (GCONV) with an Enhanced Parallel Attention Module (EPA) to establish a “cloud perception–cloud refinement” cooperative mechanism. This mechanism enables the model to identify and filter features according to cloud intensity, effectively separating the feature flows of clear and cloudy regions, and adaptively compensating for cloud-induced degradation to reconstruct the true structural and radiative characteristics of surface objects. Furthermore, a joint spectral–structural loss is introduced to simultaneously constrain spectral consistency and structural fidelity. Extensive experiments on the SEN12MS-CR dataset demonstrate that the proposed GCEPANet consistently outperforms existing methods across multiple metrics, including PSNR, SSIM, MAE, RMSE, SAM, and ERGAS. Compared with the SCTCR model, GCEPANet achieves a 0.9306 dB improvement in PSNR, reduces the number of parameters by 85.5% (to 12.77M), and decreases FLOPs by 76.0% (to 9.71G). These results demonstrate that the proposed method achieves superior cloud removal performance while significantly reducing model complexity, providing an efficient and practical solution for real-time on-orbit cloud removal in optical–SAR fused remote sensing imagery.
{"title":"GCEPANet: A lightweight and efficient remote sensing image cloud removal network model for optical-SAR image fusion","authors":"Qinglong Zhou , Xing Wang , Jiahao Fang , Wenbo Wu , Bingxian Zhang","doi":"10.1016/j.inffus.2025.104090","DOIUrl":"10.1016/j.inffus.2025.104090","url":null,"abstract":"<div><div>To mitigate severe cloud interference in optical remote sensing imagery and address the challenges of deploying complex cloud removal models on satellite platforms, this study proposes a lightweight gated parallel attention network, GCEPANet. By integrating optical and SAR data, the network fully exploits the penetration capability of SAR imagery and combines a Gated Convolution Module (GCONV) with an Enhanced Parallel Attention Module (EPA) to establish a “cloud perception–cloud refinement” cooperative mechanism. This mechanism enables the model to identify and filter features according to cloud intensity, effectively separating the feature flows of clear and cloudy regions, and adaptively compensating for cloud-induced degradation to reconstruct the true structural and radiative characteristics of surface objects. Furthermore, a joint spectral–structural loss is introduced to simultaneously constrain spectral consistency and structural fidelity. Extensive experiments on the SEN12MS-CR dataset demonstrate that the proposed GCEPANet consistently outperforms existing methods across multiple metrics, including PSNR, SSIM, MAE, RMSE, SAM, and ERGAS. Compared with the SCTCR model, GCEPANet achieves a 0.9306 dB improvement in PSNR, reduces the number of parameters by 85.5% (to 12.77M), and decreases FLOPs by 76.0% (to 9.71G). These results demonstrate that the proposed method achieves superior cloud removal performance while significantly reducing model complexity, providing an efficient and practical solution for real-time on-orbit cloud removal in optical–SAR fused remote sensing imagery.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104090"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145845111","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Detecting disinformation that blends manipulated text and images has become increasingly challenging, as AI tools make synthetic content easy to generate and disseminate. While most existing AI-safety benchmarks focus on single-modality misinformation (i.e., false content shared without intent to deceive), intentional multimodal disinformation, such as propaganda or conspiracy theories that imitate credible news, remains largely unaddressed. In this work, we introduce the Vision-Language Disinformation Detection Benchmark (VLDBench), the first large-scale resource supporting both unimodal (text-only) and multimodal (text + image) disinformation detection. VLDBench comprises approximately 62,000 labeled text-image pairs across 13 categories, curated from 58 news outlets. Using a semi-automated pipeline followed by expert review, 22 domain experts invested over 500 hours to produce high-quality annotations with substantial inter-annotator agreement. Evaluation of state-of-the-art LLMs and VLMs on VLDBench shows that adding visual cues improves detection accuracy, with gains ranging from 5 points for strong baselines (e.g., LLaMA-3.2-11B-Vision 74.82% vs. LLaMA-3.2-1B-Instruct 70.29%) to 25-30 points for smaller families (e.g., LLaVA-v1.5-Vicuna7B 72.32% vs. Vicuna-7B-v1.5 55.21%), reflecting complementary evidence from images (e.g., meme-like visuals, image-text consistency) that text alone cannot capture. We provide data and code for evaluation, fine-tuning, and robustness tests to support disinformation analysis. Developed in alignment with AI governance frameworks (MIT AI Risk Repository), VLDBench offers a principled foundation for advancing trustworthy disinformation detection in multimodal media.
{"title":"VLDBench Evaluating multimodal disinformation with regulatory alignment","authors":"Shaina Raza , Ashmal Vayani , Aditya Jain , Aravind Narayanan , Vahid Reza Khazaie , Syed Raza Bashir , Elham Dolatabadi , Gias Uddin , Christos Emmanouilidis , Rizwan Qureshi , Mubarak Shah","doi":"10.1016/j.inffus.2025.104092","DOIUrl":"10.1016/j.inffus.2025.104092","url":null,"abstract":"<div><div>Detecting disinformation that blends manipulated text and images has become increasingly challenging, as AI tools make synthetic content easy to generate and disseminate. While most existing AI-safety benchmarks focus on single-modality misinformation (i.e., false content shared without intent to deceive), intentional multimodal disinformation, such as propaganda or conspiracy theories that imitate credible news; remains largely unaddressed. In this work, we introduce the <strong>V</strong>ision-<strong>L</strong>anguage <strong>D</strong>isinformation Detection <strong>Bench</strong>mark (<span><strong>VLDBench</strong></span>), the first large-scale resource supporting both unimodal (text-only) and multimodal (text + image) disinformation detection. <span><strong>VLDBench</strong></span> comprises approximately 62,000 labeled text-image pairs across 13 categories, curated from 58 news outlets. Using a semi-automated pipeline followed by expert review, 22 domain experts invested over 500 hours to produce high-quality annotations with substantial inter-annotator agreement. Evaluation of state-of-the-art LLMs and VLMs on <span><strong>VLDBench</strong></span> shows that adding visual cues improves detection accuracy, with gains ranging from 5 points for strong baselines (e.g., LLaMA-3.2-11B-Vision 74.82% vs. LLaMA-3.2-1B-Instruct 70.29%) to 25-30 points for smaller families (e.g., LLaVA-v1.5-Vicuna7B 72.32% vs. Vicuna-7B-v1.5 55.21%), reflecting complementary evidence from images (e.g., meme-like visuals, image-text consistency) that text alone cannot capture. We provide data and code for evaluation, fine-tuning and robustness tests to support disinformation analysis. Developed in alignment with the AI Goverance frameworks (MIT AI Risk Repository), <span><strong>VLDBench</strong></span> offers a principled foundation for advancing trustworthy disinformation detection in multimodal media.<span><div><div><table><tbody><tr><th><figure><img></figure></th><td><strong>Project:</strong></td><td><span><span>https://vectorinstitute.github.io/VLDBench/</span><svg><path></path></svg></span></td></tr><tr><th><figure><img></figure></th><td><strong>Data:</strong></td><td><span><span>https://huggingface.co/datasets/vector-institute/VLDBench</span><svg><path></path></svg></span></td></tr><tr><th><figure><img></figure></th><td><strong>Code:</strong></td><td><span><span>https://github.com/VectorInstitute/VLDBench</span><svg><path></path></svg></span></td></tr></tbody></table></div></div></span></div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104092"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145822968","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01. Epub Date: 2026-01-03. DOI: 10.1016/j.inffus.2025.104115
Haoyu Wang , Taylor Yiu , Serena Lee , Ka Gao , Hangling Sun , Chenyu Zhou , Anji Li , Qiangqiang Fu , Yu Wang , Bin Chen
Robotic-assisted endovascular interventions promise to transform cardiovascular therapy by improving procedural precision and minimizing cardiologists’ exposure to occupational risks. However, current systems are limited by their reliance on manual control and lack of adaptability to complex vascular anatomies. To address these challenges, we propose a novel Hierarchical Autonomous Guidewire Navigation and Delivery (HAG-ND) framework that leverages the strengths of multimodal large language models (MLLMs) and a novel reinforcement learning module inspired by Deep Q-Networks (DQNs). The high-level MLLM is trained on diverse blood vessel and guidewire scenarios from various angles and positions, enabling it to assess the suitability and timing of substance release at the target location. Within the MLLM, a parliamentary mechanism is introduced, where multiple specialized models, each focusing on a specific aspect of the vascular environment, vote on the optimal course of action. The low-level reinforcement learning module focuses on optimizing autonomous guidewire navigation to the designated target site by learning from the rich semantic understanding provided by the MLLM. Experimental evaluations demonstrate that the HAG-ND framework significantly improves the accuracy and reliability of guidewire positioning and targeted delivery compared to existing methods. By harnessing the complementary capabilities of MLLMs and novel reinforcement learning techniques in a hierarchical architecture, HAG-ND represents a significant step towards fully autonomous and adaptive robotic-assisted endovascular interventions.
{"title":"A hierarchical information policy fusion framework with multimodal large language models for autonomous guidewire navigation in endovascular procedures","authors":"Haoyu Wang , Taylor Yiu , Serena Lee , Ka Gao , Hangling Sun , Chenyu Zhou , Anji Li , Qiangqiang Fu , Yu Wang , Bin Chen","doi":"10.1016/j.inffus.2025.104115","DOIUrl":"10.1016/j.inffus.2025.104115","url":null,"abstract":"<div><div>Robotic-assisted endovascular interventions promise to transform cardiovascular therapy by improving procedural precision and minimizing cardiologists’ exposure to occupational risks. However, current systems are limited by their reliance on manual control and lack of adaptability to complex vascular anatomies. To address these challenges, we propose a novel <em><strong>H</strong></em>ierarchical <em><strong>A</strong></em>utonomous <em><strong>G</strong></em>uidewire <em><strong>N</strong></em>avigation and <em><strong>D</strong></em>elivery (<em><strong>HAG-ND</strong></em>) framework that leverages the strengths of multimodal large language models (MLLMs) and a novel reinforcement learning module inspired by Deep Q-Networks (DQNs). The high-level MLLM is trained on diverse blood vessel and guidewire scenarios from various angles and positions, enabling it to assess the suitability and timing of substance release at the target location. Within the MLLM, a parliamentary mechanism is introduced, where multiple specialized models, each focusing on a specific aspect of the vascular environment, vote on the optimal course of action. The low-level reinforcement learning module focuses on optimizing autonomous guidewire navigation to the designated target site by learning from the rich semantic understanding provided by the MLLM. Experimental evaluations demonstrate that the HAG-ND framework significantly improves the accuracy and reliability of guidewire positioning and targeted delivery compared to existing methods. By harnessing the complementary capabilities of MLLMs and novel reinforcement learning techniques in a hierarchical architecture, HAG-ND represents a significant step towards fully autonomous and adaptive robotic-assisted endovascular interventions.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104115"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145894682","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01. Epub Date: 2026-01-03. DOI: 10.1016/j.inffus.2025.104100
Yachuan Wang, Bin Zhang, Hao Yuan
Real-world deployment of video-based 3D human pose estimation remains challenging, as limited annotated data collected in constrained lab settings cannot fully capture the complexity of human motion. While motion synthesis for data augmentation has emerged as a mainstream solution to enhance generalization, existing synthesis methods suffer from inherent trade-offs: kinematics-based motion synthesis approaches preserve anatomical plausibility but sacrifice temporal coherence, while coordinate-based methods ensure motion smoothness but violate biomechanical constraints. This results in persistent domain gaps when synthetic data is directly used in the observation space to train pose estimation models. To overcome this, we propose DAK-Pose, which shifts augmentation to the feature space. We disentangle motion into structural and dynamic features, and design two complementary augmentors: (1) A structure-prioritized module enforces kinematic constraints for anatomical validity, and (2) a dynamic-prioritized module generates diverse temporal patterns. Auxiliary encoders trained on synthetic motions generated by these augmentors transfer domain-invariant knowledge to the pose estimator through adversarial alignment. Experiments on Human3.6M, MPI-INF-3DHP, and 3DPW datasets show that DAK-Pose achieves state-of-the-art cross-dataset performance.
{"title":"DAK-Pose: Dual-augmentor knowledge fusion for generalizable video-based 3D human pose estimation","authors":"Yachuan Wang, Bin Zhang, Hao Yuan","doi":"10.1016/j.inffus.2025.104100","DOIUrl":"10.1016/j.inffus.2025.104100","url":null,"abstract":"<div><div>Real-world deployment of video-based 3D human pose estimation remains challenging, as limited annotated data collected in constrained lab settings cannot fully capture the complexity of human motion. While motion synthesis for data augmentation has emerged as a mainstream solution to enhance generalization, existing synthesis methods suffer from inherent trade-offs: kinematics-based motion synthesis approaches preserve anatomical plausibility but sacrifice temporal coherence, while coordinate-based methods ensure motion smoothness but violate biomechanical constraints. This results in persistent domain gaps when synthetic data is directly used in the observation space to train pose estimation models. To overcome this, we propose DAK-Pose, which shifts augmentation to the feature space. We disentangle motion into structural and dynamic features, and design two complementary augmentors: (1) A structure-prioritized module enforces kinematic constraints for anatomical validity, and (2) a dynamic-prioritized module generates diverse temporal patterns. Auxiliary encoders trained on synthetic motions generated by these augmentors transfer domain-invariant knowledge to the pose estimator through adversarial alignment. Experiments on Human3.6M, MPI-INF-3DHP, and 3DPW datasets show that DAK-Pose achieves state-of-the-art cross-dataset performance.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104100"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145894681","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01. Epub Date: 2025-12-22. DOI: 10.1016/j.inffus.2025.104094
Qiguang Miao , Linxing Jia , Kun Xie , Kaiyuan Fu , Zongkai Yang
Transformer-based architectures have achieved notable success across natural language processing, computer vision, and multimodal learning, yet they face persistent challenges such as high computational complexity and limited adaptability to dynamic environments. State Space Models (SSMs) have emerged as a competitive alternative, offering linear-time complexity and the ability to implicitly capture long-range dependencies. Building on this foundation, the Mamba model introduces time-varying parameterization to dynamically adjust state transitions based on input context, combined with selective state updates, content-aware scanning strategies, and hardware-efficient design. These innovations enable Mamba to maintain linear complexity while delivering higher throughput and significantly reduced memory consumption compared to both Transformer-based and conventional SSM architectures. This survey systematically reviews the theoretical foundations, architectural innovations, and application progress of the Mamba model. First, we trace the evolution of SSMs, highlighting the key design principles that underpin Mamba’s dynamic state transition and selective computation mechanisms. Second, we summarize Mamba’s structural innovations in modeling dynamics and multimodal fusion, categorizing its applications across multiple modalities, including vision, speech, point clouds, and multimodal data. Finally, we evaluate representative applications in medical image analysis, recommendation systems, reinforcement learning, and generative modeling, identifying advantages, limitations, and open challenges. The review concludes by outlining future research directions focused on improving generalization, causal reasoning, interpretability, and computational efficiency. This work aims to provide a concise yet comprehensive reference for researchers and practitioners, promoting further development and deployment of Mamba-based architectures across diverse real-world scenarios.
{"title":"A comprehensive survey and taxonomy of mamba: Applications, Challenges, and Future Directions","authors":"Qiguang Miao , Linxing Jia , Kun Xie , Kaiyuan Fu , Zongkai Yang","doi":"10.1016/j.inffus.2025.104094","DOIUrl":"10.1016/j.inffus.2025.104094","url":null,"abstract":"<div><div>Transformer-based architectures have achieved notable success across natural language processing, computer vision, and multimodal learning, yet they face persistent challenges such as high computational complexity and limited adaptability to dynamic environments. State Space Models (SSMs) have emerged as a competitive alternative, offering linear-time complexity and the ability to implicitly capture long-range dependencies. Building on this foundation, the Mamba model introduces time-varying parameterization to dynamically adjust state transitions based on input context, combined with selective state updates, content-aware scanning strategies, and hardware-efficient design. These innovations enable Mamba to maintain linear complexity while delivering higher throughput and significantly reduced memory consumption compared to both Transformer-based and conventional SSM architectures. This survey systematically reviews the theoretical foundations, architectural innovations, and application progress of the Mamba model. First, we trace the evolution of SSMs, highlighting the key design principles that underpin Mamba’s dynamic state transition and selective computation mechanisms. Second, we summarize Mamba’s structural innovations in modeling dynamics and multimodal fusion, categorizing its applications across multiple modalities, including vision, speech, point clouds, and multimodal data. Finally, we evaluate representative applications in medical image analysis, recommendation systems, reinforcement learning, and generative modeling, identifying advantages, limitations, and open challenges. The review concludes by outlining future research directions focused on improving generalization, causal reasoning, interpretability, and computational efficiency. This work aims to provide a concise yet comprehensive reference for researchers and practitioners, promoting further development and deployment of Mamba-based architectures across diverse real-world scenarios.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104094"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145813863","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01. Epub Date: 2026-01-07. DOI: 10.1016/j.inffus.2026.104127
Menglin Yu , Shuxia Lu , Jiacheng Cong
Graph neural networks (GNNs) perform exceptionally well in node classification, but they face severe challenges when the node classes are imbalanced. On the one hand, the model is prone to overfitting because minority classes have few samples. The GNN message passing mechanism amplifies this problem, causing the model to overfit specific features and local neighborhood structures of minority-class nodes rather than learning general patterns, resulting in poor generalization. On the other hand, the scarcity of samples leads to high variance during training: model performance depends heavily on the specific training samples and local graph structures and is extremely sensitive to data partitioning, ultimately resulting in severe performance fluctuations and unstable results. In this work, to address minority-class overfitting and high model variance in imbalanced scenarios, we propose a Similarity-Guided Dual-Graph Learning Framework (SG-DGLF). To counter minority-class overfitting, the framework introduces a similarity-based dynamic-threshold random capture mechanism that supplements minority-class samples by generating pseudo labels. Second, we leverage graph diffusion-based propagation and a random edge dropping strategy to create new graphs, thereby increasing node diversity and alleviating excessive model variance. Empirically, SG-DGLF significantly outperforms advanced baseline methods on multiple imbalanced datasets, validating its effectiveness in mitigating minority-class overfitting and high model variance.
{"title":"SG-DGLF: A similarity-guided dual-graph learning framework","authors":"Menglin Yu , Shuxia Lu , Jiacheng Cong","doi":"10.1016/j.inffus.2026.104127","DOIUrl":"10.1016/j.inffus.2026.104127","url":null,"abstract":"<div><div>Graph neural networks (GNNs) perform exceptionally well in node classification, but graph neural networks face severe challenges when dealing with imbalanced node classification. On the one hand, the model is prone to overfitting due to the small number of minority class samples. GNN’s message passing mechanism amplifies this problem, causing the model to overfit specific features and local neighborhood structures of minority class nodes rather than learning general patterns, resulting in poor generalization ability. On the other hand, the scarcity of samples leads to high variance in model training. Model performance is highly dependent on specific training samples and local graph structures, and is extremely sensitive to data partitioning, ultimately resulting in severe performance fluctuations and unstable results. In this work, to address the issues of minority class overfitting and high model variance faced by GNNs in imbalanced scenarios, we propose the dual-graph framework, A similarity-Guided Dual-Graph Learning Framework (SG-DGLF). To address the problem of overfitting for minority classes, the framework introduces a dynamic threshold random capture mechanism based on similarity, which supplements minority class samples by generating pseudo labels. Secondly, we leverage graph diffusion-based propagation and random edge dropping strategy to create new graphs, thereby increasing node diversity to alleviate the problem of excessive model variance. Empirically, SG-DGLF significantly outperforms advanced baseline methods on multiple imbalanced datasets. This validates the effectiveness of our framework in mitigating the problems of overfitting minority classes and high model variance.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104127"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145939897","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2026-06-01. Epub Date: 2026-01-11. DOI: 10.1016/j.inffus.2026.104147
Sunxiaohe Li , Dongfang Zhao , Zirui Wang , Hao Zhang , Pang Wu , Zhenfeng Li , Lidong Du , Xianxiang Chen , Hongtao Niu , Xiaopan Li , Jingen Xia , Ting Yang , Peng Wang , Zhen Fang
Current methods for evaluating lung function require substantial patient cooperation and rigorous quality control. In contrast, impulse oscillometry (IOS) is a promising alternative that can measure lung mechanics with minimal patient effort and operational ease. IOS applies pressure oscillations to the airways and analyzes the resulting signals. However, previous studies on IOS have been limited to frequency-domain features derived from its response signals, while neglecting valuable time-domain information. To bridge this gap, we developed a deep learning model that fuses time- and frequency-domain IOS data for lung function evaluation. An internal dataset (2,702 cases) and an external dataset (335 cases) were retrospectively collected for model training and validation. Model performance was first evaluated through ablation studies and then tested across different demographic subgroups. Finally, Grad-CAM was employed to improve model interpretability. Results showed that our model accurately predicted lung function parameters, including FEV1/FVC (mean absolute errors [MAEs] of 3.78 and 4.33 %), FEV1 (MAEs of 0.235 and 0.270 L), and FVC (MAEs of 0.264 and 0.315 L), in internal and external validation sets. The model also demonstrated strong performance in respiratory disease prescreening, achieving AUCs of 0.989 and 0.980 with sensitivities of 73.97 % and 71.47 % for detecting airway obstruction, and AUCs of 0.938 and 0.925 with sensitivities of 76.41 % and 66.24 % for classifying four ventilation patterns across the two sets. By fusing time- and frequency-domain IOS data, this study offers a new strategy for pulmonary function evaluation, facilitating more efficient prescreening for pulmonary diseases.
{"title":"Fusing time- and frequency-domain information for effort-independent lung function evaluation using oscillometry","authors":"Sunxiaohe Li , Dongfang Zhao , Zirui Wang , Hao Zhang , Pang Wu , Zhenfeng Li , Lidong Du , Xianxiang Chen , Hongtao Niu , Xiaopan Li , Jingen Xia , Ting Yang , Peng Wang , Zhen Fang","doi":"10.1016/j.inffus.2026.104147","DOIUrl":"10.1016/j.inffus.2026.104147","url":null,"abstract":"<div><div>Current methods for evaluating lung function require substantial patient cooperation and rigorous quality control. In contrast, impulse oscillometry (IOS) is a promising alternative that can measure lung mechanics with minimal patient effort and operational ease. IOS applies pressure oscillations to the airways and analyzes the resulting signals. However, previous studies on IOS have been limited to frequency-domain features derived from its response signals, while neglecting valuable time-domain information. To bridge this gap, we developed a deep learning model that fuses time- and frequency-domain IOS data for lung function evaluation. An internal dataset (2,702 cases) and an external dataset (335 cases) were retrospectively collected for model training and validation. Model performance was first evaluated through ablation studies and then tested across different demographic subgroups. Finally, Grad-CAM was employed to improve model interpretability. Results showed that our model accurately predicted lung function parameters, including FEV<sub>1</sub>/FVC (mean absolute errors [MAEs] of 3.78 and 4.33 %), FEV<sub>1</sub> (MAEs of 0.235 and 0.270 L), and FVC (MAEs of 0.264 and 0.315 L), in internal and external validation sets. The model also demonstrated strong performance in respiratory disease prescreening, achieving AUCs of 0.989 and 0.980 with sensitivities of 73.97 % and 71.47 % for detecting airway obstruction, and AUCs of 0.938 and 0.925 with sensitivities of 76.41 % and 66.24 % for classifying four ventilation patterns across the two sets. By fusing time- and frequency-domain IOS data, this study offers a new strategy for pulmonary function evaluation, facilitating more efficient prescreening for pulmonary diseases.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104147"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145957303","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
To address the challenge of scarce burn mark samples in power infrastructure inspection, we introduce the Insulator Burn Mark RGB-Point Cloud (IBMR) dataset, the first publicly available benchmark featuring RGB-point clouds with pixel-level annotations for both insulators and burn marks. To tackle the critical issue of severe class imbalance caused by the vast number of background points and the small size of burn marks, we propose a novel two-stage RGB-point cloud segmentation framework. This framework integrates DCCU-Sampling, an innovative downsampling algorithm that effectively suppresses background points while preserving critical target structures, and BB-Backtracking, a geometric recovery method that reconstructs fine-grained burn mark details lost during the downsampling process. Experimental results validate the framework’s effectiveness, achieving 81.21% mIoU with 32 training samples and 68.37% mIoU with only 14 samples. The dataset is publicly available at https://huggingface.co/datasets/Junqiu-Tang/IBMR.
{"title":"Dimensional compensation for small-sample and small-size insulator burn mark via RGB-point cloud fusion in power grid inspection","authors":"Junqiu Tang , Zhikang Yuan , Zixiang Wei , Shuojie Gao , Changyong Shen","doi":"10.1016/j.inffus.2025.104105","DOIUrl":"10.1016/j.inffus.2025.104105","url":null,"abstract":"<div><div>To address the challenge of scarce burn mark samples in power infrastructure inspection, we introduce the Insulator Burn Mark RGB-Point Cloud (IBMR) dataset, the first publicly available benchmark featuring RGB-point clouds with pixel-level annotations for both insulators and burn marks. To tackle the critical issue of severe class imbalance caused by the vast number of background points and the small size of burn marks, we propose a novel two-stage RGB-point cloud segmentation framework. This framework integrates DCCU-Sampling, an innovative downsampling algorithm that effectively suppresses background points while preserving critical structures of the targets, and BB-Backtracking, a geometric recovery method that reconstructs fine-grained burn mark details lost during downsampling process. Experimental results validate the framework’s effectiveness, achieving 81.21% mIoU with 32 training samples and 68.37% mIoU with only 14 samples. The dataset is publicly available at <span><span>https://huggingface.co/datasets/Junqiu-Tang/IBMR</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":50367,"journal":{"name":"Information Fusion","volume":"130 ","pages":"Article 104105"},"PeriodicalIF":15.5,"publicationDate":"2026-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145885637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}