Information Fusion: Latest Articles

Dynamic collaborative learning with heterogeneous knowledge transfer for long-tailed visual recognition
IF 14.7; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-15; DOI: 10.1016/j.inffus.2024.102734
Hao Zhou, Tingjin Luo, Yongming He
Long-tailed visual recognition with deep convolutional neural networks remains a challenging task. Multi-expert models, a mainstream approach, achieve state-of-the-art (SOTA) accuracy on this problem, but uncertainty in network learning and the complexity of fusion inference limit their performance and practicality. To remedy this, we propose a dynamic collaborative learning model with heterogeneous knowledge transfer (DCHKT), in which experts with different expertise collaborate to make predictions. DCHKT consists of two core components: dynamic adaptive weight adjustment and heterogeneous knowledge transfer learning. First, dynamic adaptive weight adjustment shifts the focus of training between the global expert and the domain experts by means of a dynamic adaptive weight. By modulating the trade-off between feature learning and classifier learning, it enhances the discriminative ability of each expert and alleviates the uncertainty of model learning. Second, heterogeneous knowledge transfer learning measures the distribution differences between the fused logits of the multiple experts and the predicted logits of each specialized expert; this passes messages between experts and improves the consistency of the ensemble prediction across training and inference, which promotes their collaboration. Finally, extensive experiments on the public long-tailed datasets CIFAR-LT, ImageNet-LT, Place-LT and iNaturalist2018 demonstrate the effectiveness and superiority of DCHKT.
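As a rough illustration of the knowledge-transfer idea in this abstract, the sketch below compares the fused prediction of several experts with each expert's own prediction. The abstract does not name the divergence or the fusion rule, so the KL divergence, the plain averaging of logits, and the function names (softmax, kl_divergence, consistency_loss) are assumptions made for illustration, not the authors' implementation.

```python
import math


def softmax(logits):
    """Turn a list of logits into probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(expert_logits):
    """Mean KL divergence between the fused prediction and each expert.

    Assumption: the fused logits are the plain average of the expert logits.
    """
    n = len(expert_logits)
    num_classes = len(expert_logits[0])
    fused = [sum(e[c] for e in expert_logits) / n for c in range(num_classes)]
    fused_p = softmax(fused)
    return sum(kl_divergence(fused_p, softmax(e)) for e in expert_logits) / n


# Two experts that disagree give a positive loss; identical experts give zero.
disagreeing = consistency_loss([[2.0, 0.0, 0.0], [0.0, 2.0, 0.0]])
agreeing = consistency_loss([[1.0, 0.5, -1.0], [1.0, 0.5, -1.0]])
print(disagreeing > agreeing)
```

Minimizing a term of this kind pulls every expert toward the ensemble prediction, which is one plausible reading of the "consistency of ensemble prediction" mentioned in the abstract.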
Information Fusion, Volume 115, Article 102734
Citations: 0
A multi-source domain feature-decision dual fusion adversarial transfer network for cross-domain anti-noise mechanical fault diagnosis in sustainable city
IF 14.7; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-15; DOI: 10.1016/j.inffus.2024.102739
Changdong Wang, Huamin Jie, Jingli Yang, Tianyu Gao, Zhenyu Zhao, Yongqi Chang, Kye Yak See
Rotating machinery forms the backbone of infrastructure in a sustainable city, and bearings are key mechanical transmission components. The health of these bearings therefore directly affects the safe operation of the infrastructure. Accurate and reliable diagnosis of bearing defects minimizes downtime, reduces maintenance costs, and prevents major accidents, ultimately providing insights into the construction and management of a sustainable city. In real industrial scenarios, varying working conditions and different machine types can cause significant discrepancies in the distribution of sample data, and non-negligible noise may further degrade diagnostic performance. Accurate and reliable bearing diagnosis across domains and under noise therefore remains a challenge. Leveraging information fusion and multi-source domain transfer learning, this article proposes a multi-source domain feature-decision dual fusion adversarial transfer network (DFATN) to overcome these limitations. First, an adversarial transfer framework is developed that incorporates novel feature matching evaluation and joint distribution difference losses. The framework is designed to facilitate the learning of features that are invariant across domains and to enhance the sharing of domain-specific knowledge, even under noise. Relying on channel-spatial interactive feature fusion, a multi-scale feature extractor (MFE) is constructed to share the interaction and enhance the modeling of complex features in multiple dimensions. In addition, a fault state-related decision fusion mechanism (SDF) integrates diagnostic information, significantly enhancing the generalization performance and robustness of the proposed network. The effectiveness and superiority of DFATN for bearing fault diagnosis are validated on the public Paderborn University (PU) dataset and a laboratory-collected (Lab) dataset. For cross-working-condition tasks, the method achieves average accuracies of 96.52% and 98.76% on the PU and Lab datasets, respectively. For cross-machine tasks, the average accuracy is 83.36%, outperforming other recent cross-domain fault diagnosis techniques.
Information Fusion, Volume 115, Article 102739
Citations: 0
Adversarial robust image processing in medical digital twin
IF 14.7; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-11; DOI: 10.1016/j.inffus.2024.102728
Samaneh Shamshiri, Huaping Liu, Insoo Sohn
Recent advances in technologies such as Artificial Intelligence (AI), the Internet of Things (IoT), and cloud computing have led to the emergence of digital twins (DTs). A digital twin is a virtual replica of a physical entity, with data connections between the two. The technology has proven highly effective in several industries by improving decision-making and operational efficiency. In critical areas such as healthcare, digital twins are increasingly used to address the limitations of conventional approaches by creating virtual simulations of hospitals, medical equipment, patients, or even individual organs. These medical digital twins (MDTs) are changing the healthcare industry by offering advanced solutions that improve treatment outcomes and overall patient care. However, these systems pose challenges because of the security and other critical issues involved. Despite their achievements, the numerous security threats make it crucial to address the security challenges of digital twin technology. Given the lack of research on attacks targeting MDT functionalities, we concentrated on a specific cyber threat: adversarial attacks. Adversarial attacks manipulate the input data with small, carefully crafted perturbations in order to degrade the model’s performance. To assess the vulnerability of medical digital twins to such attacks, we carried out a proof-of-concept study. Using image processing techniques and an artificial neural network model, we created a digital twin that diagnoses breast cancer from thermography images. We then attacked this digital twin by feeding adversarially perturbed inputs to the trained model. Our results demonstrated the vulnerability of the digital twin model to adversarial attacks. To tackle this problem, we modified the digital twin’s architecture to enhance its robustness against various attacks. We proposed a defense method that fuses wavelet denoising and adversarial training, substantially strengthening the model’s resilience to adversarial attacks. Furthermore, the proposed digital twin is evaluated on a dataset of diabetic foot ulcers. To the best of our knowledge, this is the first defense method that makes a medical digital twin significantly robust against adversarial attacks.
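To make "small, carefully crafted perturbations" concrete, the sketch below applies the fast gradient sign method (FGSM) to a toy classifier. The abstract does not say which attack was used, so FGSM is a well-known technique swapped in for illustration, and the logistic-regression model, its weights, and the function names (predict, fgsm_perturb) are hypothetical stand-ins for the paper's neural network.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Probability of the positive class under a logistic-regression model."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


def fgsm_perturb(weights, bias, x, label, epsilon):
    """Shift every feature by +/- epsilon in the direction that raises the loss.

    For binary cross-entropy with a logistic model, the gradient of the loss
    with respect to the input is (p - label) * w.
    """
    p = predict(weights, bias, x)
    grad = [(p - label) * w for w in weights]
    sign = [(g > 0) - (g < 0) for g in grad]
    return [xi + epsilon * s for xi, s in zip(x, sign)]


# Hypothetical model and input, chosen only for illustration.
weights, bias = [1.0, -2.0, 0.5], 0.0
x, label = [0.5, -0.2, 0.1], 1

x_adv = fgsm_perturb(weights, bias, x, label, epsilon=0.3)
print(predict(weights, bias, x), predict(weights, bias, x_adv))
```

Each feature moves by at most epsilon, yet the predicted probability for the true class drops below 0.5 in this toy case. Adversarial training, one half of the defense named in the abstract, would add such perturbed inputs to the training data.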
Information Fusion, Volume 115, Article 102728
Citations: 0
Human activity recognition using binary sensors: A systematic review
IF 14.7; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-11; DOI: 10.1016/j.inffus.2024.102731
Muhammad Toaha Raza Khan, Enver Ever, Sukru Eraslan, Yeliz Yesilada
Human activity recognition (HAR) is an emerging research field that explores the development of automated systems to identify and categorize human activities using data collected from various sensors. Within HAR, binary sensors offer a distinct approach: they provide simple on/off readings that indicate events such as door openings or light switch activations. Compared with other sensors used for HAR, binary sensors have several advantages, including low cost, low power consumption, ease of installation, and privacy preservation. For instance, they can be used in smart homes to detect when someone enters or leaves a room without user input. This study presents a systematic review of state-of-the-art methods and techniques for HAR using binary sensors. We consider five crucial aspects: data collection methods, preprocessing techniques, feature extraction and fusion strategies, classification algorithms, and evaluation metrics. Furthermore, we identify the gaps and limitations of existing studies and provide directions for future research. This comprehensive and up-to-date review can serve as a valuable reference for researchers and practitioners working on HAR with binary sensors.
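The on/off readings described above are typically turned into fixed-size feature vectors before classification. The sketch below shows one simple way this could be done: counting sensor activations in fixed time windows. The event format (timestamp in seconds, sensor id, state), the window length, and the function name window_features are assumptions for illustration and are not taken from the review.

```python
from collections import Counter


def window_features(events, sensors, window_seconds):
    """Count 'on' events per sensor in consecutive time windows.

    events: list of (timestamp_seconds, sensor_id, state), with state 0 or 1.
    Returns one feature vector per window, ordered like `sensors`.
    """
    if not events:
        return []
    start = min(t for t, _, _ in events)
    end = max(t for t, _, _ in events)
    n_windows = int((end - start) // window_seconds) + 1
    counts = [Counter() for _ in range(n_windows)]
    for t, sensor, state in events:
        if state == 1:  # only activations contribute to the count
            counts[int((t - start) // window_seconds)][sensor] += 1
    return [[c[s] for s in sensors] for c in counts]


# Hypothetical smart-home log: a door sensor and a light switch.
events = [
    (0, "door", 1), (5, "door", 0), (12, "light", 1),
    (61, "door", 1), (70, "light", 1), (75, "light", 0), (80, "light", 1),
]
print(window_features(events, ["door", "light"], window_seconds=60))
# → [[1, 1], [1, 2]]
```

Each row can then be passed to any classifier. Other representations, such as the last sensor fired or the time since the last change, follow the same windowing pattern.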
Information Fusion, Volume 115, Article 102731
Citations: 0
Explainable natural language processing for corporate sustainability analysis
IF 14.7; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-11; DOI: 10.1016/j.inffus.2024.102726
Keane Ong, Rui Mao, Ranjan Satapathy, Ricardo Shirota Filho, Erik Cambria, Johan Sulaeman, Gianmarco Mengaldo
Sustainability commonly refers to entities, such as individuals, companies, and institutions, having a non-detrimental (or even positive) impact on the environment, society, and the economy. As sustainability becomes synonymous with acceptable and legitimate behaviour, it is increasingly demanded and regulated. Several frameworks and standards have been proposed to measure the sustainability impact of corporations, including the United Nations’ Sustainable Development Goals and the recently introduced global sustainability reporting framework, amongst others. However, the concept of corporate sustainability is complex because of the diverse and intricate nature of firm operations (i.e. geography, size, business activities, interlinks with other stakeholders). As a result, corporate sustainability assessments are plagued by subjectivity, both in the data that reflect corporate sustainability efforts (i.e. corporate sustainability disclosures) and among the analysts who evaluate them. This subjectivity can be distilled into distinct challenges: incompleteness, ambiguity, unreliability and sophistication on the data dimension, and limited resources and potential bias on the analyst dimension. Taken together, subjectivity hinders effective cost attribution to entities that do not comply with prevailing sustainability expectations, potentially rendering sustainability efforts and their associated regulations futile. To this end, we argue that Explainable Natural Language Processing (XNLP) can significantly enhance corporate sustainability analysis. Specifically, linguistic understanding algorithms (lexical, semantic, syntactic), integrated with XAI capabilities (interpretability, explainability, faithfulness), can bridge gaps in analyst resources and mitigate subjectivity problems within data.
Information Fusion, Volume 115, Article 102726
Citations: 0
Enhancing few-shot lifelong learning through fusion of cross-domain knowledge
IF 14.7; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2024-10-11; DOI: 10.1016/j.inffus.2024.102730
Yaoyue Zheng, Xuetao Zhang, Zhiqiang Tian, Shaoyi Du
Humans can continually solve new problems from a few examples and enhance their learned knowledge by incorporating new ones. Few-shot lifelong learning (FSLL) has been proposed to mimic this ability. However, existing FSLL methods overlook the significance of cross-domain knowledge, and little effort has been made to investigate it. In this paper, we explore the effects of cross-domain knowledge in FSLL and propose a new framework that enhances the model’s ability by fusing cross-domain knowledge into the learning process. Moreover, we investigate the impact of both debiased and non-debiased models in the FSLL context for the first time. Compared with previous works, our setting presents a unique challenge: the model should continually learn new knowledge from cross-domain few-shot data and update its existing knowledge by fusing new knowledge throughout its lifelong learning process. To address this challenge, the proposed framework focuses on learning and updating while mitigating the well-known issues of forgetting and overfitting. The framework comprises three key components designed for learning cross-domain knowledge: the Debiased Base Learning strategy, Knowledge Acquisition, and Knowledge Update. The superiority of the framework is validated on mini-ImageNet, CIFAR-100, OfficeHome, and Meta-Dataset. Experiments show that the proposed framework performs well in cross-domain situations and also achieves state-of-the-art performance in the non-cross-domain situation.
人类可以通过少量实例不断解决新问题,并通过吸收新的实例来增强已学知识。有人提出了 "少量终生学习"(FSLL)来模仿人类的学习能力。然而,它们忽视了跨领域知识的重要性,而且很少有人对此进行研究。在本文中,我们探讨了跨领域知识在 FSLL 中的作用,并提出了一个新的框架,通过将跨领域知识融合到学习过程中来增强模型的能力。此外,我们还首次在 FSLL 中研究了去偏差模型和非去偏差模型的影响。与之前的研究相比,我们的研究提出了一个独特的挑战:模型应不断从跨领域的少量数据中学习新知识,并在整个终身学习过程中通过融合新知识来更新现有知识。为了应对这一挑战,我们提出的框架侧重于学习和更新,同时解决了众所周知的遗忘和过拟合问题。该框架由三个为学习跨领域知识而设计的关键部分组成:去偏基学习策略、知识获取和知识更新。该框架的优越性在 mini-ImageNet、CIFAR-100、OfficeHome 和 Meta-Dataset 上得到了验证。实验表明,所提出的框架具有在跨领域情况下执行任务的能力,而且在非跨领域情况下也达到了最先进的性能。
Information Fusion, vol. 115, Article 102730, published 2024-10-11.
Citations: 0
Towards facial micro-expression detection and classification using modified multimodal ensemble learning approach
IF 14.7 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-10 DOI: 10.1016/j.inffus.2024.102735
Fuli Zhang, Yu Liu, Xiaoling Yu, Zhichen Wang, Qi Zhang, Jing Wang, Qionghua Zhang
A micro-expression is a fleeting, delicate, and localized facial gesture. It can expose the true feelings that someone is trying to hide and is regarded as a crucial indicator for spotting lies. Because of its possible applications in a variety of sectors, micro-expression research has garnered a lot of attention. The accuracy of micro-expression recognition still needs to be improved, though, because the motions that make up micro-expressions are brief and weak. In recent years, deep convolutional neural methods have shown a higher degree of efficiency on the complex challenge of face detection. Although several attempts have been made at micro-expression recognition (MER), the problem is far from resolved, as shown by the low accuracy rates of other models. This study presents a Facial Micro-Expression Detection and Classification using Modified Multimodal Ensemble Learning (FMEDC-MMEL) approach. The main aim of the FMEDC-MMEL technique is the proficient identification of micro-expressions (MEs) in facial images. As a pre-processing phase, the FMEDC-MMEL technique applies histogram equalization (HE) to improve the contrast of the image. An improved densely connected network (DenseNet) model is then used to learn feature patterns from the pre-processed images. To enhance the proficiency of the improved DenseNet model, stochastic gradient descent (SGD) is used for the hyperparameter selection process. For facial ME detection, the FMEDC-MMEL technique uses an ensemble of three classifiers: bi-directional gated recurrent unit (Bi-GRU), long short-term memory (LSTM), and extreme learning machine (ELM). This tailored ensemble learning approach combines several machine learning models to improve classification performance and detection accuracy. Sophisticated feature extraction methods are used to capture the subtle aspects of micro-expressions, and precision is maintained by optimizations that minimize computing cost. Empirical findings reveal that this methodology notably surpasses conventional techniques, providing enhanced precision and resilience on a variety of complex and demanding datasets. In addition to pushing the boundaries of micro-expression analysis research, the proposed strategy has potential real-world uses in fields including security, psychological testing, and human-computer interaction.
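The histogram equalization pre-processing step named in the abstract above can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `equalize_histogram`, the list-of-rows image layout, and the 8-bit intensity range are assumptions made only for illustration.

```python
def equalize_histogram(image, levels=256):
    """Histogram-equalize a grayscale image stored as rows of ints in [0, levels)."""
    pixels = [p for row in image for p in row]
    if not pixels:
        return [row[:] for row in image]  # empty image: nothing to do

    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    # Cumulative distribution of intensities.
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(pixels)
    cdf_min = next(c for c in cdf if c > 0)  # CDF value at the darkest intensity present
    if total == cdf_min:
        return [row[:] for row in image]  # constant image: nothing to stretch

    # Map each intensity so that the CDF becomes approximately linear.
    scale = (levels - 1) / (total - cdf_min)
    lut = [round(max(c - cdf_min, 0) * scale) for c in cdf]
    return [[lut[p] for p in row] for row in image]


# Hypothetical low-contrast 2x2 patch: all values sit between 50 and 54.
low_contrast = [[50, 50], [52, 54]]
equalized = equalize_histogram(low_contrast)
```

After equalization the darkest pixels map to 0 and the brightest to 255, so the narrow input range is stretched across the full intensity scale.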
Information Fusion, vol. 115, Article 102735, published 2024-10-10.
Citations: 0
A survey of evidential clustering: Definitions, methods, and applications
IF 14.7 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-10 DOI: 10.1016/j.inffus.2024.102736
Zuowei Zhang, Yiru Zhang, Hongpeng Tian, Arnaud Martin, Zhunga Liu, Weiping Ding
In the realm of information fusion, clustering stands out as a common subject and is extensively applied across various fields. Evidential clustering, an increasingly popular method in the soft clustering family, derives its strength from the theory of belief functions, which enables it to effectively characterize the uncertainty and imprecision of data distributions. This survey provides a comprehensive overview of evidential clustering, detailing its theoretical foundations, methodologies, and applications. Specifically, we start by briefly recalling the theory of belief functions with its transformations into other uncertainty reasoning theories. Then, we introduce the concepts of soft data, partitions, and methods with an emphasis on data and partitioning within the theory of belief functions. Subsequently, we summarize the advancements and quantitative evaluations of existing evidential clustering methods and provide a roadmap to help in selecting an appropriate method based on specific application needs. Finally, we identify the major challenges faced in the development and application of evidential clustering, pointing out promising avenues for future research, including theoretical limitations, applicable datasets, and application domains. The survey offers a structured understanding of existing evidential clustering methods, highlighting their theoretical underpinnings, practical implementations, and future research directions. It serves as a valuable resource for researchers seeking to deepen their understanding of evidential clustering.
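To make the belief-function vocabulary in the abstract above concrete, here is a minimal sketch of a mass function for one object over two clusters, together with its belief, its plausibility, and the pignistic transformation into an ordinary probability distribution. The cluster names `c1` and `c2`, the mass values, and the function names are hypothetical; the formulas are the standard definitions from the theory of belief functions, not code from the survey.

```python
def belief(mass, subset):
    """Bel(A): total mass of non-empty focal sets fully contained in A."""
    return sum(m for focal, m in mass.items() if focal and focal <= subset)


def plausibility(mass, subset):
    """Pl(A): total mass of focal sets that intersect A."""
    return sum(m for focal, m in mass.items() if focal & subset)


def pignistic(mass):
    """BetP: share each focal set's mass equally among its elements."""
    empty_mass = mass.get(frozenset(), 0.0)
    betp = {}
    for focal, m in mass.items():
        if not focal:
            continue  # mass on the empty set is normalized away
        share = m / (len(focal) * (1.0 - empty_mass))
        for element in focal:
            betp[element] = betp.get(element, 0.0) + share
    return betp


# Hypothetical mass function of one object in a credal partition:
# 0.6 supports cluster c1, 0.1 supports c2, 0.3 is undecided between them.
mass = {
    frozenset({"c1"}): 0.6,
    frozenset({"c2"}): 0.1,
    frozenset({"c1", "c2"}): 0.3,
}
bel_c1 = belief(mass, frozenset({"c1"}))
pl_c1 = plausibility(mass, frozenset({"c1"}))
betp = pignistic(mass)
```

The gap between `bel_c1` and `pl_c1` is the mass assigned to the set `{c1, c2}`, which is how this representation expresses imprecision about cluster membership rather than forcing a single probability.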
Information Fusion, vol. 115, Article 102736, published 2024-10-10.
Citations: 0
Attention-guided hierarchical fusion U-Net for uncertainty-driven medical image segmentation
IF 14.7 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-09 DOI: 10.1016/j.inffus.2024.102719
Afsana Ahmed Munia, Moloud Abdar, Mehedi Hasan, Mohammad S. Jalali, Biplab Banerjee, Abbas Khosravi, Ibrahim Hossain, Huazhu Fu, Alejandro F. Frangi
Small inaccuracies in the system components or artificial intelligence (AI) models for medical imaging could have significant consequences leading to life hazards. To mitigate those risks, one must consider the precision of the image analysis outcomes (e.g., image segmentation), along with the confidence in the underlying model predictions. U-shaped architectures, based on the convolutional encoder–decoder, have established themselves as a critical component of many AI-enabled diagnostic imaging systems. However, most of the existing methods focus on producing accurate diagnostic predictions without assessing the uncertainty associated with such predictions or the introduced techniques. Uncertainty maps highlight areas in the predicted segmented results, where the model is uncertain or less confident. This could lead radiologists to pay more attention to ensuring patient safety and pave the way for trustworthy AI applications. In this paper, we therefore propose the Attention-guided Hierarchical Fusion U-Net (named AHF-U-Net) for medical image segmentation. We then introduce the uncertainty-aware version of it called UA-AHF-U-Net which provides the uncertainty map alongside the predicted segmentation map. The network is designed by integrating the Encoder Attention Fusion module (EAF) and the Decoder Attention Fusion module (DAF) on the encoder and decoder sides of the U-Net architecture, respectively. The EAF and DAF modules utilize spatial and channel attention to capture relevant spatial information and indicate which channels are appropriate for a given image. Furthermore, an enhanced skip connection is introduced and named the Hierarchical Attention-Enhanced (HAE) skip connection. We evaluated the efficiency of our model by comparing it with eleven well-established methods for three popular medical image segmentation datasets consisting of coarse-grained images with unclear boundaries. 
Based on the quantitative and qualitative results, the proposed method ranks first in two datasets and second in a third. The code can be accessed at: https://github.com/AfsanaAhmedMunia/AHF-Fusion-U-Net.
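The idea of an uncertainty map that accompanies a segmentation map can be sketched as follows. The abstract does not state how UA-AHF-U-Net computes its uncertainty, so this example assumes one common choice, the per-pixel binary predictive entropy of the foreground probability. The function name `entropy_map` and the probability values are hypothetical.

```python
import math


def entropy_map(prob_map):
    """Per-pixel binary entropy (in bits) of foreground probabilities."""

    def binary_entropy(p):
        if p <= 0.0 or p >= 1.0:
            return 0.0  # fully confident prediction carries no uncertainty
        return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))

    return [[binary_entropy(p) for p in row] for row in prob_map]


# Hypothetical 2x2 foreground-probability map from a segmentation model.
probabilities = [[0.5, 1.0], [0.0, 0.9]]
segmentation = [[int(p >= 0.5) for p in row] for row in probabilities]
uncertainty = entropy_map(probabilities)
```

Pixels with probability near 0.5 receive entropy near 1 bit and would be highlighted for review, while confidently labeled pixels receive values near 0.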
Information Fusion, vol. 115, Article 102719, published 2024-10-09.
Citations: 0
Interpretability research of deep learning: A literature survey
IF 14.7 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2024-10-09 DOI: 10.1016/j.inffus.2024.102721
Biao Xu, Guanci Yang
Deep learning (DL) has been widely used in various fields. However, its black-box nature limits people's understanding of and trust in its decision-making process. It is therefore crucial to study DL interpretability, which can elucidate a model's decision-making processes and behaviors. This review provides an overview of the current status of interpretability research. First, typical DL models, principles, and applications are introduced. Then, the definition and significance of interpretability are clarified. Subsequently, typical interpretability algorithms are presented in four groups: active, passive, supplementary, and integrated explanations. After that, several evaluation indicators for interpretability are briefly described, and the relationship between interpretability and model performance is explored. Next, specific applications of some interpretability methods and models in real scenarios are introduced. Finally, the challenges and future development directions of interpretability research are discussed.
Information Fusion, vol. 115, Article 102721, published 2024-10-09.
Citations: 0