
Latest Publications from Information Processing & Management

The joint extraction of fact-condition statement and super relation in scientific text with table filling method
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-10-01. DOI: 10.1016/j.ipm.2024.103906
Qizhi Chen , Hong Yao , Diange Zhou
Fact-condition statements are of great significance in scientific text: they record a natural phenomenon together with its precondition in detail. In previous studies, the extraction of fact-condition statements and their relation (the super relation) from scientific text was designed as a pipeline in which the fact-condition statements and the super relation are extracted successively, which leads to error propagation and lowers accuracy. To solve this problem, the table filling method is adopted for the first time for the joint extraction of fact-condition statements and super relations, and the Biaffine Convolution Neural Network (BCNN) model is proposed to complete the task. In the BCNN, a pretrained language model and a biaffine neural network serve as the encoder, while a convolutional neural network is added as the decoder to enhance local semantic information. Benefiting from this local semantic enhancement, the BCNN achieves the best F1 score across different pretrained language models in comparison with other baselines. Its F1 scores on GeothCF (geological text) reach 73.17% and 71.04% with BERT and SciBERT as the pretrained language model, respectively. Moreover, the local semantic enhancement also increases training efficiency, as the tag distribution can be more easily learned by the model. Besides, the BCNN trained on GeothCF also exhibits the best performance on BioCF (biomedical text), indicating that it can be widely applied to information extraction across scientific domains. Finally, a geological fact-condition knowledge graph is built with the BCNN, demonstrating a new pipeline for constructing scientific fact-condition knowledge graphs.
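The core of a biaffine table-filling extractor is a scorer that assigns each (token i, token j) pair a score for every tag, producing the table the decoder refines. A minimal NumPy sketch of that pair-scoring idea follows; the dimension names and tag inventory are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

def biaffine_table_scores(H, U, W, b):
    """Score every token pair (i, j) for each table-filling tag.

    H: (n, d) token representations from a pretrained encoder.
    U: (t, d, d) bilinear weights, W: (t, 2d) linear weights, b: (t,) bias,
    where t is the number of tags.  Returns an (n, n, t) score table, the
    structure a CNN decoder would then refine with local context.
    """
    n, d = H.shape
    bilinear = np.einsum('id,tde,je->ijt', H, U, H)      # h_i^T U_t h_j
    pair = np.concatenate(
        [np.repeat(H[:, None, :], n, axis=1),            # h_i broadcast over j
         np.repeat(H[None, :, :], n, axis=0)], axis=-1)  # h_j broadcast over i
    linear = pair @ W.T                                  # (n, n, t)
    return bilinear + linear + b

rng = np.random.default_rng(0)
n, d, t = 5, 8, 4
table = biaffine_table_scores(rng.normal(size=(n, d)),
                              rng.normal(size=(t, d, d)),
                              rng.normal(size=(t, 2 * d)),
                              rng.normal(size=t))
print(table.shape)  # (5, 5, 4)
```

In a trained model, an argmax over the tag axis of this table yields the joint span-and-relation decisions in one pass, which is what removes the pipeline's error propagation.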
Citations: 0
Crowdsourced auction-based framework for time-critical and budget-constrained last mile delivery
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-30. DOI: 10.1016/j.ipm.2024.103888
Esraa Odeh , Shakti Singh , Rabeb Mizouni , Hadi Otrok
This work addresses the problem of Last Mile Delivery (LMD) in time-critical and budget-constrained environments. Given the rapid growth of e-commerce worldwide, LMD has become a primary bottleneck for the efficiency of delivery services due to several factors, including travelling distance, service cost, and delivery time. Existing works mainly target optimizing travelled distance and maximizing gained profit; however, they do not consider time-critical and budget-limited tasks. The deployment of UAVs and the development of crowdsourcing platforms have provided a range of solutions for advancing LMD frameworks, as they offer many crowdworkers at varying locations ready to perform tasks instead of a single point of departure. This work proposes a Hybrid, Crowdsourced, Auction-based LMD (HCA-LMD) framework with a dynamic allocation mechanism for optimized delivery of time-sensitive and budget-limited tasks. The proposed framework allocates tasks to workers as soon as they are submitted, given their urgency level and dropoff location, while considering the price, rating, and location of available workers. This work was compared against two benchmarks to assess the framework's performance in dynamic environments in terms of on-time deliveries, average delay, and profit. Extensive simulation results showed outstanding performance of the proposed LMD framework: it accomplished almost 92% on-time deliveries under varying time- and budget-constrained scenarios, outperformed the first benchmark in on-time allocation rate by fulfilling an additional 24% of the tasks that benchmark failed, and, compared against the second benchmark, cut average delay time by around 50% while gaining up to 5.8× the profit.
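The allocate-on-submission idea can be sketched as a feasibility check (price within budget, travel time within deadline) followed by picking the best remaining bid. The scoring rule below is a hypothetical stand-in; the real HCA-LMD auction mechanism is more elaborate.

```python
import math

def allocate(task, workers):
    """Assign a newly submitted delivery task to the best available worker.

    A worker is feasible if its price fits the task budget and its travel
    time meets the deadline; among feasible workers, the bid with the best
    rating per (price x distance) wins.  Illustrative scoring only.
    """
    best, best_score = None, -math.inf
    for w in workers:
        dist = math.dist(w["loc"], task["dropoff"])
        eta = dist / w["speed"]
        if w["price"] > task["budget"] or eta > task["deadline"]:
            continue  # infeasible bid: over budget or too slow
        score = w["rating"] / (w["price"] * (1.0 + dist))
        if score > best_score:
            best, best_score = w, score
    return best

workers = [
    {"id": "w1", "loc": (0, 0), "speed": 1.0, "price": 5.0, "rating": 4.5},
    {"id": "w2", "loc": (3, 4), "speed": 2.0, "price": 3.0, "rating": 4.0},
]
task = {"dropoff": (3, 3), "budget": 4.0, "deadline": 2.0}
print(allocate(task, workers)["id"])  # w2: w1 is over budget and too far
```

Allocating immediately per submitted task (rather than batching) is what makes the mechanism responsive to urgency, at the cost of forgoing globally optimal batch assignments.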
Citations: 0
An interpretable polytomous cognitive diagnosis framework for predicting examinee performance
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-29. DOI: 10.1016/j.ipm.2024.103913
Xiaoyu Li , Shaoyang Guo , Jin Wu , Chanjin Zheng
As a fundamental task of intelligent education, deep learning-based cognitive diagnostic models (CDMs) have been introduced to effectively model dichotomous testing data. However, modeling polytomous data within the deep-learning framework remains a challenge. This paper proposes a novel Polytomous Cognitive Diagnosis Framework (PCDF), which employs Cumulative Category Response Function (CCRF) theory to partition and consolidate data, thereby enabling existing cognitive diagnostic models to seamlessly analyze graded response data. Combining the proposed PCDF with IRT, MIRT, NCDM, KaNCD, and ICDM, extensive experiments were complemented by data re-encoding techniques on four real-world graded scoring datasets, along with baseline methods such as linear-split, one-vs-all, and random. The results suggest that, when combined with existing CDMs, PCDF outperforms the baseline models in terms of prediction. Additionally, we showcase the interpretability of examinee ability and item parameters through the utilization of PCDF.
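The cumulative-category recoding that lets a dichotomous CDM consume graded responses can be illustrated with a few lines: a response in category k becomes one binary outcome per threshold. This is one plausible reading of the CCRF partitioning step, not necessarily the paper's exact recoding.

```python
def cumulative_split(score, max_score):
    """Recode one graded response into dichotomous pseudo-responses.

    Under a cumulative-category scheme, a response in category `score`
    (0..max_score) becomes one binary outcome per threshold t = 1..max_score:
    1 if the examinee reached category t, else 0.  Existing dichotomous
    CDMs can then be trained on the recoded data.
    """
    return [1 if score >= t else 0 for t in range(1, max_score + 1)]

# A partial-credit response of 2 on a 0-3 item:
print(cumulative_split(2, 3))  # [1, 1, 0]
```

Consolidation runs the other way: the per-threshold probabilities predicted by the dichotomous model are differenced to recover a probability for each graded category.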
Citations: 0
A theoretical framework for human-centered intelligent information services: A systematic review
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-29. DOI: 10.1016/j.ipm.2024.103891
Qiao Li, Yuelin Li, Shuhan Zhang, Xin Zhou, Zhengyuan Pan
Intelligent Information Services (IIS) employ Artificial Intelligence (AI)-based systems to provide information that matches the user's needs in diverse and evolving environments. Acknowledging the importance of users to the success of AI-empowered IIS, a growing number of researchers are investigating AI-empowered IIS from a user-centric perspective, establishing the foundation for a new research domain called "Human-Centered Intelligent Information Services" (HCIIS). Nonetheless, a review of user studies in AI-empowered IIS is still lacking, impeding the development of a clear definition and research framework for the HCIIS field. To fill this gap, this study conducts a systematic review of 116 user studies in AI-empowered IIS. Results reveal two primary research themes: human-IIS interaction (including user experience, system quality, user attitude, intention and behavior, information quality, and individual task performance) and IIS ethics (e.g., explainability and interpretability, privacy and safety, and inclusivity). Analyzing research gaps within these topics, this study formulates an HCIIS research framework consisting of three interconnected elements: human values and needs, environment, and service. The interconnections between each pair of elements identify three key research domains in HCIIS: interaction, ethics, and evolution. Interaction pertains to facilitating human-IIS interaction to meet human needs, encompassing human-centered theory, evaluation, and the design of AI-empowered IIS interaction. Ethics emphasizes ensuring that AI-empowered IIS align with human values and norms within specific environments, covering general and context-specific AI-empowered IIS ethical principles, risk assessment, and governance strategies. Evolution focuses on fulfilling human needs in diverse and dynamic environments through continually evolving intelligence, involving the enhancement of AI-empowered IIS environmental sensitivity and adaptability within an intelligent ecosystem driven by technology integration. Central to HCIIS is co-creation, situated at the intersection of interaction, evolution, and ethics, emphasizing collaborative information creation between IIS and humans through hybrid intelligence. In conclusion, HCIIS is defined as a field centered on information co-creation between IIS and humans, distinguishing it from IIS, which focuses on providing information to humans.
Citations: 0
DualFLAT: Dual Flat-Lattice Transformer for domain-specific Chinese named entity recognition
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-28. DOI: 10.1016/j.ipm.2024.103902
Yinlong Xiao , Zongcheng Ji , Jianqiang Li , Qing Zhu
Recently, lexicon-enhanced methods for Chinese Named Entity Recognition (NER) have achieved great success, but they require a high-quality lexicon. However, for domain-specific Chinese NER, obtaining such a high-quality lexicon is challenging due to the distribution gap between the general lexicon and domain-specific data, and the high cost of constructing a domain lexicon. To address these challenges, we introduce dual-source lexicons (i.e., a general lexicon and a domain lexicon) to acquire enriched lexical knowledge. Considering that the general lexicon often contains more noise than its domain counterpart, we further propose a dual-stream model, the Dual Flat-LAttice Transformer (DualFLAT), designed to mitigate the impact of noise originating from the general lexicon while comprehensively harnessing the knowledge contained in the dual-source lexicons. Experimental results on three public domain-specific Chinese NER datasets (i.e., News, Novel, and E-commerce) demonstrate that our method consistently outperforms single-source lexicon-enhanced approaches, achieving state-of-the-art results. Specifically, our proposed DualFLAT model consistently outperforms the baseline FLAT, with increases of up to 1.52%, 4.84%, and 1.34% in F1 score on the News, Novel, and E-commerce datasets, respectively.
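The dual-source idea starts with lexicon matching: every word from either lexicon that matches a span of the character sequence becomes a lattice node, tagged with its source so a dual-stream encoder can weigh noisy general-lexicon matches differently. A simplistic sketch of that matching step (not the DualFLAT architecture itself) follows; the toy alphabet stands in for Chinese characters.

```python
def build_lattice(chars, general_lex, domain_lex):
    """Collect lexicon-matched spans over a character sequence.

    Each match becomes a flat-lattice node (word, head, tail, source).
    Keeping the source lets a dual-stream model treat general-lexicon
    matches (noisier) and domain-lexicon matches (cleaner) separately.
    """
    nodes = [(c, i, i, "char") for i, c in enumerate(chars)]
    for src, lex in (("general", general_lex), ("domain", domain_lex)):
        for i in range(len(chars)):
            for j in range(i + 1, len(chars) + 1):
                word = "".join(chars[i:j])
                if len(word) > 1 and word in lex:
                    nodes.append((word, i, j - 1, src))
    return nodes

nodes = build_lattice(list("abcd"), {"ab"}, {"bcd"})
print([n for n in nodes if n[3] != "char"])
# [('ab', 0, 1, 'general'), ('bcd', 1, 3, 'domain')]
```

In FLAT-style models, the head/tail positions feed relative-position encodings so the Transformer can attend over characters and matched words in one flat sequence.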
Citations: 0
Gauging, enriching and applying geography knowledge in Pre-trained Language Models
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-27. DOI: 10.1016/j.ipm.2024.103892
Nitin Ramrakhiyani , Vasudeva Varma , Girish Keshav Palshikar , Sachin Pawar
To employ Pre-trained Language Models (PLMs) as knowledge containers in niche domains, it is important to gauge these PLMs' knowledge of facts in those domains. Knowing how much enrichment effort is required to improve them is also an important prerequisite. As part of this work, we aim to gauge and enrich small PLMs' knowledge of world geography. Firstly, we develop a moderately sized dataset of masked sentences covering 24 different fact types about world geography to estimate PLMs' knowledge of these facts. We hypothesize that smaller PLMs may not be well equipped for this niche domain. Secondly, we enrich PLMs with this knowledge through fine-tuning and check whether the knowledge in the dataset is sufficiently infused. We further hypothesize that linguistic variability in the manual templates used to embed the knowledge in masked sentences does not affect the knowledge infusion. Finally, we demonstrate the application of PLMs to tourism blog search and Wikidata KB augmentation. In both applications, we aim to show the effectiveness of using PLMs to achieve competitive performance.
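Masked-sentence probing of this kind amounts to filling cloze templates per fact type and scoring the PLM's predictions against gold answers. A minimal harness is sketched below with a stub predictor in place of a real fill-mask model; the template and fact schema are illustrative, not the paper's dataset format.

```python
def probe_accuracy(templates, facts, predict_mask):
    """Estimate a PLM's knowledge of facts via masked cloze templates.

    `templates` maps a fact type to a pattern with a [MASK] slot;
    `facts` is a list of (fact_type, subject, gold_answer);
    `predict_mask` is any callable that fills the [MASK] slot
    (e.g. a HuggingFace fill-mask pipeline would slot in here).
    """
    correct = 0
    for fact_type, subject, gold in facts:
        sentence = templates[fact_type].format(subject=subject)
        if predict_mask(sentence).strip().lower() == gold.lower():
            correct += 1
    return correct / len(facts)

templates = {"capital": "The capital of {subject} is [MASK]."}
facts = [("capital", "France", "Paris"), ("capital", "Japan", "Kyoto")]

# Stub predictor standing in for a real PLM (always answers from a lookup):
lookup = {"France": "Paris", "Japan": "Tokyo"}
predict = lambda s: next(v for k, v in lookup.items() if k in s)
print(probe_accuracy(templates, facts, predict))  # 0.5
```

Running the same facts through several paraphrased templates per fact type is how one would test the paper's hypothesis that template wording does not change the measured knowledge.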
Citations: 0
DST: Continual event prediction by decomposing and synergizing the task commonality and specificity
IF 7.4, CAS Tier 1 (Management), Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS. Pub Date: 2024-09-26. DOI: 10.1016/j.ipm.2024.103899
Yuxin Zhang , Songlin Zhai , Yongrui Chen , Shenyu Zhang , Sheng Bi , Yuan Meng , Guilin Qi
Event prediction aims to forecast future events by analyzing the inherent development patterns of historical events. A desirable event prediction system should learn new event knowledge and adapt to new domains or tasks that arise in real-world application scenarios. However, continuous training can lead to catastrophic forgetting in the model. While existing continual learning methods can retain characteristic knowledge from previous domains, they ignore potential shared knowledge in subsequent tasks. To tackle these challenges, we propose a novel event prediction method based on graph structural commonality and domain characteristic prompts, which not only avoids forgetting but also facilitates bi-directional knowledge transfer across domains. Specifically, we mitigate model forgetting by designing domain characteristic-oriented prompts in a continuous task stream while keeping the backbone pre-trained model frozen. Building upon this, we further devise a commonality-based adaptive updating algorithm that harnesses a unique structural commonality prompt to elicit implicit common features across domains. Our experimental results on two public benchmark datasets for event prediction demonstrate the effectiveness of the proposed continual learning event prediction method compared to state-of-the-art baselines. In tests conducted on the IED-Stream, DST's ET-TA metric improved significantly, by 5.6% over the current best baseline model, while the ET-MD metric, which reveals forgetting, decreased by 5.8%.
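The frozen-backbone-plus-prompts recipe can be sketched as a small prompt pool: one shared commonality prompt and one characteristic prompt allocated per domain, with only these small vectors trained as tasks arrive. Dimensions, initialization, and the combination rule below are illustrative assumptions, not DST's exact design.

```python
import numpy as np

class PromptPool:
    """Frozen-backbone continual learner's trainable state.

    A shared `commonality` prompt captures cross-domain structure; a
    per-domain `characteristic` prompt is created as each new task
    arrives.  Only these vectors would receive gradients, leaving the
    backbone untouched, which is what prevents catastrophic forgetting.
    """
    def __init__(self, dim):
        self.commonality = np.zeros(dim)   # shared, adaptively updated
        self.characteristic = {}           # one prompt per seen domain

    def prompts_for(self, domain):
        # Lazily allocate a fresh characteristic prompt for a new domain.
        if domain not in self.characteristic:
            self.characteristic[domain] = np.zeros_like(self.commonality)
        # Prepend both prompts to the (frozen) backbone's input sequence.
        return np.stack([self.commonality, self.characteristic[domain]])

pool = PromptPool(dim=4)
p = pool.prompts_for("ied-stream")
print(p.shape)  # (2, 4)
```

Since each domain's characteristic prompt is untouched by later tasks, earlier domains can be replayed exactly, while updates to the shared commonality prompt carry knowledge in both directions across tasks.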
{"title":"DST: Continual event prediction by decomposing and synergizing the task commonality and specificity","authors":"Yuxin Zhang ,&nbsp;Songlin Zhai ,&nbsp;Yongrui Chen ,&nbsp;Shenyu Zhang ,&nbsp;Sheng Bi ,&nbsp;Yuan Meng ,&nbsp;Guilin Qi","doi":"10.1016/j.ipm.2024.103899","DOIUrl":"10.1016/j.ipm.2024.103899","url":null,"abstract":"<div><div>Event prediction aims to forecast future events by analyzing the inherent development patterns of historical events. A desirable event prediction system should learn new event knowledge, and adapt to new domains or tasks that arise in real-world application scenarios. However, continuous training can lead to catastrophic forgetting of the model. While existing continuous learning methods can retain characteristic knowledge from previous domains, they ignore potential shared knowledge in subsequent tasks. To tackle these challenges, we propose a novel event prediction method based on graph structural commonality and domain characteristic prompts, which not only avoids forgetting but also facilitates bi-directional knowledge transfer across domains. Specifically, we mitigate model forgetting by designing domain characteristic-oriented prompts in a continuous task stream with frozen the backbone pre-trained model. Building upon this, we further devise a commonality-based adaptive updating algorithm by harnessing a unique structural commonality prompt to inspire implicit common features across domains. Our experimental results on two public benchmark datasets for event prediction demonstrate the effectiveness of our proposed continuous learning event prediction method compared to state-of-the-art baselines. 
In tests conducted on the IED-Stream, DST’s ET-TA metric significantly improved by 5.6% over the current best baseline model, while the ET-MD metric, which reveals forgetting, decreased by 5.8%.</div></div>","PeriodicalId":50365,"journal":{"name":"Information Processing & Management","volume":"62 1","pages":"Article 103899"},"PeriodicalIF":7.4,"publicationDate":"2024-09-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142323907","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"管理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
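The frozen-backbone, per-domain-prompt recipe described in the abstract can be sketched as follows. This is a minimal, illustrative analogue, not the paper's DST implementation: the class `PromptedPredictor`, the method `add_domain`, and all sizes are invented for this sketch, and a single linear layer stands in for the pre-trained model.

```python
import torch
import torch.nn as nn

class PromptedPredictor(nn.Module):
    """Toy continual learner: a frozen backbone plus one learnable
    prompt vector per domain (all names and sizes are illustrative)."""

    def __init__(self, dim=16, n_classes=4):
        super().__init__()
        self.backbone = nn.Linear(dim, n_classes)  # stands in for the pre-trained model
        for p in self.backbone.parameters():       # freeze: only prompts are trained
            p.requires_grad = False
        self.prompts = nn.ParameterDict()          # one prompt per domain

    def add_domain(self, name, dim=16):
        self.prompts[name] = nn.Parameter(torch.zeros(dim))

    def forward(self, x, domain):
        # The domain prompt is added to the input representation; the real
        # method prepends prompt tokens, this is just a minimal analogue.
        return self.backbone(x + self.prompts[domain])

model = PromptedPredictor()
model.add_domain("IED")
trainable = [n for n, p in model.named_parameters() if p.requires_grad]
print(trainable)  # only the per-domain prompt receives gradient updates
```

Because the backbone is frozen, training on a new domain can only move that domain's prompt, which is what prevents earlier domains from being overwritten.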
An adaptive confidence-based data revision framework for Document-level Relation Extraction
IF 7.4 Tier 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2024-09-26 DOI: 10.1016/j.ipm.2024.103909
Chao Jiang, Jinzhi Liao, Xiang Zhao, Daojian Zeng, Jianhua Dai
Noisy annotations have become a key issue limiting Document-level Relation Extraction (DocRE). Previous research explored the problem through manual re-annotation. However, the handcrafted strategy is inefficient, incurs high human costs, and cannot be generalized to large-scale datasets. To address the problem, we construct a confidence-based Revision framework for DocRE (ReD), aiming to achieve high-quality automatic data revision. Specifically, we first introduce a denoising training module to recognize relational facts and prevent noisy annotations. Second, a confidence-based data revision module performs adaptive data revision for long-tail distributed relational facts. After the data revision, we design an iterative training module to create a virtuous cycle, which transforms the revised data into useful training data to support further revision. By capitalizing on ReD, we propose ReD-DocRED, which consists of 101,873 revised annotated documents from DocRED. ReD-DocRED introduces 57.1% new relational facts, and models trained on ReD-DocRED achieve significant improvements in F1 scores, ranging from 6.35 to 16.55. The experimental results demonstrate that ReD can achieve high-quality data revision and, to some extent, replace manual labeling.
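Confidence-based label revision of the kind the abstract describes can be sketched as below. This is a hedged illustration, not the paper's exact ReD algorithm: the function `revise_labels` and its thresholding scheme are invented for this sketch. The idea shown is that a noisy label is flipped to the model's prediction only when the predicted probability clears a per-class threshold, and that the threshold is relaxed for rare (long-tail) classes so their facts can still be revised.

```python
import numpy as np

def revise_labels(probs, labels, base_tau=0.9, counts=None):
    """Illustrative confidence-based revision: flip a label to the model's
    prediction when its confidence exceeds a class-frequency-aware threshold."""
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels).copy()
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    if counts is None:  # class frequencies observed in the data
        counts = np.bincount(labels, minlength=probs.shape[1])
    # Rarer classes get a lower threshold, down to a floor of 0.5.
    tau = base_tau * counts / max(counts.max(), 1)
    tau = np.clip(tau, 0.5, base_tau)
    revise = (preds != labels) & (conf >= tau[preds])
    labels[revise] = preds[revise]
    return labels, revise

# Three instances, two classes; the 1st and 3rd labels disagree with
# confident predictions and get revised, the 2nd is left alone.
probs = np.array([[0.05, 0.95], [0.6, 0.4], [0.97, 0.03]])
noisy = np.array([0, 0, 1])
revised, mask = revise_labels(probs, noisy)
```

An iterative loop in the spirit of the paper would retrain on `revised` and call `revise_labels` again with the refreshed probabilities.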
Information Processing & Management, Vol. 62, Issue 1, Article 103909.
Mitigating the negative impact of over-association for conversational query production
IF 7.4 Tier 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2024-09-26 DOI: 10.1016/j.ipm.2024.103907
Ante Wang, Linfeng Song, Zijun Min, Ge Xu, Xiaoli Wang, Junfeng Yao, Jinsong Su
Conversational query generation aims at producing search queries from dialogue histories, which are then used to retrieve relevant knowledge from a search engine to help knowledge-based dialogue systems. Trained to maximize the likelihood of gold queries, previous models suffer from the data-hunger issue, and they tend both to drop important concepts from dialogue histories and to generate irrelevant concepts at inference time. We attribute these issues to the over-association phenomenon, where a large number of gold queries are only indirectly related to the dialogue topics, because annotators may unconsciously perform reasoning with their background knowledge when generating these gold queries. We carefully analyze the negative effects of this phenomenon on pretrained Seq2seq query producers and then propose effective instance-level weighting strategies for training to mitigate these issues from multiple perspectives. Experiments on two benchmarks, Wizard-of-Internet and DuSinc, show that our strategies effectively alleviate the negative effects and lead to significant performance gains (2% to 5% across automatic metrics and human evaluation). Further analysis shows that our model selects better concepts from dialogue histories and is 10 times more data efficient than the baseline.
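Instance-level weighting of the kind the abstract proposes amounts to scaling each example's training loss. The sketch below is a generic, hedged illustration of that mechanism, not the paper's specific weighting strategies: the function `weighted_nll` and the example weights are invented here, with a lower weight standing in for a gold query that is only loosely grounded in the dialogue history (an over-association case).

```python
import numpy as np

def weighted_nll(log_probs, weights):
    """Instance-weighted negative log-likelihood: each example's NLL is
    scaled by its weight, then normalized by the total weight."""
    log_probs = np.asarray(log_probs, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return float(-(weights * log_probs).sum() / weights.sum())

# Two gold queries: the second is weakly grounded in the dialogue
# history, so it contributes less to the training signal.
loss = weighted_nll(log_probs=[-0.2, -2.0], weights=[1.0, 0.3])
```

With uniform weights this reduces to the usual maximum-likelihood objective; down-weighting suspicious instances is what keeps the model from imitating over-associated queries.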
Information Processing & Management, Vol. 62, Issue 1, Article 103907.
A comprehensive study on fidelity metrics for XAI
IF 7.4 Tier 1 (Management) Q1 COMPUTER SCIENCE, INFORMATION SYSTEMS Pub Date : 2024-09-26 DOI: 10.1016/j.ipm.2024.103900
Miquel Miró-Nicolau, Antoni Jaume-i-Capó, Gabriel Moyà-Alcover
The use of eXplainable Artificial Intelligence (XAI) systems has introduced a set of challenges that need resolution. Herein, we focus on how to correctly select an XAI method, an open question within the field. The inherent difficulty of this task is due to the lack of a ground truth. Several authors have proposed metrics to approximate the fidelity of different XAI methods, but these metrics lack verification and show concerning disagreements. In this study, we proposed a novel methodology to verify fidelity metrics using transparent models, which allowed us to obtain explanations with perfect fidelity. Our proposal constitutes the first objective benchmark for these metrics, facilitating a comparison of existing proposals and surpassing existing methods. We applied our benchmark to assess the existing fidelity metrics in two different experiments, each using public datasets comprising 52,000 images. The images in these datasets were synthetic, 128 by 128 pixels in size, which simplified the training process. We identified that two fidelity metrics, Faithfulness Estimate and Faithfulness Correlation, obtained the expected perfect results for linear models, showing their ability to approximate fidelity for this kind of method. However, when presented with non-linear models, such as those most used in the state of the art, all metric values indicated a lack of fidelity, with the best metric deviating by 30% from the values expected for a perfect explanation. Our experimentation led us to conclude that the current fidelity metrics are not reliable enough to be used in real scenarios. From this finding, we deemed it necessary to develop new metrics that avoid the detected problems, and we recommend the usage of our proposal as a benchmark within the scientific community to address these limitations.
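A perturbation-based fidelity score in the spirit of Faithfulness Correlation can be sketched as below. This is a simplified illustration under stated assumptions, not the exact metric the study evaluates: the function `faithfulness_correlation`, the subset size, and the number of trials are all choices made for this sketch. It perturbs random feature subsets and correlates the summed attribution of the perturbed features with the resulting drop in model output; for a transparent linear model with exact attributions `w * x`, the two quantities coincide, which is exactly the "perfect fidelity" case the benchmark exploits.

```python
import numpy as np

def faithfulness_correlation(model, x, attributions, baseline=0.0, rng=None):
    """Illustrative Faithfulness-Correlation-style score: Pearson correlation
    between summed attributions of perturbed features and output drops."""
    rng = np.random.default_rng(rng)
    deltas, attr_sums = [], []
    for _ in range(50):
        idx = rng.choice(len(x), size=max(1, len(x) // 4), replace=False)
        x_pert = x.copy()
        x_pert[idx] = baseline          # replace the chosen features
        deltas.append(model(x) - model(x_pert))
        attr_sums.append(attributions[idx].sum())
    return float(np.corrcoef(attr_sums, deltas)[0, 1])

# Transparent linear model: the attributions w * x are exact, so the
# correlation should be (numerically) perfect.
w = np.array([0.5, -1.0, 2.0, 0.3])
model = lambda x: float(w @ x)
x = np.array([1.0, 2.0, -1.0, 4.0])
score = faithfulness_correlation(model, x, attributions=w * x, rng=0)
```

For a non-linear model the two series diverge, and the score drops, which is the kind of behavior the study measures against its transparent-model ground truth.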
Information Processing & Management, Vol. 62, Issue 1, Article 103900.