arXiv - CS - Artificial Intelligence: Latest Papers

Pairing Analogy-Augmented Generation with Procedural Memory for Procedural Q&A
Pub Date : 2024-09-02 DOI: arxiv-2409.01344
K Roth, Rushil Gupta, Simon Halle, Bang Liu
While LLMs in the RAG paradigm have shown remarkable performance on a variety of tasks, they still under-perform on unseen domains, especially on complex tasks like procedural question answering. In this work, we introduce a novel formalism and structure for manipulating text-based procedures. Based on this formalism, we further present a novel dataset called LCStep, scraped from the LangChain Python docs. Moreover, we extend the traditional RAG system to propose a novel system called analogy-augmented generation (AAG), that draws inspiration from human analogical reasoning and ability to assimilate past experiences to solve unseen problems. The proposed method uses a frozen language model with a custom procedure memory store to adapt to specialized knowledge. We demonstrate that AAG outperforms few-shot and RAG baselines on LCStep, RecipeNLG, and CHAMP datasets under a pairwise LLM-based evaluation, corroborated by human evaluation in the case of RecipeNLG.
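The idea of a procedure memory store queried for analogous past procedures can be illustrated with a toy retrieval step. This is only a sketch under stated assumptions: the `Procedure` class, the `jaccard` token-overlap similarity, and the `retrieve` function are hypothetical names introduced here, and token overlap stands in for whatever similarity measure the actual system uses. It does not reproduce the AAG pipeline.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Procedure:
    """A stored procedure: a goal plus an ordered list of steps (hypothetical schema)."""
    goal: str
    steps: List[str]


def jaccard(a: str, b: str) -> float:
    """Token-overlap similarity between two strings (a stand-in for embeddings)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0


def retrieve(memory: List[Procedure], query: str, k: int = 1) -> List[Procedure]:
    """Return the k stored procedures whose goals are most similar to the query."""
    return sorted(memory, key=lambda p: jaccard(p.goal, query), reverse=True)[:k]


# Made-up memory contents, for illustration only.
memory = [
    Procedure("bake a chocolate cake", ["mix batter", "bake", "cool"]),
    Procedure("load a text file into a vector store", ["read file", "split", "embed"]),
]
analogs = retrieve(memory, "load a PDF file into a vector store")
```

In a retrieval-augmented setup, the retrieved analogs would then be placed in the prompt of the frozen language model; that step is omitted here.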
AgGym: An agricultural biotic stress simulation environment for ultra-precision management planning
Pub Date : 2024-09-01 DOI: arxiv-2409.00735
Mahsa Khosravi, Matthew Carroll, Kai Liang Tan, Liza Van der Laan, Joscif Raigne, Daren S. Mueller, Arti Singh, Aditya Balu, Baskar Ganapathysubramanian, Asheesh Kumar Singh, Soumik Sarkar
Agricultural production requires careful management of inputs such as fungicides, insecticides, and herbicides to ensure a successful crop that is high-yielding, profitable, and of superior seed quality. Current state-of-the-art field crop management relies on coarse-scale crop management strategies, where entire fields are sprayed with pest and disease-controlling chemicals, leading to increased cost and sub-optimal soil and crop management. To overcome these challenges and optimize crop production, we utilize machine learning tools within a virtual field environment to generate localized management plans for farmers to manage biotic threats while maximizing profits. Specifically, we present AgGym, a modular, crop and stress agnostic simulation framework to model the spread of biotic stresses in a field and estimate yield losses with and without chemical treatments. Our validation with real data shows that AgGym can be customized with limited data to simulate yield outcomes under various biotic stress conditions. We further demonstrate that deep reinforcement learning (RL) policies can be trained using AgGym for designing ultra-precise biotic stress mitigation strategies with potential to increase yield recovery with less chemicals and lower cost. Our proposed framework enables personalized decision support that can transform biotic stress management from being schedule based and reactive to opportunistic and prescriptive. We also release the AgGym software implementation as a community resource and invite experts to contribute to this open-sourced publicly available modular environment framework. The source code can be accessed at: https://github.com/SCSLabISU/AgGym.
Abstaining Machine Learning -- Philosophical Considerations
Pub Date : 2024-09-01 DOI: arxiv-2409.00706
Daniela Schuster
This paper establishes a connection between the fields of machine learning (ML) and philosophy concerning the phenomenon of behaving neutrally. It investigates a specific class of ML systems capable of delivering a neutral response to a given task, referred to as abstaining machine learning systems, that has not yet been studied from a philosophical perspective. The paper introduces and explains various abstaining machine learning systems, and categorizes them into distinct types. An examination is conducted on how abstention in the different machine learning system types aligns with the epistemological counterpart of suspended judgment, addressing both the nature of suspension and its normative profile. Additionally, a philosophical analysis is suggested on the autonomy and explainability of the abstaining response. It is argued, specifically, that one of the distinguished types of abstaining systems is preferable as it aligns more closely with our criteria for suspended judgment. Moreover, it is better equipped to autonomously generate abstaining outputs and offer explanations for abstaining outputs when compared to the other type.
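For readers unfamiliar with abstention, one common way an ML system can deliver a neutral response is confidence thresholding, sometimes called a reject option. The sketch below is only an illustration of that general idea; the function name `predict_or_abstain` and the default threshold are assumptions made here, and the abstract does not say whether this mechanism corresponds to either of the system types the paper distinguishes.

```python
from typing import Optional, Sequence


def predict_or_abstain(probs: Sequence[float], threshold: float = 0.8) -> Optional[int]:
    """Return the index of the most probable class, or None to abstain."""
    best = max(range(len(probs)), key=lambda i: probs[i])
    # Abstain when the top-class probability falls below the confidence threshold.
    return best if probs[best] >= threshold else None


confident = predict_or_abstain([0.9, 0.1])     # class 0 is predicted
uncertain = predict_or_abstain([0.55, 0.45])   # None: the system abstains
```

Here abstention is bolted on after prediction; a system could instead be trained with an explicit "abstain" output, which is a different design.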
Cooperative Path Planning with Asynchronous Multiagent Reinforcement Learning
Pub Date : 2024-09-01 DOI: arxiv-2409.00754
Jiaming Yin, Weixiong Rao, Yu Xiao, Keshuang Tang
In this paper, we study the shortest path problem (SPP) with multiple source-destination pairs (MSD), namely MSD-SPP, to minimize average travel time of all shortest paths. The inherent traffic capacity limits within a road network contributes to the competition among vehicles. Multi-agent reinforcement learning (MARL) model cannot offer effective and efficient path planning cooperation due to the asynchronous decision making setting in MSD-SPP, where vehicles (a.k.a agents) cannot simultaneously complete routing actions in the previous time step. To tackle the efficiency issue, we propose to divide an entire road network into multiple sub-graphs and subsequently execute a two-stage process of inter-region and intra-region route planning. To address the asynchronous issue, in the proposed asyn-MARL framework, we first design a global state, which exploits a low-dimensional vector to implicitly represent the joint observations and actions of multi-agents. Then we develop a novel trajectory collection mechanism to decrease the redundancy in training trajectories. Additionally, we design a novel actor network to facilitate the cooperation among vehicles towards the same or close destinations and a reachability graph aimed at preventing infinite loops in routing paths. On both synthetic and real road networks, our evaluation result demonstrates that our approach outperforms state-of-the-art planning approaches.
Predicting Femicide in Veracruz: A Fuzzy Logic Approach with the Expanded MFM-FEM-VER-CP-2024 Model
Pub Date : 2024-08-31 DOI: arxiv-2409.00359
Carlos Medel-Ramírez, Hilario Medel-López
The article focuses on the urgent issue of femicide in Veracruz, Mexico, and the development of the MFM_FEM_VER_CP_2024 model, a mathematical framework designed to predict femicide risk using fuzzy logic. This model addresses the complexity and uncertainty inherent in gender based violence by formalizing risk factors such as coercive control, dehumanization, and the cycle of violence. These factors are mathematically modeled through membership functions that assess the degree of risk associated with various conditions, including personal relationships and specific acts of violence. The study enhances the original model by incorporating new rules and refining existing membership functions, which significantly improve the model predictive accuracy.
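To make the notion of a membership function concrete, the following is a generic fuzzy-logic sketch, not the MFM_FEM_VER_CP_2024 model itself. The triangular shape, the 0-10 input scale, the breakpoints, and the names `triangular` and `rule_and` are all assumptions chosen for illustration; the abstract does not specify the functions or rules the authors use.

```python
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership: 0 at a and c, 1 at the peak b (requires a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def rule_and(*degrees: float) -> float:
    """Fuzzy AND using the minimum t-norm."""
    return min(degrees)


# Hypothetical factor scores and breakpoints; not taken from the paper.
coercive_control = triangular(7.0, 4.0, 8.0, 12.0)  # degree of "high" membership
violence_cycle = triangular(6.0, 2.0, 6.0, 10.0)    # score sits at the peak
risk = rule_and(coercive_control, violence_cycle)   # rule fires at the weaker degree
```

A rule of the form "IF factor A is high AND factor B is high THEN risk is high" fires to the degree of its weakest antecedent under the minimum t-norm; other t-norms, such as the product, are equally valid choices.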
The MERIT Dataset: Modelling and Efficiently Rendering Interpretable Transcripts
Pub Date : 2024-08-31 DOI: arxiv-2409.00447
I. de Rodrigo, A. Sanchez-Cuadrado, J. Boal, A. J. Lopez-Lopez
This paper introduces the MERIT Dataset, a multimodal (text + image + layout) fully labeled dataset within the context of school reports. Comprising over 400 labels and 33k samples, the MERIT Dataset is a valuable resource for training models in demanding Visually-rich Document Understanding (VrDU) tasks. By its nature (student grade reports), the MERIT Dataset can potentially include biases in a controlled way, making it a valuable tool to benchmark biases induced in Language Models (LLMs). The paper outlines the dataset's generation pipeline and highlights its main features in the textual, visual, layout, and bias domains. To demonstrate the dataset's utility, we present a benchmark with token classification models, showing that the dataset poses a significant challenge even for SOTA models and that these would greatly benefit from including samples from the MERIT Dataset in their pretraining phase.
Exploring the Effect of Explanation Content and Format on User Comprehension and Trust
Pub Date : 2024-08-30 DOI: arxiv-2408.17401
Antonio Rago, Bence Palfi, Purin Sukpanichnant, Hannibal Nabli, Kavyesh Vivek, Olga Kostopoulou, James Kinross, Francesca Toni
In recent years, various methods have been introduced for explaining the outputs of "black-box" AI models. However, it is not well understood whether users actually comprehend and trust these explanations. In this paper, we focus on explanations for a regression tool for assessing cancer risk and examine the effect of the explanations' content and format on the user-centric metrics of comprehension and trust. Regarding content, we experiment with two explanation methods: the popular SHAP, based on game-theoretic notions and thus potentially complex for everyday users to comprehend, and occlusion-1, based on feature occlusion which may be more comprehensible. Regarding format, we present SHAP explanations as charts (SC), as is conventional, and occlusion-1 explanations as charts (OC) as well as text (OT), to which their simpler nature also lends itself. The experiments amount to user studies questioning participants, with two different levels of expertise (the general population and those with some medical training), on their subjective and objective comprehension of and trust in explanations for the outputs of the regression tool. In both studies we found a clear preference in terms of subjective comprehension and trust for occlusion-1 over SHAP explanations in general, when comparing based on content. However, direct comparisons of explanations when controlling for format only revealed evidence for OT over SC explanations in most cases, suggesting that the dominance of occlusion-1 over SHAP explanations may be driven by a preference for text over charts as explanations. Finally, we found no evidence of a difference between the explanation types in terms of objective comprehension. Thus overall, the choice of the content and format of explanations needs careful attention, since in some contexts format, rather than content, may play the critical role in improving user experience.
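A plausible reading of feature occlusion, occluding one feature at a time and recording the change in the model's output, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's exact definition of occlusion-1: the all-zeros baseline, the function name `occlusion_1`, and the toy linear model with made-up weights are all introduced here.

```python
from typing import Callable, List, Sequence


def occlusion_1(model: Callable[[Sequence[float]], float],
                x: Sequence[float],
                baseline: Sequence[float]) -> List[float]:
    """Attribute to each feature the change in output when it alone is occluded."""
    full = model(x)
    scores = []
    for i in range(len(x)):
        occluded = list(x)
        occluded[i] = baseline[i]  # replace one feature with its baseline value
        scores.append(full - model(occluded))
    return scores


def toy_model(features: Sequence[float]) -> float:
    """Toy linear regression with made-up weights (an assumption for illustration)."""
    weights = [0.5, 2.0, -1.0]
    return sum(w * f for w, f in zip(weights, features))


attributions = occlusion_1(toy_model, x=[2.0, 1.0, 3.0], baseline=[0.0, 0.0, 0.0])
# attributions == [1.0, 2.0, -3.0]: each score equals weight * feature for a linear model
```

Each score answers a single counterfactual question ("what if this feature were absent?"), which may be part of why such explanations translate naturally into text.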
Towards Symbolic XAI -- Explanation Through Human Understandable Logical Relationships Between Features
Pub Date : 2024-08-30 DOI: arxiv-2408.17198
Thomas Schnake, Farnoush Rezaei Jafaria, Jonas Lederer, Ping Xiong, Shinichi Nakajima, Stefan Gugler, Grégoire Montavon, Klaus-Robert Müller
Explainable Artificial Intelligence (XAI) plays a crucial role in fostering transparency and trust in AI systems, where traditional XAI approaches typically offer one level of abstraction for explanations, often in the form of heatmaps highlighting single or multiple input features. However, we ask whether abstract reasoning or problem-solving strategies of a model may also be relevant, as these align more closely with how humans approach solutions to problems. We propose a framework, called Symbolic XAI, that attributes relevance to symbolic queries expressing logical relationships between input features, thereby capturing the abstract reasoning behind a model's predictions. The methodology is built upon a simple yet general multi-order decomposition of model predictions. This decomposition can be specified using higher-order propagation-based relevance methods, such as GNN-LRP, or perturbation-based explanation methods commonly used in XAI. The effectiveness of our framework is demonstrated in the domains of natural language processing (NLP), vision, and quantum chemistry (QC), where abstract symbolic domain knowledge is abundant and of significant interest to users. The Symbolic XAI framework provides an understanding of the model's decision-making process that is both flexible for customization by the user and human-readable through logical formulas.
Citations: 0
A methodological framework for Resilience as a Service (RaaS) in multimodal urban transportation networks
Pub Date : 2024-08-30 DOI: arxiv-2408.17233
Sara Jaber (Univ. Gustave Eiffel, COSYS, GRETTIA, Paris, France and VEDECOM, mobiLAB, Department of new solutions of mobility services and shared energy, Versailles, France), Mostafa Ameli (Univ. Gustave Eiffel, COSYS, GRETTIA, Paris, France), S. M. Hassan Mahdavi (VEDECOM, mobiLAB, Department of new solutions of mobility services and shared energy, Versailles, France), Neila Bhouri (Univ. Gustave Eiffel, COSYS, GRETTIA, Paris, France)
Public transportation systems are experiencing an increase in commuter traffic. This increase underscores the need for resilience strategies to manage unexpected service disruptions, ensuring rapid and effective responses that minimize adverse effects on stakeholders and enhance the system's ability to maintain essential functions and recover quickly. This study aims to explore the management of public transport disruptions through resilience as a service (RaaS) strategies, developing an optimization model to effectively allocate resources and minimize the cost for operators and passengers. The proposed model includes multiple transportation options, such as buses, taxis, and automated vans, and evaluates them as bridging alternatives to rail-disrupted services based on factors such as their availability, capacity, speed, and proximity to the disrupted station. This ensures that the most suitable vehicles are deployed to maintain service continuity. Applied to a case study in the Ile de France region, Paris and suburbs, complemented by a microscopic simulation, the model is compared to existing solutions such as bus bridging and reserve fleets. The results highlight the model's performance in minimizing costs and enhancing stakeholder satisfaction, optimizing transport management during disruptions.
Citations: 0
Explainable Artificial Intelligence: A Survey of Needs, Techniques, Applications, and Future Direction
Pub Date : 2024-08-30 DOI: arxiv-2409.00265
Melkamu Mersha, Khang Lam, Joseph Wood, Ali AlShami, Jugal Kalita
Artificial intelligence models encounter significant challenges due to their black-box nature, particularly in safety-critical domains such as healthcare, finance, and autonomous vehicles. Explainable Artificial Intelligence (XAI) addresses these challenges by providing explanations for how these models make decisions and predictions, ensuring transparency, accountability, and fairness. Existing studies have examined the fundamental concepts of XAI, its general principles, and the scope of XAI techniques. However, there remains a gap in the literature as there are no comprehensive reviews that delve into the detailed mathematical representations, design methodologies of XAI models, and other associated aspects. This paper provides a comprehensive literature review encompassing common terminologies and definitions, the need for XAI, beneficiaries of XAI, a taxonomy of XAI methods, and the application of XAI methods in different application areas. The survey is aimed at XAI researchers, XAI practitioners, AI model developers, and XAI beneficiaries who are interested in enhancing the trustworthiness, transparency, accountability, and fairness of their AI models.
Citations: 0