Applied AI letters: latest articles


Explaining autonomous drones: An XAI journey

Applied AI letters

Pub Date : 2021-11-22 DOI: 10.1002/ail2.54

Mark Stefik, Michael Youngblood, Peter Pirolli, Christian Lebiere, Robert Thomson, Robert Price, Lester D. Nelson, Robert Krivacic, Jacob Le, Konstantinos Mitsopoulos, Sterling Somers, Joel Schooler

COGLE (COmmon Ground Learning and Explanation) is an explainable artificial intelligence (XAI) system where autonomous drones deliver supplies to field units in mountainous areas. The mission risks vary with topography, flight decisions, and mission goals. The missions engage a human plus AI team where users determine which of two AI-controlled drones is better for each mission. This article reports on the technical approach and findings of the project and reflects on challenges that complex combinatorial problems present for users, machine learning, user studies, and the context of use for XAI systems. COGLE creates explanations in multiple modalities. Narrative “What” explanations compare what each drone does on a mission and “Why” based on drone competencies determined from experiments using counterfactuals. Visual “Where” explanations highlight risks on maps to help users to interpret flight plans. One branch of the research studied whether the explanations helped users to predict drone performance. In this branch, a model induction user study showed that post-decision explanations had only a small effect in teaching users to determine by themselves which drone is better for a mission. Subsequent reflection suggests that supporting human plus AI decision making with pre-decision explanations is a better context for benefiting from explanations on combinatorial tasks.


Citations: 4

Explaining robot policies

Applied AI letters

Pub Date : 2021-11-13 DOI: 10.1002/ail2.52

Olivia Watkins, Sandy Huang, Julius Frost, Kush Bhatia, Eric Weiner, Pieter Abbeel, Trevor Darrell, Bryan Plummer, Kate Saenko, Anca Dragan

In order to interact with a robot or make wise decisions about where and how to deploy it in the real world, humans need to have an accurate mental model of how the robot acts in different situations. We propose to improve users' mental model of a robot by showing them examples of how the robot behaves in informative scenarios. We explore this in two settings. First, we show that when there are many possible environment states, users can more quickly understand the robot's policy if they are shown critical states where taking a particular action is important. Second, we show that when there is a distribution shift between training and test environment distributions, then it is more effective to show exploratory states that the robot does not visit naturally.


Citations: 4

Methods and standards for research on explainable artificial intelligence: Lessons from intelligent tutoring systems

Applied AI letters

Pub Date : 2021-11-13 DOI: 10.1002/ail2.53

William J. Clancey, Robert R. Hoffman

The DARPA Explainable Artificial Intelligence (AI) (XAI) Program focused on generating explanations for AI programs that use machine learning techniques. This article highlights progress during the DARPA Program (2017-2021) relative to research since the 1970s in the field of intelligent tutoring systems (ITSs). ITS researchers learned a great deal about explanation that is directly relevant to XAI. We suggest opportunities for future XAI research deriving from ITS methods, and consider the challenges shared by both ITS and XAI in using AI to assist people in solving difficult problems effectively and efficiently.


Citations: 0

Generating and evaluating explanations of attended and error-inducing input regions for VQA models

Applied AI letters

Pub Date : 2021-11-12 DOI: 10.1002/ail2.51

Arijit Ray, Michael Cogswell, Xiao Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas

Attention maps, a popular heatmap-based explanation method for Visual Question Answering, are supposed to help users understand the model by highlighting portions of the image/question used by the model to infer answers. However, we see that users are often misled by current attention map visualizations that point to relevant regions despite the model producing an incorrect answer. Hence, we propose Error Maps that clarify the error by highlighting image regions where the model is prone to err. Error maps can indicate when a correctly attended region may be processed incorrectly leading to an incorrect answer, and hence, improve users' understanding of those cases. To evaluate our new explanations, we further introduce a metric that simulates users' interpretation of explanations to evaluate their potential helpfulness to understand model correctness. We finally conduct user studies to see that our new explanations help users understand model correctness better than baselines by an expected 30% and that our proxy helpfulness metrics correlate strongly ($\rho > 0.97$) with how well users can predict model correctness.


Citations: 0

Computer Vision and Machine Learning Techniques for Quantification and Predictive Modeling of Intracellular Anti-Cancer Drug Delivery by Nanocarriers

Applied AI letters

Pub Date : 2021-11-10 DOI: 10.1002/ail2.50

S. Goswami, Kshama D. Dhobale, R. Wavhale, B. Goswami, S. Banerjee

Citations: 0

How level of explanation detail affects human performance in interpretable intelligent systems: A study on explainable fact checking

Applied AI letters

Pub Date : 2021-11-08 DOI: 10.1002/ail2.49

Rhema Linder, Sina Mohseni, Fan Yang, Shiva K. Pentyala, Eric D. Ragan, Xia Ben Hu

Explainable artificial intelligence (XAI) systems aim to provide users with information to help them better understand computational models and reason about why outputs were generated. However, there are many different ways an XAI interface might present explanations, which makes designing an appropriate and effective interface an important and challenging task. Our work investigates how different types and amounts of explanatory information affect user ability to utilize explanations to understand system behavior and improve task performance. The presented research employs a system for detecting the truthfulness of news statements. In a controlled experiment, participants were tasked with using the system to assess news statements as well as to learn to predict the output of the AI. Our experiment compares various levels of explanatory information to contribute empirical data about how explanation detail can influence utility. The results show that more explanation information improves participant understanding of AI models, but the benefits come at the cost of time and attention needed to make sense of the explanation.


Citations: 5

From heatmaps to structured explanations of image classifiers

Applied AI letters

Pub Date : 2021-11-06 DOI: 10.1002/ail2.46

Li Fuxin, Zhongang Qi, Saeed Khorram, Vivswan Shitole, Prasad Tadepalli, Minsuk Kahng, Alan Fern

This paper summarizes our endeavors in the past few years in terms of explaining image classifiers, with the aim of including negative results and insights we have gained. The paper starts with describing the explainable neural network (XNN), which attempts to extract and visualize several high-level concepts purely from the deep network, without relying on human linguistic concepts. This helps users understand network classifications that are less intuitive and substantially improves user performance on a difficult fine-grained classification task of discriminating among different species of seagulls. Realizing that an important missing piece is a reliable heatmap visualization tool, we have developed integrated-gradient optimized saliency (I-GOS) and iGOS++ utilizing integrated gradients to avoid local optima in heatmap generation, which improved the performance across all resolutions. During the development of those visualizations, we realized that for a significant number of images, the classifier has multiple different paths to reach a confident prediction. This has led to our recent development of structured attention graphs, an approach that utilizes beam search to locate multiple coarse heatmaps for a single image, and compactly visualizes a set of heatmaps by capturing how different combinations of image regions impact the confidence of a classifier. Through the research process, we have learned much about insights in building deep network explanations, the existence and frequency of multiple explanations, and various tricks of the trade that make explanations work. In this paper, we attempt to share those insights and opinions with the readers with the hope that some of them will be informative for future researchers on explainable deep learning.


Citations: 4

Improving users' mental model with attention-directed counterfactual edits

Applied AI letters

Pub Date : 2021-11-06 DOI: 10.1002/ail2.47

Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas

In the domain of visual question answering (VQA), studies have shown improvement in users' mental model of the VQA system when they are exposed to examples of how these systems answer certain image-question (IQ) pairs. In this work, we show that presenting controlled counterfactual IQ examples is more effective at improving users' mental model than simply showing random examples. We compare a generative approach and a retrieval-based approach to show counterfactual examples. We use recent advances in generative adversarial networks to generate counterfactual images by deleting and inpainting certain regions of interest in the image. We then expose users to changes in the VQA system's answer on those altered images. To select the region of interest for inpainting, we experiment with using both human-annotated attention maps and a fully automatic method that uses the VQA system's attention values. Finally, we test the user's mental model by asking them to predict the model's performance on a test counterfactual image. We note an overall improvement in users' accuracy in predicting answer changes when shown counterfactual explanations. While realistic retrieved counterfactuals obviously are the most effective at improving the mental model, we show that a generative approach can be equally effective.


Citations: 0

Neural response time analysis: Explainable artificial intelligence using only a stopwatch

Applied AI letters

Pub Date : 2021-11-06 DOI: 10.1002/ail2.48

J. Eric T. Taylor, Shashank Shekhar, Graham W. Taylor

How would you describe the features that a deep learning model composes if you were restricted to measuring observable behaviours? Explainable artificial intelligence (XAI) methods rely on privileged access to model architecture and parameters that is not always feasible for most users, practitioners and regulators. Inspired by cognitive psychology research on humans, we present a case for measuring response times (RTs) of a forward pass using only the system clock as a technique for XAI. Our method applies to the growing class of models that use input-adaptive dynamic inference and we also extend our approach to standard models that are converted to dynamic inference post hoc. The experimental logic is simple: If the researcher can contrive a stimulus set where variability among input features is tightly controlled, differences in RT for those inputs can be attributed to the way the model composes those features. First, we show that RT is sensitive to difficult, complex features by comparing RTs from ObjectNet and ImageNet. Next, we make specific a priori predictions about RT for abstract features present in the SCEGRAM data set, where object recognition in humans depends on complex intrascene object-object relationships. Finally, we show that RT profiles bear specificity for class identity and therefore the features that define classes. These results cast light on the model's feature space without opening the black box.
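The measurement idea in this abstract, timing a forward pass with nothing but the system clock, can be illustrated with a minimal Python sketch. It is not the authors' code: `toy_early_exit_model`, its confidence rule, and the input sizes are hypothetical stand-ins for an input-adaptive model, chosen only so that "harder" inputs run more blocks before exiting.

```python
import statistics
import time
from typing import Callable, List, Sequence


def measure_rt(model: Callable[[Sequence[float]], int],
               x: Sequence[float],
               n_repeats: int = 15) -> List[float]:
    """Time repeated forward passes of `model` on `x`; returns seconds per pass."""
    rts = []
    for _ in range(n_repeats):
        start = time.perf_counter()  # system clock only, no access to weights
        model(x)
        rts.append(time.perf_counter() - start)
    return rts


def toy_early_exit_model(x: Sequence[float],
                         n_blocks: int = 50,
                         threshold: float = 0.9) -> int:
    """Hypothetical input-adaptive model (an assumption for illustration).

    Runs up to `n_blocks` blocks and exits as soon as an invented
    confidence score reaches `threshold`. Returns the number of blocks run.
    """
    for depth in range(1, n_blocks + 1):
        _ = sum(v * v for v in x)    # dummy per-block work
        confidence = max(x) * depth  # invented confidence rule
        if confidence >= threshold:
            return depth
    return n_blocks


# Usage: compare median response times for an "easy" and a "hard" input.
easy = [1.0] * 2000     # exits after the first block
hard = [0.001] * 2000   # never reaches the threshold, runs all blocks
rt_easy = statistics.median(measure_rt(toy_early_exit_model, easy))
rt_hard = statistics.median(measure_rt(toy_early_exit_model, hard))
print(f"median RT easy: {rt_easy:.6f}s, hard: {rt_hard:.6f}s")
```

Taking the median over repeated passes is one simple way to damp scheduler noise; the abstract does not say how the authors aggregate their timings.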


Citations: 0

Measuring and characterizing generalization in deep reinforcement learning

Applied AI letters

Pub Date : 2021-11-05 DOI: 10.1002/ail2.45

Sam Witty, Jun K. Lee, Emma Tosch, Akanksha Atrey, Kaleigh Clary, Michael L. Littman, David Jensen

Deep reinforcement learning (RL) methods have achieved remarkable performance on challenging control tasks. Observations of the resulting behavior give the impression that the agent has constructed a generalized representation that supports insightful action decisions. We re-examine what is meant by generalization in RL, and propose several definitions based on an agent's performance in on-policy, off-policy, and unreachable states. We propose a set of practical methods for evaluating agents with these definitions of generalization. We demonstrate these techniques on a common benchmark task for deep RL, and we show that the learned networks make poor decisions for states that differ only slightly from on-policy states, even though those states are not selected adversarially. We focus our analyses on the deep Q-networks (DQNs) that kicked off the modern era of deep RL. Taken together, these results call into question the extent to which DQNs learn generalized representations, and suggest that more experimentation and analysis is necessary before claims of representation learning can be supported.


Citations: 46
