
Proceedings of the AAAI/ACM Conference on AI, Ethics, and Society: Latest Publications

Human Comprehension of Fairness in Machine Learning
Pub Date : 2019-12-17 DOI: 10.1145/3375627.3375819
Debjani Saha, Candice Schumann, Duncan C. McElfresh, John P. Dickerson, Michelle L. Mazurek, Michael Carl Tschantz
Bias in machine learning has manifested injustice in several areas, with notable examples including gender bias in job-related ads [4], racial bias in evaluating names on resumes [3], and racial bias in predicting criminal recidivism [1]. In response, research into algorithmic fairness has grown in both importance and volume over the past few years. Different metrics and approaches to algorithmic fairness have been proposed, many of which are based on prior legal and philosophical concepts [2]. The rapid expansion of this field makes it difficult for professionals to keep up, let alone the general public. Furthermore, misinformation about notions of fairness can have significant legal implications. Computer scientists have largely focused on developing mathematical notions of fairness and incorporating them in fielded ML systems. A much smaller collection of studies has measured public perception of bias and (un)fairness in algorithmic decision-making. However, one major question underlying the study of ML fairness remains unanswered in the literature: Does the general public understand mathematical definitions of ML fairness and their behavior in ML applications? We take a first step towards answering this question by studying non-expert comprehension and perceptions of one popular definition of ML fairness, demographic parity [5]. Specifically, we developed an online survey to address the following: (1) Does a non-technical audience comprehend the definition and implications of demographic parity? (2) Do demographics play a role in comprehension? (3) How are comprehension and sentiment related? (4) Does the application scenario affect comprehension?
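For context on the metric studied here: demographic parity asks that a classifier's positive-decision rate be (approximately) equal across demographic groups. The following is a minimal illustrative sketch of how that gap is commonly computed; it is not from the paper, and the toy data and binary group encoding are assumptions.

```python
import numpy as np

def demographic_parity_gap(y_pred, group):
    """Absolute difference in positive-prediction rates between two groups.

    y_pred : array of 0/1 model decisions
    group  : array of group labels (0/1 for two demographic groups)
    A gap near 0 means the classifier approximately satisfies demographic parity.
    """
    y_pred = np.asarray(y_pred)
    group = np.asarray(group)
    rate_a = y_pred[group == 0].mean()  # P(decision = 1 | group A)
    rate_b = y_pred[group == 1].mean()  # P(decision = 1 | group B)
    return abs(rate_a - rate_b)

# Toy example: six decisions split across two groups.
decisions = [1, 0, 1, 1, 0, 0]
groups    = [0, 0, 0, 1, 1, 1]
print(demographic_parity_gap(decisions, groups))  # |2/3 - 1/3| = 0.333...
```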
Citations: 12
AI and Holistic Review: Informing Human Reading in College Admissions
Pub Date : 2019-12-17 DOI: 10.1145/3375627.3375871
AJ Alvero, Noah Arthurs, A. Antonio, B. Domingue, Ben Gebre-Medhin, Sonia Giebel, M. Stevens
College admissions in the United States is carried out by a human-centered method of evaluation known as holistic review, which typically involves reading original narrative essays submitted by each applicant. The legitimacy and fairness of holistic review, which gives human readers significant discretion over determining each applicant's fitness for admission, has been repeatedly challenged in courtrooms and the public sphere. Using a unique corpus of 283,676 application essays submitted to a large, selective, state university system between 2015 and 2016, we assess the extent to which applicant demographic characteristics can be inferred from application essays. We find a relatively interpretable classifier (logistic regression) was able to predict gender and household income with high levels of accuracy. Findings suggest that data auditing might be useful in informing holistic review, and perhaps other evaluative systems, by checking potential bias in human or computational readings.
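As an illustration of the kind of audit the abstract describes, here is a minimal, hypothetical sketch of a TF-IDF plus logistic regression pipeline for predicting a demographic label from essay text using scikit-learn. The essays and labels below are invented stand-ins, not the study's corpus, and the pipeline is a generic interpretable text classifier rather than the authors' exact configuration.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in data: essay snippets paired with a binary demographic label
# (the paper's application-essay corpus is not public; these strings are illustrative).
essays = [
    "I worked two jobs while helping my parents run the family restaurant.",
    "My summer internship at a research lab taught me to love experiments.",
    "Caring for my younger siblings shaped how I manage my time.",
    "Travelling abroad for a robotics competition broadened my perspective.",
]
income_label = [0, 1, 0, 1]  # e.g., below / above median household income (assumed)

# A relatively interpretable classifier: TF-IDF features + logistic regression,
# in the spirit of the audit described in the abstract.
audit_model = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
audit_model.fit(essays, income_label)

# In-sample predictions only; a real audit would report held-out accuracy
# on a large corpus and inspect the learned coefficients.
print(audit_model.predict(essays))
```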
Citations: 17
Balancing the Tradeoff Between Clustering Value and Interpretability
Pub Date : 2019-12-17 DOI: 10.1145/3375627.3375843
Sandhya Saisubramanian, Sainyam Galhotra, S. Zilberstein
Graph clustering groups entities -- the vertices of a graph -- based on their similarity, typically using a complex distance function over a large number of features. Successful integration of clustering approaches in automated decision-support systems hinges on the interpretability of the resulting clusters. This paper addresses the problem of generating interpretable clusters, given features of interest that signify interpretability to an end-user, by optimizing interpretability in addition to common clustering objectives. We propose a β-interpretable clustering algorithm that ensures that at least β fraction of nodes in each cluster share the same feature value. The tunable parameter β is user-specified. We also present a more efficient algorithm for scenarios with β = 1 and analyze the theoretical guarantees of the two algorithms. Finally, we empirically demonstrate the benefits of our approaches in generating interpretable clusters using four real-world datasets. The interpretability of the clusters is complemented by generating simple explanations denoting the feature values of the nodes in the clusters, using frequent pattern mining.
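A small sketch of the β-interpretability criterion itself may help: a clustering is β-interpretable when, in every cluster, at least a β fraction of nodes share the same value of the feature of interest. The check below is illustrative only; it verifies the property for a given clustering and does not implement the paper's clustering algorithm.

```python
from collections import Counter

def interpretability(cluster_feature_values):
    """Fraction of nodes in a cluster sharing the most common value of the
    feature of interest (the quantity the beta constraint bounds from below)."""
    counts = Counter(cluster_feature_values)
    return counts.most_common(1)[0][1] / len(cluster_feature_values)

def is_beta_interpretable(clusters, beta):
    """Check whether every cluster has interpretability >= beta.

    clusters : list of lists, each inner list holding the feature values of one
               cluster's nodes (the clustering itself is assumed to be done elsewhere).
    """
    return all(interpretability(c) >= beta for c in clusters)

# Toy example: the feature of interest is a categorical attribute of each node.
clusters = [["a", "a", "a", "b"],   # 75% share "a"
            ["c", "c", "c", "c"]]   # 100% share "c"
print(is_beta_interpretable(clusters, beta=0.7))  # True
print(is_beta_interpretable(clusters, beta=0.8))  # False (first cluster fails)
```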
Citations: 29
Learning Norms from Stories: A Prior for Value Aligned Agents
Pub Date : 2019-12-07 DOI: 10.1145/3375627.3375825
Spencer Frazier, Md Sultan Al Nahian, Mark O. Riedl, Brent Harrison
Value alignment is a property of an intelligent agent indicating that it can only pursue goals and activities that are beneficial to humans. Traditional approaches to value alignment use imitation learning or preference learning to infer the values of humans by observing their behavior. We introduce a complementary technique in which a value-aligned prior is learned from naturally occurring stories which encode societal norms. Training data is sourced from the children's educational comic strip, Goofus & Gallant. In this work, we train multiple machine learning models to classify natural language descriptions of situations found in the comic strip as normative or non-normative by identifying if they align with the main characters' behavior. We also report the models' performance when transferring to two unrelated tasks with little to no additional training on the new task.
Citations: 28
Hard Choices in Artificial Intelligence: Addressing Normative Uncertainty through Sociotechnical Commitments
Pub Date : 2019-11-20 DOI: 10.1145/3375627.3375861
Roel Dobbe, T. Gilbert, Yonatan Dov Mintz
The implementation of AI systems has led to new forms of harm in various sensitive social domains. How to address these harms remains at the center of controversial debate. In this paper, we discuss the inherent normative uncertainty and political debates surrounding the safety of AI systems. We analyze these as problems of vagueness to illustrate the shortcomings of current technical approaches in the AI Safety literature, crystallized in three dilemmas that remain in the design, training and deployment of AI systems. We argue that resolving normative uncertainty to render a system 'safe' requires a sociotechnical orientation that combines quantitative and qualitative methods and that assigns design and decision power across affected stakeholders to navigate these dilemmas through distinct channels for dissent. We propose a set of sociotechnical commitments and related virtues to set a bar for declaring an AI system 'human-compatible', implicating broader interdisciplinary design approaches.
Citations: 11
The AI Liability Puzzle and a Fund-Based Work-Around
Pub Date : 2019-11-18 DOI: 10.1145/3375627.3375806
Olivia J. Erdélyi, Gábor Erdélyi
Certainty around the regulatory environment is crucial to facilitate responsible AI innovation and its social acceptance. However, the existing legal liability system is inapt to assign responsibility where a potentially harmful conduct and/or the harm itself are unforeseeable, yet some instantiations of AI and/or the harms they may trigger are not foreseeable in the legal sense. The unpredictability of how courts would handle such cases makes the risks involved in the investment and use of AI incalculable, creating an environment that is not conducive to innovation and may deprive society of some benefits AI could provide. To tackle this problem, we propose to draw insights from financial regulatory best-practices and establish a system of AI guarantee schemes. We envisage the system to form part of the broader market-structuring regulatory framework, with the primary function to provide a readily available, clear, and transparent funding mechanism to compensate claims that are either extremely hard or impossible to realize via conventional litigation. We propose at least partial industry-funding, with funding arrangements depending on whether it would pursue other potential policy goals.
Citations: 8
"How do I fool you?": Manipulating User Trust via Misleading Black Box Explanations “我怎么骗得了你?”:通过误导性黑匣子解释操纵用户信任
Pub Date : 2019-11-15 DOI: 10.1145/3375627.3375833
Himabindu Lakkaraju, O. Bastani
As machine learning black boxes are increasingly being deployed in critical domains such as healthcare and criminal justice, there has been a growing emphasis on developing techniques for explaining these black boxes in a human interpretable manner. There has been recent concern that a high-fidelity explanation of a black box ML model may not accurately reflect the biases in the black box. As a consequence, explanations have the potential to mislead human users into trusting a problematic black box. In this work, we rigorously explore the notion of misleading explanations and how they influence user trust in black box models. Specifically, we propose a novel theoretical framework for understanding and generating misleading explanations, and carry out a user study with domain experts to demonstrate how these explanations can be used to mislead users. Our work is the first to empirically establish how user trust in black box models can be manipulated via misleading explanations.
Citations: 184
Fair Allocation through Selective Information Acquisition
Pub Date : 2019-11-07 DOI: 10.1145/3375627.3375823
William Cai, Johann D. Gaebler, Nikhil Garg, Sharad Goel
Public and private institutions must often allocate scarce resources under uncertainty. Banks, for example, extend credit to loan applicants based in part on their estimated likelihood of repaying a loan. But when the quality of information differs across candidates (e.g., if some applicants lack traditional credit histories), common lending strategies can lead to disparities across groups. Here we consider a setting in which decision makers---before allocating resources---can choose to spend some of their limited budget further screening select individuals. We present a computationally efficient algorithm for deciding whom to screen that maximizes a standard measure of social welfare. Intuitively, decision makers should screen candidates on the margin, for whom the additional information could plausibly alter the allocation. We formalize this idea by showing the problem can be reduced to solving a series of linear programs. Both on synthetic and real-world datasets, this strategy improves utility, illustrating the value of targeted information acquisition in such decisions. Further, when there is social value for distributing resources to groups for whom we have a priori poor information---like those without credit scores---our approach can substantially improve the allocation of limited assets.
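As a rough illustration of the linear-program reduction mentioned above, the sketch below solves a heavily simplified budgeted-screening problem: each candidate has an assumed expected welfare gain from being screened and a screening cost, and fractional screening probabilities are chosen under a budget. The gains, costs, and objective are placeholders, not the paper's actual formulation.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical inputs: expected welfare gain from screening each candidate, and the
# cost of screening them; both are illustrative numbers, not derived from the paper.
gain = np.array([0.9, 0.4, 0.7, 0.2, 0.5])
cost = np.array([1.0, 1.0, 2.0, 1.0, 1.5])
budget = 3.0

# Choose screening probabilities x_i in [0, 1] to maximize expected welfare gain
# subject to the screening budget. linprog minimizes, so negate the gains.
res = linprog(
    c=-gain,
    A_ub=cost.reshape(1, -1), b_ub=[budget],
    bounds=[(0.0, 1.0)] * len(gain),
    method="highs",
)
print("screening probabilities:", np.round(res.x, 2))
print("expected welfare gain:", -res.fun)
```

The LP concentrates the budget on candidates with the highest gain per unit cost, which matches the intuition in the abstract that screening is most valuable for candidates on the margin.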
Citations: 23
Fooling LIME and SHAP: Adversarial Attacks on Post hoc Explanation Methods
Pub Date : 2019-11-06 DOI: 10.1145/3375627.3375830
Dylan Slack, Sophie Hilgard, Emily Jia, Sameer Singh, Himabindu Lakkaraju
As machine learning black boxes are increasingly being deployed in domains such as healthcare and criminal justice, there is growing emphasis on building tools and techniques for explaining these black boxes in an interpretable manner. Such explanations are being leveraged by domain experts to diagnose systematic errors and underlying biases of black boxes. In this paper, we demonstrate that post hoc explanations techniques that rely on input perturbations, such as LIME and SHAP, are not reliable. Specifically, we propose a novel scaffolding technique that effectively hides the biases of any given classifier by allowing an adversarial entity to craft an arbitrary desired explanation. Our approach can be used to scaffold any biased classifier in such a way that its predictions on the input data distribution still remain biased, but the post hoc explanations of the scaffolded classifier look innocuous. Using extensive evaluation with multiple real world datasets (including COMPAS), we demonstrate how extremely biased (racist) classifiers crafted by our framework can easily fool popular explanation techniques such as LIME and SHAP into generating innocuous explanations which do not reflect the underlying biases.
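The scaffolding idea can be sketched as a router between two models: inputs that look like real data go to a biased model, while inputs that look like LIME/SHAP perturbations go to an innocuous one. The detector below is a deliberately crude placeholder (the paper trains a classifier to separate real from perturbed points), and the feature layout and toy models are assumptions.

```python
import numpy as np

class ScaffoldedClassifier:
    """Sketch of the adversarial scaffolding idea: route real-looking inputs to a
    biased model and perturbation-like inputs to an innocuous model, so that
    perturbation-based explainers only ever see the innocuous behavior."""

    def __init__(self, biased_model, innocuous_model, is_perturbed):
        self.biased_model = biased_model        # e.g., depends on a sensitive feature
        self.innocuous_model = innocuous_model  # e.g., depends only on benign features
        self.is_perturbed = is_perturbed        # callable: row -> bool (OOD detector)

    def predict(self, X):
        X = np.asarray(X, dtype=float)
        out = np.empty(len(X), dtype=int)
        for i, row in enumerate(X):
            model = self.innocuous_model if self.is_perturbed(row) else self.biased_model
            out[i] = model(row)
        return out

# Assumed feature layout: column 0 = sensitive attribute, column 1 = benign score,
# column 2 = a flag that is integer-valued in real data but not in perturbations.
biased = lambda row: int(row[0] > 0.5)      # discriminates on the sensitive feature
innocuous = lambda row: int(row[1] > 0.5)   # ignores the sensitive feature
detector = lambda row: abs(row[2] - round(row[2])) > 1e-6  # crude stand-in OOD check

clf = ScaffoldedClassifier(biased, innocuous, detector)
real_point = [0.9, 0.2, 1.0]        # routed to the biased model -> 1
perturbed_point = [0.9, 0.2, 0.73]  # routed to the innocuous model -> 0
print(clf.predict([real_point, perturbed_point]))
```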
Citations: 516
CERTIFAI: A Common Framework to Provide Explanations and Analyse the Fairness and Robustness of Black-box Models
Pub Date : 2019-05-20 DOI: 10.1145/3375627.3375812
Shubham Sharma, Jette Henderson, Joydeep Ghosh
Concerns within the machine learning community and external pressures from regulators over the vulnerabilities of machine learning algorithms have spurred on the fields of explainability, robustness, and fairness. Often, issues in explainability, robustness, and fairness are confined to their specific sub-fields and few tools exist for model developers to use to simultaneously build their modeling pipelines in a transparent, accountable, and fair way. This can lead to a bottleneck on the model developer's side as they must juggle multiple methods to evaluate their algorithms. In this paper, we present a single framework for analyzing the robustness, fairness, and explainability of a classifier. The framework, which is based on the generation of counterfactual explanations through a custom genetic algorithm, is flexible, model-agnostic, and does not require access to model internals. The framework allows the user to calculate robustness and fairness scores for individual models and generate explanations for individual predictions which provide a means for actionable recourse (changes to an input to help get a desired outcome). This is the first time that a unified tool has been developed to address three key issues pertaining towards building a responsible artificial intelligence system.
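For a sense of the counterfactual-generation step, the sketch below runs a bare-bones genetic algorithm that searches for a nearby input receiving the desired prediction. The black-box model, fitness function, and hyperparameters are illustrative assumptions; CERTIFAI itself additionally handles feature constraints and mixed types, and derives robustness and fairness scores from the counterfactual distances.

```python
import numpy as np

rng = np.random.default_rng(0)

def model(x):
    """Hypothetical black-box binary classifier (stand-in for any model)."""
    return int(x[0] + 2 * x[1] > 1.5)

def counterfactual_ga(x, desired=1, pop_size=50, generations=100, sigma=0.3):
    """Minimal genetic-algorithm sketch of counterfactual search: evolve candidates
    that achieve the desired prediction while staying close to the query point x."""
    x = np.asarray(x, dtype=float)
    pop = x + rng.normal(0, sigma, size=(pop_size, x.size))  # initial mutations of x

    def fitness(cand):
        # Large penalty if the candidate does not achieve the desired prediction.
        penalty = 0.0 if model(cand) == desired else 1e3
        return -(np.linalg.norm(cand - x) + penalty)

    for _ in range(generations):
        scores = np.array([fitness(c) for c in pop])
        parents = pop[np.argsort(scores)[-pop_size // 2:]]            # keep fittest half
        children = parents + rng.normal(0, sigma, size=parents.shape)  # mutate
        pop = np.vstack([parents, children])

    return max(pop, key=fitness)

x0 = np.array([0.2, 0.3])          # model(x0) == 0
cf = counterfactual_ga(x0, desired=1)
print("counterfactual:", np.round(cf, 3),
      "flips prediction:", model(cf) == 1,
      "distance:", round(float(np.linalg.norm(cf - x0)), 3))
```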
Citations: 143