{"title":"The Value of Trustworthy AI","authors":"D. Danks","doi":"10.1145/3306618.3314228","DOIUrl":null,"url":null,"abstract":"Trust is one of the most critical relations in our human lives, whether trust in one another, trust in the artifacts that we use everyday, or trust of an AI system. Even a cursory examination of the literatures in human-computer interaction, human-robot interaction, and numerous other disciplines reveals a deep, persistent concern with the nature of trust in AI, and the conditions under which it can be generated, reduced, repaired, or influenced. At a high level, we often understand trust as a relation in which the trustor makes oneself vulnerable based on positive expectations about the behavior or intentions of the trustee [1]. For example, when I trust my car to start in the morning, I make myself vulnerable (e.g., I risk that I will be late to work if it does not start) because I have the positive expectation that it actually will start. This high-level characterization is relatively unhelpful, however, particularly given the wide range of disciplines that have examined the relation of trust, ranging from organizational behavior to game theory to ethics to cognitive science. The picture that emerges from, for example, social psychology (i.e., two distinct kinds of trust depending on whether one knows the trustee's behaviors or intentions/ values) appears to be quite different from the one that emerges from moral philosophy (i.e., a single, highly-moralized notion), even though both are consistent with this high-level characterization. This talk first introduces that diversity of types of 'trust', but then argues that we can make progress towards a unified characterization by focusing on the function of trust. That is, we should ask why care whether we can trust our artifacts, AI, or fellow humans, as that can help to illuminate features of trust that are shared across domains, trustors, and trustees. I contend that one reason to desire trust is an \"almost-necessary\" condition on ethical action: namely, that the user has a reasonable belief that the system (whether human or machine) will behave approximately as intended. This condition is obviously not sufficient for ethical use, nor is it strictly necessary since the best available option might nonetheless be one for which the user lacks appropriate reasonable beliefs. Nonetheless, it provides a reasonable starting point for an analysis of 'trust'. More precisely, I propose that this condition indicates a role for trust as providing precisely those reasonable beliefs, at least when we have appropriately grounded trust. That is, we can understand 'appropriate trust' as obtaining when the trustor has justified beliefs that the trustee has suitable dispositions. As there is variation in the trustor's goals and values, and also the openness of the context of use, then different specific versions of 'appropriate trust' result as those variations lead to different types of focal dispositions, specific dispositions, or observability of dispositions, respectively. For example, in an open context (i.e., one where the possibilities cannot be exhaustively enumerated), the trustee's full dispositions will not be directly observable, but rather must be inferred from observations. This framework provides a unification of the different theories of 'trust' developed in different disciplines. Moreover, it provides clarity about one key function of trust, and thereby helps us to understand the value of (appropriate) trust. 
We need to trust our AI systems because that is a precondition for the ethical, responsible use of them.","PeriodicalId":418125,"journal":{"name":"Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society","volume":"78 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-01-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"16","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2019 AAAI/ACM Conference on AI, Ethics, and Society","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3306618.3314228","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 16
Abstract
Trust is one of the most critical relations in our human lives, whether trust in one another, trust in the artifacts that we use every day, or trust in an AI system. Even a cursory examination of the literatures in human-computer interaction, human-robot interaction, and numerous other disciplines reveals a deep, persistent concern with the nature of trust in AI, and the conditions under which it can be generated, reduced, repaired, or influenced. At a high level, we often understand trust as a relation in which the trustor makes oneself vulnerable based on positive expectations about the behavior or intentions of the trustee [1]. For example, when I trust my car to start in the morning, I make myself vulnerable (e.g., I risk being late to work if it does not start) because I have the positive expectation that it actually will start. This high-level characterization is relatively unhelpful, however, particularly given the wide range of disciplines that have examined the relation of trust, from organizational behavior to game theory to ethics to cognitive science. The picture that emerges from, for example, social psychology (i.e., two distinct kinds of trust, depending on whether one knows the trustee's behaviors or intentions/values) appears to be quite different from the one that emerges from moral philosophy (i.e., a single, highly moralized notion), even though both are consistent with this high-level characterization. This talk first introduces that diversity of types of 'trust', but then argues that we can make progress towards a unified characterization by focusing on the function of trust. That is, we should ask why we care whether we can trust our artifacts, AI, or fellow humans, as that can help to illuminate features of trust that are shared across domains, trustors, and trustees. I contend that one reason to desire trust is an "almost-necessary" condition on ethical action: namely, that the user has a reasonable belief that the system (whether human or machine) will behave approximately as intended. This condition is obviously not sufficient for ethical use, nor is it strictly necessary, since the best available option might nonetheless be one for which the user lacks appropriate reasonable beliefs. Nonetheless, it provides a reasonable starting point for an analysis of 'trust'. More precisely, I propose that this condition indicates a role for trust as providing exactly those reasonable beliefs, at least when the trust is appropriately grounded. That is, we can understand 'appropriate trust' as obtaining when the trustor has justified beliefs that the trustee has suitable dispositions. Because trustors vary in their goals and values, and contexts of use vary in their openness, different specific versions of 'appropriate trust' result, as those variations lead to different focal dispositions, different specific dispositions, or different observability of dispositions, respectively. For example, in an open context (i.e., one where the possibilities cannot be exhaustively enumerated), the trustee's full dispositions will not be directly observable, but rather must be inferred from observations. This framework unifies the different theories of 'trust' developed in different disciplines. Moreover, it provides clarity about one key function of trust, and thereby helps us to understand the value of (appropriate) trust. We need to trust our AI systems because such trust is a precondition for their ethical, responsible use.
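To make the logical shape of that definition of 'appropriate trust' concrete, here is a minimal formal sketch; the predicate names (AppropriateTrust, JustifiedBelief, HasDisposition) and the indexing by context are assumptions introduced purely for illustration, not notation from the talk:

% Illustrative formalization only; predicates and indexing are assumed, not the author's notation.
% A = trustor, B = trustee, c = context of use, D_{A,c} = focal dispositions fixed by A's goals and values in c.
\[
  \mathrm{AppropriateTrust}(A, B, c) \;\iff\; \forall d \in D_{A,c} :\; \mathrm{JustifiedBelief}\bigl(A,\ \mathrm{HasDisposition}(B, d)\bigr)
\]

On this reading, variation in the trustor's goals and values changes which dispositions belong to D_{A,c}, while an open context is one in which D_{A,c} cannot be exhaustively enumerated, so the justification for each belief must be inferred from the trustee's observed behavior rather than read off directly.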