In Explainable Artificial Intelligence (XAI) research, various local model-agnostic methods have been proposed to explain individual predictions to users in order to increase the transparency of the underlying Artificial Intelligence (AI) systems. However, the user perspective has received less attention in XAI research, leading to a (1) lack of involvement of users in the design process of local model-agnostic explanations representations and (2) a limited understanding of how users visually attend them. Against this backdrop, we refined representations of local explanations from four well-established model-agnostic XAI methods in an iterative design process with users. Moreover, we evaluated the refined explanation representations in a laboratory experiment using eye-tracking technology as well as self-reports and interviews. Our results show that users do not necessarily prefer simple explanations and that their individual characteristics, such as gender and previous experience with AI systems, strongly influence their preferences. In addition, users find that some explanations are only useful in certain scenarios making the selection of an appropriate explanation highly dependent on context. With our work, we contribute to ongoing research to improve transparency in AI.
Word vector embeddings have been shown to contain and amplify biases in the data they are extracted from. Consequently, many techniques have been proposed to identify, mitigate, and attenuate these biases in word representations. In this paper, we utilize interactive visualization to increase the interpretability and accessibility of a collection of state-of-the-art debiasing techniques. To aid this, we present the Visualization of Embedding Representations for deBiasing (“VERB”) system, an open-source web-based visualization tool that helps users gain a technical understanding and visual intuition of the inner workings of debiasing techniques, with a focus on their geometric properties. In particular, VERB offers easy-to-follow examples that explore the effects of these debiasing techniques on the geometry of high-dimensional word vectors. To help understand how various debiasing techniques change the underlying geometry, VERB decomposes each technique into interpretable sequences of primitive transformations and highlights their effect on the word vectors using dimensionality reduction and interactive visual exploration. VERB is designed to target natural language processing (NLP) practitioners who are designing decision-making systems on top of word embeddings, and also researchers working with the fairness and ethics of machine learning systems in NLP. It can also serve as a visual medium for education, which helps an NLP novice understand and mitigate biases in word embeddings.
We present an approach that shows all relevant subspaces of categorical data condensed in a single picture. We model the categorical values of the attributes as co-occurrences with data partitions generated from structured data using pattern mining. We show that these co-occurrences are a-priori, allowing us to greatly reduce the search space, effectively generating the condensed picture where conventional approaches filter out several subspaces as these are deemed insignificant. The task of identifying interesting subspaces is common but difficult due to exponential search spaces and the curse of dimensionality. One application of such a task might be identifying a cohort of patients defined by attributes such as gender, age, and diabetes type that share a common patient history, which is modeled as event sequences. Filtering the data by these attributes is common but cumbersome and often does not allow a comparison of subspaces. We contribute a powerful multi-dimensional pattern exploration approach (MDPE-approach) agnostic to the structured data type that models multiple attributes and their characteristics as co-occurrences, allowing the user to identify and compare thousands of subspaces of interest in a single picture. In our MDPE-approach, we introduce two methods to dramatically reduce the search space, outputting only the boundaries of the search space in the form of two tables. We implement the MDPE-approach in an interactive visual interface (MDPE-vis) that provides a scalable, pixel-based visualization design allowing the identification, comparison, and sense-making of subspaces in structured data. Our case studies using a gold-standard dataset and external domain experts confirm our approach’s and implementation’s applicability. A third use case sheds light on the scalability of our approach and a user study with 15 participants underlines its usefulness and power.
Ethics of Artificial Intelligence (AI) is a growing research field that has emerged in response to the challenges related to AI. Transparency poses a key challenge for implementing AI ethics in practice. One solution to transparency issues is AI systems that can explain their decisions. Explainable AI (XAI) refers to AI systems that are interpretable or understandable to humans. The research fields of AI ethics and XAI lack a common framework and conceptualization. There is no clarity of the field’s depth and versatility. A systematic approach to understanding the corpus is needed. A systematic review offers an opportunity to detect research gaps and focus points. This paper presents the results of a systematic mapping study (SMS) of the research field of the Ethics of AI. The focus is on understanding the role of XAI and how the topic has been studied empirically. An SMS is a tool for performing a repeatable and continuable literature search. This paper contributes to the research field with a Systematic Map that visualizes what, how, when, and why XAI has been studied empirically in the field of AI ethics. The mapping reveals research gaps in the area. Empirical contributions are drawn from the analysis. The contributions are reflected on in regards to theoretical and practical implications. As the scope of the SMS is a broader research area of AI ethics the collected dataset opens possibilities to continue the mapping process in other directions.
Quality-sensitive applications of machine learning (ML) require quality assurance (QA) by humans before the predictions of an ML model can be deployed. QA for ML (QA4ML) interfaces require users to view a large amount of data and perform many interactions to correct errors made by the ML model. An optimized user interface (UI) can significantly reduce interaction costs. While UI optimization can be informed by user studies evaluating design options, this approach is not scalable because there are typically numerous small variations that can affect the efficiency of a QA4ML interface. Hence, we propose using simulation to evaluate and aid the optimization of QA4ML interfaces. In particular, we focus on simulating the combined effects of human intelligence in initiating appropriate interaction commands and machine intelligence in providing algorithmic assistance for accelerating QA4ML processes. As QA4ML is usually labor-intensive, we use the simulated task completion time as the metric for UI optimization under different interface and algorithm setups. We demonstrate the usage of this UI design method in several QA4ML applications.
Relating explicit psychological mechanisms and observable behaviours is a central aim of psychological and behavioural science. One of the challenges is to understand and model the role of consciousness and, in particular, its subjective perspective as an internal level of representation (including for social cognition) in the governance of behaviour. Toward this aim, we implemented the principles of the Projective Consciousness Model (PCM) into artificial agents embodied as virtual humans, extending a previous implementation of the model. Our goal was to offer a proof-of-concept, based purely on simulations, as a basis for a future methodological framework. Its overarching aim is to be able to assess hidden psychological parameters in human participants, based on a model relevant to consciousness research, in the context of experiments in virtual reality. As an illustration of the approach, we focused on simulating the role of Theory of Mind (ToM) in the choice of strategic behaviours of approach and avoidance to optimise the satisfaction of agents’ preferences. We designed a main experiment in a virtual environment that could be used with real humans, allowing us to classify behaviours as a function of order of ToM, up to the second order. We show that agents using the PCM demonstrated expected behaviours with consistent parameters of ToM in this experiment. We also show that the agents could be used to estimate correctly each other’s order of ToM. Furthermore, in a supplementary experiment, we demonstrated how the agents could simultaneously estimate order of ToM and preferences attributed to others to optimize behavioural outcomes. Future studies will empirically assess and fine tune the framework with real humans in virtual reality experiments.
Smart home environments are designed to provide services that help improve the quality of life for the occupant via a variety of sensors and actuators installed throughout the space. Many automated actions taken by a smart home are governed by the output of an underlying activity recognition system. However, activity recognition systems may not be perfectly accurate, and therefore inconsistencies in smart home operations can lead users reliant on smart home predictions to wonder “Why did the smart home do that?” In this work, we build on insights from Explainable Artificial Intelligence (XAI) techniques and introduce an explainable activity recognition framework in which we leverage leading XAI methods (Local Interpretable Model-agnostic Explanations, SHapley Additive exPlanations (SHAP), Anchors) to generate natural language explanations that explain what about an activity led to the given classification. We evaluate our framework in the context of a commonly targeted smart home scenario: autonomous remote caregiver monitoring for individuals who are living alone or need assistance. Within the context of remote caregiver monitoring, we perform a two-step evaluation: (a) utilize Machine Learning experts to assess the sensibility of explanations and (b) recruit non-experts in two user remote caregiver monitoring scenarios, synchronous and asynchronous, to assess the effectiveness of explanations generated via our framework. Our results show that the XAI approach, SHAP, has a 92% success rate in generating sensible explanations. Moreover, in 83% of sampled scenarios users preferred natural language explanations over a simple activity label, underscoring the need for explainable activity recognition systems. Finally, we show that explanations generated by some XAI methods can lead users to lose confidence in the accuracy of the underlying activity recognition model, while others lead users to gain confidence. Taking all studied factors into consideration, we make a recommendation regarding which existing XAI method leads to the best performance in the domain of smart home automation and discuss a range of topics for future work to further improve explainable activity recognition.