Approximately 50% of development resources are devoted to user interface (UI) development tasks [9]. Within this sizeable share of development effort, icon development can be particularly time-consuming, because developers need to consider not only effective implementation methods but also easy-to-understand descriptions. In this article, we present
Recent years have witnessed a growing literature on the empirical evaluation of explainable AI (XAI) methods. This study contributes to this ongoing conversation by comparing the effects of a set of established XAI methods in AI-assisted decision making. Based on our review of previous literature, we highlight three desirable properties that ideal AI explanations should satisfy: improving people’s understanding of the AI model, helping people recognize the model’s uncertainty, and supporting people’s calibrated trust in the model. Through three randomized controlled experiments, we evaluate whether four common model-agnostic XAI methods satisfy these properties on two types of AI models of varying complexity, and in two kinds of decision making contexts in which people perceive themselves as having different levels of domain expertise. Our results demonstrate that many AI explanations satisfy none of the desirable properties on decision making tasks in which people have little domain expertise. On decision making tasks in which people are more knowledgeable, the feature contribution explanation satisfies more of these desiderata, even when the AI model is inherently complex. We conclude by discussing the implications of our study for improving the design of XAI methods to better support human decision making, and for advancing more rigorous empirical evaluation of XAI methods.
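The abstract does not name the four model-agnostic methods beyond the feature contribution explanation. As a rough, hypothetical illustration of what a feature contribution explanation computes, the sketch below scores each feature of a single input by how much a black-box model’s prediction changes when that feature is ablated to a background mean; all function and variable names are ours, not the paper’s.

```python
import numpy as np

def feature_contributions(predict, x, background):
    """Occlusion-style feature contributions for one instance of a
    black-box model. `predict` maps an (n, d) array to (n,) scores;
    `x` is a single (d,) instance; `background` is an (m, d) sample
    of training data that supplies "neutral" replacement values."""
    baseline = background.mean(axis=0)
    base_pred = predict(x[None, :])[0]
    scores = np.zeros(x.shape[0])
    for j in range(x.shape[0]):
        x_ablated = x.copy()
        x_ablated[j] = baseline[j]                 # ablate feature j
        scores[j] = base_pred - predict(x_ablated[None, :])[0]
    return scores  # positive score = feature pushed the prediction up
```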
Semantically rich information from multiple modalities (text, code, images, categorical and numerical data) co-exists in the user interface (UI) design of mobile applications. Moreover, each UI design is composed of inter-linked UI entities that support different functions of an application, e.g., a UI screen comprising a UI taskbar, a menu, and multiple button elements. Existing UI representation learning methods, unfortunately, are not designed to capture the multi-modal attributes of UI entities and the linkage structure between them. To support effective search and recommendation applications over mobile UIs, we need UI representations that integrate the latent semantics present in both multi-modal information and linkages between UI entities. In this article, we present a novel self-supervised model: the Multi-modal Attention-based Attributed Network Embedding (MAAN) model. MAAN is designed to capture both the structural network information present in the linkages between UI entities and the multi-modal attributes of the UI entity nodes. Based on the variational autoencoder framework, MAAN learns semantically rich UI embeddings in a self-supervised manner by reconstructing the attributes of UI entities and the linkages between them. The generated embeddings can be applied to a variety of downstream tasks: predicting UI elements associated with UI screens, inferring missing UI screen and element attributes, predicting UI user ratings, and retrieving UIs. Extensive experiments, including user evaluations, conducted on datasets from RICO, a rich real-world mobile UI repository, demonstrate that MAAN outperforms other state-of-the-art models. The number of linkages between UI entities, an edge attribute, can provide further information on the roles of different UI entities in UI designs; MAAN, however, does not capture edge attributes. To extend and generalize MAAN toward even richer UI embeddings, we further propose EMAAN, which captures edge attributes. Additional extensive experiments on EMAAN show that it improves on the performance of MAAN and similarly outperforms state-of-the-art models.
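MAAN’s exact architecture, including its attention mechanism and multi-modal encoders, is not detailed in the abstract. The sketch below is a minimal, illustrative variational graph autoencoder in PyTorch that captures only the two reconstruction targets the abstract describes, node attributes and linkages; all class and variable names are our own.

```python
import torch
import torch.nn as nn

class AttributedGraphVAE(nn.Module):
    """Toy VAE over an attributed graph: encodes nodes from their (fused)
    multi-modal features plus neighborhood structure, then reconstructs
    both node attributes and the adjacency (linkages)."""
    def __init__(self, in_dim, hid_dim, z_dim):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(in_dim, hid_dim), nn.ReLU())
        self.mu = nn.Linear(hid_dim, z_dim)
        self.logvar = nn.Linear(hid_dim, z_dim)
        self.attr_dec = nn.Linear(z_dim, in_dim)

    def forward(self, x, adj):
        h = self.enc(adj @ x)                    # aggregate neighbor features
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        x_hat = self.attr_dec(z)                 # attribute reconstruction
        a_hat = torch.sigmoid(z @ z.t())         # linkage reconstruction
        return x_hat, a_hat, mu, logvar          # train with recon + KL losses
```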
We propose a new method for generating explanations with Artificial Intelligence (AI) and a tool to test its expressive power within a user interface. To bridge the gap between philosophy and human-computer interfaces, we present a new approach for generating interactive explanations based on a sophisticated pipeline of AI algorithms that structures natural language documents into knowledge graphs and answers questions effectively and satisfactorily. With this work, we aim to show that the philosophical theory of explanations presented by Achinstein can indeed be adapted for implementation in a concrete software application, as an interactive and illocutionary process of answering questions. Specifically, our contribution is an approach that frames illocution in a computer-friendly way, achieving user-centrality with statistical question answering. We frame the illocution of an explanatory process as the mechanism responsible for anticipating the needs of the explainee in the form of unposed, implicit, archetypal questions, thereby improving the user-centrality of the underlying explanatory process. We therefore hypothesise that if an explanatory process is an illocutionary act of providing content-giving answers to questions, and illocution is as we define it, then the more explicit and implicit questions an explanatory tool can answer, the more usable (as per ISO 9241-210) its explanations are. We tested our hypothesis with a user study involving more than 60 participants on two XAI-based systems, one for credit approval (finance) and one for heart disease prediction (healthcare). The results showed that increasing the illocutionary power of an explanatory tool can produce statistically significant improvements (P < .05) in effectiveness. This, combined with a visible alignment between the increments in effectiveness and satisfaction, suggests that our understanding of illocution is correct, providing evidence in favour of our theory.
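As a toy illustration of the illocutionary mechanism described above, answering implicit, archetypal questions alongside the one the user actually posed, the following sketch retrieves the best-matching passage for each question with TF-IDF similarity. The archetype list and the retrieval method are our simplifications, not the paper’s pipeline, which structures documents into knowledge graphs.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Archetypal questions answered pre-emptively (an illustrative set, not the paper's).
ARCHETYPES = ["Why was this decision made?",
              "What factors mattered most?",
              "What could change the outcome?"]

def explain(user_question, passages):
    """Answer the explicit question plus the implicit, archetypal ones
    by retrieving the best-matching passage for each."""
    questions = [user_question] + ARCHETYPES
    vec = TfidfVectorizer().fit(passages + questions)
    sims = cosine_similarity(vec.transform(questions), vec.transform(passages))
    return {q: passages[sims[i].argmax()] for i, q in enumerate(questions)}
```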
Online research is a frequent and important activity people perform on the Internet, yet current support for this task is basic, fragmented, and not well integrated into web browser experiences. Guided by sensemaking theory, we present ForSense, a browser extension for accelerating people’s online research experience. The two primary sources of novelty in ForSense are the integration of multiple stages of online research and the provision of machine assistance to the user by leveraging recent advances in neural-driven machine reading. We use ForSense as a design probe to explore (1) the benefits of integrating multiple stages of online research, (2) the opportunities to accelerate online research using current advances in machine reading, (3) the opportunities to support online research tasks in the presence of imprecise machine suggestions, and (4) insights about the behaviors people exhibit when performing online research, the pages they visit, and the artifacts they create. Through our design probe, we observe people performing online research tasks and see that they benefit from ForSense’s integration and machine support for online research. From the information and insights we collected, we derive and share key recommendations for designing and supporting imprecise machine assistance for research tasks.
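ForSense’s machine-reading component is not specified in the abstract. As a minimal sketch of how neural machine reading can surface suggestions from a visited page, the snippet below uses the Hugging Face transformers question-answering pipeline; the model choice and the research question are our hypothetical illustrations, not ForSense’s actual implementation.

```python
from transformers import pipeline

# Extractive QA over the text of a visited page (model choice is illustrative).
reader = pipeline("question-answering",
                  model="distilbert-base-cased-distilled-squad")

page_text = "..."  # text extracted from the page the user is reading
suggestion = reader(question="What are the main causes discussed?",
                    context=page_text)
print(suggestion["answer"], f"(score: {suggestion['score']:.2f})")
# Low-score answers can be flagged in the UI as imprecise suggestions.
```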
People spend an enormous amount of time and effort looking for lost objects. To help, various computational systems have been developed that provide information on the locations of objects. However, prior systems for assisting people in finding objects require users to register the target objects in advance. This requirement imposes a cumbersome burden on users, and such systems cannot help people find objects they did not anticipate losing. We propose GO-Finder (“Generic Object Finder”), a registration-free wearable-camera-based system for assisting people in finding an arbitrary number of objects, based on two key features: automatic discovery of hand-held objects and image-based candidate selection. Given video taken from a wearable camera, GO-Finder automatically detects and groups hand-held objects to form a visual timeline of the objects. Users can retrieve the last appearance of an object by browsing the timeline through a smartphone app. We conducted user studies to investigate how users benefit from GO-Finder. In the first study, we asked participants to perform an object retrieval task and confirmed that providing clear visual cues on object locations improves accuracy and reduces mental load in the object search task. In the second study, we verified the system’s usability in a longer and more realistic scenario, accompanied by an additional feature of context-based candidate filtering. Participant feedback suggested that GO-Finder is also useful in realistic scenarios where more than one hundred objects appear.
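The abstract leaves the discovery pipeline at a high level. Assuming an upstream hand-object detector and tracker that emit per-frame detections with track identities, a minimal sketch of the timeline-building step might look like the following; the Detection fields and function are our own illustration, not GO-Finder’s implementation.

```python
from dataclasses import dataclass
from typing import Any, Iterable, List

@dataclass
class Detection:
    frame: int      # video frame index
    track_id: int   # object identity from an upstream hand-object tracker
    crop: Any       # image patch of the hand-held object

def build_timeline(detections: Iterable[Detection]) -> List[Detection]:
    """Collapse per-frame detections into one entry per object, keeping
    the last appearance so the app can show where each object was seen."""
    last_seen = {}
    for det in sorted(detections, key=lambda d: d.frame):
        last_seen[det.track_id] = det  # later frames overwrite earlier ones
    return sorted(last_seen.values(), key=lambda d: d.frame)
```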
Open-domain chatbots engage with users in natural conversations to socialize and establish bonds. However, designing and developing an effective open-domain chatbot is challenging, and it is unclear which qualities of a chatbot best match users’ expectations and preferences. Even though existing work has considered a wide range of aspects, some key components are still missing. For example, the role of a chatbot’s ability to communicate with humans at the emotional level remains an open subject of study. Furthermore, these qualities are likely to span several dimensions, so it is crucial to understand how the different qualities relate to and interact with each other, and what the core aspects are. To this end, we first designed an exploratory user study aimed at gaining a basic understanding of the desired qualities of chatbots, with a special focus on their emotional intelligence. Using the findings from the first study, we constructed a model of the desired traits by carefully selecting a set of features. With the help of a large-scale survey and structural equation modeling, we then validated the model using the data collected from the survey. The final outcome is the PEACE model (Politeness, Entertainment, Attentive Curiosity, and Empathy). By analyzing the dependencies between the different PEACE constructs, we shed light on the importance of, and interplay between, the chatbots’ qualities, and on the effect of users’ attitudes and concerns on their expectations of the technology. Not only does PEACE define the key ingredients of the social qualities of a chatbot, it also helped us derive a set of design implications useful for the development of socially adequate and emotionally aware open-domain chatbots.
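The abstract names structural equation modeling but not the concrete specification. Below is a minimal sketch of how a PEACE-style measurement model might be specified with the semopy Python library; the indicator names, the structural path, and the data file are invented for illustration and are not the paper’s items.

```python
import pandas as pd
from semopy import Model  # SEM library with lavaan-style model syntax

# Hypothetical measurement and structural model for the PEACE constructs.
PEACE_SPEC = """
# measurement model (indicator names p1, e1, ... are illustrative)
Politeness         =~ p1 + p2 + p3
Entertainment      =~ e1 + e2 + e3
AttentiveCuriosity =~ c1 + c2 + c3
Empathy            =~ m1 + m2 + m3
# one illustrative structural path between constructs
Empathy ~ AttentiveCuriosity
"""

survey = pd.read_csv("survey_responses.csv")  # one column per indicator
model = Model(PEACE_SPEC)
model.fit(survey)
print(model.inspect())  # loadings, path coefficients, p-values
```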