Recent advances in machine learning (ML) and natural language processing (NLP) have led to significant improvement in natural language interfaces for structured databases (NL2SQL). Despite the great strides, the overall accuracy of NL2SQL models is still far from being perfect (∼ 75% on the Spider benchmark). In practice, this requires users to discern incorrect SQL queries generated by a model and manually fix them when using NL2SQL models. Currently, there is a lack of comprehensive understanding about the common errors in auto-generated SQLs and the effective strategies to recognize and fix such errors. To bridge the gap, we (1) performed an in-depth analysis of errors made by three state-of-the-art NL2SQL models; (2) distilled a taxonomy of NL2SQL model errors; and (3) conducted a within-subjects user study with 26 participants to investigate the effectiveness of three representative interactive mechanisms for error discovery and repair in NL2SQL. Findings from this paper shed light on the design of future error discovery and repair strategies for natural language data query interfaces.
{"title":"An Empirical Study of Model Errors and User Error Discovery and Repair Strategies in Natural Language Database Queries","authors":"Zheng Ning, Zheng Zhang, Tianyi Sun, Yuan Tian, Tianyi Zhang, Toby Jia-Jun Li","doi":"10.1145/3581641.3584067","DOIUrl":"https://doi.org/10.1145/3581641.3584067","url":null,"abstract":"Recent advances in machine learning (ML) and natural language processing (NLP) have led to significant improvement in natural language interfaces for structured databases (NL2SQL). Despite the great strides, the overall accuracy of NL2SQL models is still far from being perfect (∼ 75% on the Spider benchmark). In practice, this requires users to discern incorrect SQL queries generated by a model and manually fix them when using NL2SQL models. Currently, there is a lack of comprehensive understanding about the common errors in auto-generated SQLs and the effective strategies to recognize and fix such errors. To bridge the gap, we (1) performed an in-depth analysis of errors made by three state-of-the-art NL2SQL models; (2) distilled a taxonomy of NL2SQL model errors; and (3) conducted a within-subjects user study with 26 participants to investigate the effectiveness of three representative interactive mechanisms for error discovery and repair in NL2SQL. Findings from this paper shed light on the design of future error discovery and repair strategies for natural language data query interfaces.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"21 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134274027","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Jeesu Jung, H. Seo, S. Jung, Riwoo Chung, Hwijung Ryu, Du-Seong Chang
Summarization is one of the important tasks of natural language processing used to distill information. Recently, the sequence-to-sequence method was applied, in a general manner, to summarization tasks. The problem is that a large amount of information must be pre-trained for a specific domain, and information other than input statements cannot be utilized. To compensate for this shortcoming, controllable summarization has recently been in the spotlight. We introduced three properties into controllable summarization: 1) a new human-machine communication input format, 2) a robust constraint-sensitive summarization method for these formats, and 3) a practical interactive summarization interface available to the user. Experiments on the Wizard-of-Wikipedia dataset show that applying this input format and the constraint-sensitive method enhances summarization performance compared to the typical method. A user study shows that the interactive summarization interface is practical and that participants are evaluating it positively.
{"title":"Interactive User Interface for Dialogue Summarization","authors":"Jeesu Jung, H. Seo, S. Jung, Riwoo Chung, Hwijung Ryu, Du-Seong Chang","doi":"10.1145/3581641.3584057","DOIUrl":"https://doi.org/10.1145/3581641.3584057","url":null,"abstract":"Summarization is one of the important tasks of natural language processing used to distill information. Recently, the sequence-to-sequence method was applied, in a general manner, to summarization tasks. The problem is that a large amount of information must be pre-trained for a specific domain, and information other than input statements cannot be utilized. To compensate for this shortcoming, controllable summarization has recently been in the spotlight. We introduced three properties into controllable summarization: 1) a new human-machine communication input format, 2) a robust constraint-sensitive summarization method for these formats, and 3) a practical interactive summarization interface available to the user. Experiments on the Wizard-of-Wikipedia dataset show that applying this input format and the constraint-sensitive method enhances summarization performance compared to the typical method. A user study shows that the interactive summarization interface is practical and that participants are evaluating it positively.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114797827","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yu Liang, Aditya Ponnada, Paul Lamere, Nediyana Daskalova
Content recommender systems often rely on modeling users’ past behavioral data to provide personalized recommendations - a practice that works well for suggesting more of the same and for media that require little time investment from users, such as music tracks. However, this approach can be further optimized for media where the user investment is higher, such as podcasts, because there is a broader space of user goals that might not be captured by the implicit signals of their past behavior. Allowing users to directly specify their goals might help narrow the space of possible recommendations. Thus, in this paper, we explore how we can enable goal-focused exploration in recommender systems by leveraging explicit input from users about their personal goals. Using podcast consumption as an example use-case, and informed by a large-scale survey (N=68k), we developed GoalPods, an interactive prototype that allows users to set personal goals and build playlists of podcast episode recommendations to meet those goals. We evaluated GoalPods with 14 participants where participants set a goal and spent a week listening to the episode playlist created for that goal. From the study, we identified two types of user goals: low-involvement (e.g. “combat boredom”) and high-involvement (e.g. “learn something new”) goals. Users found it easy to identify relevant recommendations for low-involvement goals, but they needed more structure and support to set high-involvement goals. By anchoring users on their personal goals to explore recommendations, GoalPods (and goal-focused podcast consumption) led to insightful content discovery outside the users’ filter bubbles. Based on our findings, we discuss opportunities for designing recommender systems that guide exploration via interactive goal-setting as well as implications for providing better recommendations by accounting for users’ personal goals.
{"title":"Enabling Goal-Focused Exploration of Podcasts in Interactive Recommender Systems","authors":"Yu Liang, Aditya Ponnada, Paul Lamere, Nediyana Daskalova","doi":"10.1145/3581641.3584032","DOIUrl":"https://doi.org/10.1145/3581641.3584032","url":null,"abstract":"Content recommender systems often rely on modeling users’ past behavioral data to provide personalized recommendations - a practice that works well for suggesting more of the same and for media that require little time investment from users, such as music tracks. However, this approach can be further optimized for media where the user investment is higher, such as podcasts, because there is a broader space of user goals that might not be captured by the implicit signals of their past behavior. Allowing users to directly specify their goals might help narrow the space of possible recommendations. Thus, in this paper, we explore how we can enable goal-focused exploration in recommender systems by leveraging explicit input from users about their personal goals. Using podcast consumption as an example use-case, and informed by a large-scale survey (N=68k), we developed GoalPods, an interactive prototype that allows users to set personal goals and build playlists of podcast episode recommendations to meet those goals. We evaluated GoalPods with 14 participants where participants set a goal and spent a week listening to the episode playlist created for that goal. From the study, we identified two types of user goals: low-involvement (e.g. “combat boredom”) and high-involvement (e.g. “learn something new”) goals. Users found it easy to identify relevant recommendations for low-involvement goals, but they needed more structure and support to set high-involvement goals. By anchoring users on their personal goals to explore recommendations, GoalPods (and goal-focused podcast consumption) led to insightful content discovery outside the users’ filter bubbles. Based on our findings, we discuss opportunities for designing recommender systems that guide exploration via interactive goal-setting as well as implications for providing better recommendations by accounting for users’ personal goals.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"30 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123901851","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Y. Prakash, Mohan Sunkara, H. Lee, S. Jayarathna, V. Ashok
Web data items such as shopping products, classifieds, and job listings are indispensable components of most e-commerce websites. The information on the data items are typically distributed over two or more webpages, e.g., a ‘Query-Results’ page showing the summaries of the items, and ‘Details’ pages containing full information about the items. While this organization of data mitigates information overload and visual cluttering for sighted users, it however increases the interaction overhead and effort for blind users, as back-and-forth navigation between webpages using screen reader assistive technology is tedious and cumbersome. Existing usability-enhancing solutions are unable to provide adequate support in this regard as they predominantly focus on enabling efficient content access within a single webpage, and as such are not tailored for content distributed across multiple webpages. As an initial step towards addressing this issue, we developed AutoDesc, a browser extension that leverages a custom extraction model to automatically detect and pull out additional item descriptions from the ‘details’ pages, and then proactively inject the extracted information into the ‘Query-Results’ page, thereby reducing the amount of back-and-forth screen reader navigation between the two webpages. In a study with 16 blind users, we observed that within the same time duration, the participants were able to peruse significantly more data items on average with AutoDesc, compared to that with their preferred screen readers as well as with a state-of-the-art solution.
{"title":"AutoDesc: Facilitating Convenient Perusal of Web Data Items for Blind Users","authors":"Y. Prakash, Mohan Sunkara, H. Lee, S. Jayarathna, V. Ashok","doi":"10.1145/3581641.3584049","DOIUrl":"https://doi.org/10.1145/3581641.3584049","url":null,"abstract":"Web data items such as shopping products, classifieds, and job listings are indispensable components of most e-commerce websites. The information on the data items are typically distributed over two or more webpages, e.g., a ‘Query-Results’ page showing the summaries of the items, and ‘Details’ pages containing full information about the items. While this organization of data mitigates information overload and visual cluttering for sighted users, it however increases the interaction overhead and effort for blind users, as back-and-forth navigation between webpages using screen reader assistive technology is tedious and cumbersome. Existing usability-enhancing solutions are unable to provide adequate support in this regard as they predominantly focus on enabling efficient content access within a single webpage, and as such are not tailored for content distributed across multiple webpages. As an initial step towards addressing this issue, we developed AutoDesc, a browser extension that leverages a custom extraction model to automatically detect and pull out additional item descriptions from the ‘details’ pages, and then proactively inject the extracted information into the ‘Query-Results’ page, thereby reducing the amount of back-and-forth screen reader navigation between the two webpages. In a study with 16 blind users, we observed that within the same time duration, the participants were able to peruse significantly more data items on average with AutoDesc, compared to that with their preferred screen readers as well as with a state-of-the-art solution.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"14 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122297904","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Patrick Hemmer, Monika Westphal, M. Schemmer, S. Vetter, Michael Vossing, G. Satzger
Recent work has proposed artificial intelligence (AI) models that can learn to decide whether to make a prediction for an instance of a task or to delegate it to a human by considering both parties’ capabilities. In simulations with synthetically generated or context-independent human predictions, delegation can help improve the performance of human-AI teams—compared to humans or the AI model completing the task alone. However, so far, it remains unclear how humans perform and how they perceive the task when they are aware that an AI model delegated task instances to them. In an experimental study with 196 participants, we show that task performance and task satisfaction improve through AI delegation, regardless of whether humans are aware of the delegation. Additionally, we identify humans’ increased levels of self-efficacy as the underlying mechanism for these improvements in performance and satisfaction. Our findings provide initial evidence that allowing AI models to take over more management responsibilities can be an effective form of human-AI collaboration in workplaces.
{"title":"Human-AI Collaboration: The Effect of AI Delegation on Human Task Performance and Task Satisfaction","authors":"Patrick Hemmer, Monika Westphal, M. Schemmer, S. Vetter, Michael Vossing, G. Satzger","doi":"10.1145/3581641.3584052","DOIUrl":"https://doi.org/10.1145/3581641.3584052","url":null,"abstract":"Recent work has proposed artificial intelligence (AI) models that can learn to decide whether to make a prediction for an instance of a task or to delegate it to a human by considering both parties’ capabilities. In simulations with synthetically generated or context-independent human predictions, delegation can help improve the performance of human-AI teams—compared to humans or the AI model completing the task alone. However, so far, it remains unclear how humans perform and how they perceive the task when they are aware that an AI model delegated task instances to them. In an experimental study with 196 participants, we show that task performance and task satisfaction improve through AI delegation, regardless of whether humans are aware of the delegation. Additionally, we identify humans’ increased levels of self-efficacy as the underlying mechanism for these improvements in performance and satisfaction. Our findings provide initial evidence that allowing AI models to take over more management responsibilities can be an effective form of human-AI collaboration in workplaces.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"42 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115810634","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
AI-driven Action Quality Assessment (AQA) of sports videos can mimic Olympic judges to help score performances as a second opinion or for training. However, these AI methods are uninterpretable and do not justify their scores, which is important for algorithmic accountability. Indeed, to account for their decisions, instead of scoring subjectively, sports judges use a consistent set of criteria — rubric — on multiple actions in each performance sequence. Therefore, we propose IRIS to perform Interpretable Rubric-Informed Segmentation on action sequences for AQA. We investigated IRIS for scoring videos of figure skating performance. IRIS predicts (1) action segments, (2) technical element score differences of each segment relative to base scores, (3) multiple program component scores, and (4) the summed final score. In a modeling study, we found that IRIS performs better than non-interpretable, state-of-the-art models. In a formative user study, practicing figure skaters agreed with the rubric-informed explanations, found them useful, and trusted AI judgments more. This work highlights the importance of using judgment rubrics to account for AI decisions.
{"title":"IRIS: Interpretable Rubric-Informed Segmentation for Action Quality Assessment","authors":"Hitoshi Matsuyama, Nobuo Kawaguchi, Brian Y. Lim","doi":"10.1145/3581641.3584048","DOIUrl":"https://doi.org/10.1145/3581641.3584048","url":null,"abstract":"AI-driven Action Quality Assessment (AQA) of sports videos can mimic Olympic judges to help score performances as a second opinion or for training. However, these AI methods are uninterpretable and do not justify their scores, which is important for algorithmic accountability. Indeed, to account for their decisions, instead of scoring subjectively, sports judges use a consistent set of criteria — rubric — on multiple actions in each performance sequence. Therefore, we propose IRIS to perform Interpretable Rubric-Informed Segmentation on action sequences for AQA. We investigated IRIS for scoring videos of figure skating performance. IRIS predicts (1) action segments, (2) technical element score differences of each segment relative to base scores, (3) multiple program component scores, and (4) the summed final score. In a modeling study, we found that IRIS performs better than non-interpretable, state-of-the-art models. In a formative user study, practicing figure skaters agreed with the rubric-informed explanations, found them useful, and trusted AI judgments more. This work highlights the importance of using judgment rubrics to account for AI decisions.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130324258","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Researchers have widely acknowledged the potential of control mechanisms with which end-users of recommender systems can better tailor recommendations. However, few e-learning environments so far incorporate such mechanisms, for example for steering recommended exercises. In addition, studies with adolescents in this context are rare. To address these limitations, we designed a control mechanism and a visualisation of the control’s impact through an iterative design process with adolescents and teachers. Then, we investigated how these functionalities affect adolescents’ trust in an e-learning platform that recommends maths exercises. A randomised controlled experiment with 76 middle school and high school adolescents showed that visualising the impact of exercised control significantly increases trust. Furthermore, having control over their mastery level seemed to inspire adolescents to reasonably challenge themselves and reflect upon the underlying recommendation algorithm. Finally, a significant increase in perceived transparency suggested that visualising steering actions can indirectly explain why recommendations are suitable, which opens interesting research tracks for the broader field of explainable AI.
{"title":"Steering Recommendations and Visualising Its Impact: Effects on Adolescents’ Trust in E-Learning Platforms","authors":"Jeroen Ooge, L. Dereu, K. Verbert","doi":"10.1145/3581641.3584046","DOIUrl":"https://doi.org/10.1145/3581641.3584046","url":null,"abstract":"Researchers have widely acknowledged the potential of control mechanisms with which end-users of recommender systems can better tailor recommendations. However, few e-learning environments so far incorporate such mechanisms, for example for steering recommended exercises. In addition, studies with adolescents in this context are rare. To address these limitations, we designed a control mechanism and a visualisation of the control’s impact through an iterative design process with adolescents and teachers. Then, we investigated how these functionalities affect adolescents’ trust in an e-learning platform that recommends maths exercises. A randomised controlled experiment with 76 middle school and high school adolescents showed that visualising the impact of exercised control significantly increases trust. Furthermore, having control over their mastery level seemed to inspire adolescents to reasonably challenge themselves and reflect upon the underlying recommendation algorithm. Finally, a significant increase in perceived transparency suggested that visualising steering actions can indirectly explain why recommendations are suitable, which opens interesting research tracks for the broader field of explainable AI.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"36 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-28","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131986331","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
UX practitioners face novel challenges when designing user interfaces for machine learning (ML)-enabled applications. Interactive ML paradigms, like AutoML and interactive machine teaching, lower the barrier for non-expert end users to create, understand, and use ML models, but their application to UX practice is largely unstudied. We conducted a task-based design study with 27 UX practitioners where we asked them to propose a proof-of-concept design for a new ML-enabled application. During the task, our participants were given opportunities to create, test, and modify ML models as part of their workflows. Through a qualitative analysis of our post-task interview, we found that direct, interactive experimentation with ML allowed UX practitioners to tie ML capabilities and underlying data to user goals, compose affordances to enhance end-user interactions with ML, and identify ML-related ethical risks and challenges. We discuss our findings in the context of previously established human-AI guidelines. We also identify some limitations of interactive ML in UX processes and propose research-informed machine teaching as a supplement to future design tools alongside interactive ML.
{"title":"Addressing UX Practitioners’ Challenges in Designing ML Applications: an Interactive Machine Learning Approach","authors":"K. J. Kevin Feng, David W. Mcdonald","doi":"10.1145/3581641.3584064","DOIUrl":"https://doi.org/10.1145/3581641.3584064","url":null,"abstract":"UX practitioners face novel challenges when designing user interfaces for machine learning (ML)-enabled applications. Interactive ML paradigms, like AutoML and interactive machine teaching, lower the barrier for non-expert end users to create, understand, and use ML models, but their application to UX practice is largely unstudied. We conducted a task-based design study with 27 UX practitioners where we asked them to propose a proof-of-concept design for a new ML-enabled application. During the task, our participants were given opportunities to create, test, and modify ML models as part of their workflows. Through a qualitative analysis of our post-task interview, we found that direct, interactive experimentation with ML allowed UX practitioners to tie ML capabilities and underlying data to user goals, compose affordances to enhance end-user interactions with ML, and identify ML-related ethical risks and challenges. We discuss our findings in the context of previously established human-AI guidelines. We also identify some limitations of interactive ML in UX processes and propose research-informed machine teaching as a supplement to future design tools alongside interactive ML.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"54 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128260688","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Aditya Bhattacharya, Jeroen Ooge, G. Štiglic, K. Verbert
Explainable artificial intelligence is increasingly used in machine learning (ML) based decision-making systems in healthcare. However, little research has compared the utility of different explanation methods in guiding healthcare experts for patient care. Moreover, it is unclear how useful, understandable, actionable and trustworthy these methods are for healthcare experts, as they often require technical ML knowledge. This paper presents an explanation dashboard that predicts the risk of diabetes onset and explains those predictions with data-centric, feature-importance, and example-based explanations. We designed an interactive dashboard to assist healthcare experts, such as nurses and physicians, in monitoring the risk of diabetes onset and recommending measures to minimize risk. We conducted a qualitative study with 11 healthcare experts and a mixed-methods study with 45 healthcare experts and 51 diabetic patients to compare the different explanation methods in our dashboard in terms of understandability, usefulness, actionability, and trust. Results indicate that our participants preferred our representation of data-centric explanations that provide local explanations with a global overview over other methods. Therefore, this paper highlights the importance of visually directive data-centric explanation method for assisting healthcare experts to gain actionable insights from patient health records. Furthermore, we share our design implications for tailoring the visual representation of different explanation methods for healthcare experts.
{"title":"Directive Explanations for Monitoring the Risk of Diabetes Onset: Introducing Directive Data-Centric Explanations and Combinations to Support What-If Explorations","authors":"Aditya Bhattacharya, Jeroen Ooge, G. Štiglic, K. Verbert","doi":"10.1145/3581641.3584075","DOIUrl":"https://doi.org/10.1145/3581641.3584075","url":null,"abstract":"Explainable artificial intelligence is increasingly used in machine learning (ML) based decision-making systems in healthcare. However, little research has compared the utility of different explanation methods in guiding healthcare experts for patient care. Moreover, it is unclear how useful, understandable, actionable and trustworthy these methods are for healthcare experts, as they often require technical ML knowledge. This paper presents an explanation dashboard that predicts the risk of diabetes onset and explains those predictions with data-centric, feature-importance, and example-based explanations. We designed an interactive dashboard to assist healthcare experts, such as nurses and physicians, in monitoring the risk of diabetes onset and recommending measures to minimize risk. We conducted a qualitative study with 11 healthcare experts and a mixed-methods study with 45 healthcare experts and 51 diabetic patients to compare the different explanation methods in our dashboard in terms of understandability, usefulness, actionability, and trust. Results indicate that our participants preferred our representation of data-centric explanations that provide local explanations with a global overview over other methods. Therefore, this paper highlights the importance of visually directive data-centric explanation method for assisting healthcare experts to gain actionable insights from patient health records. Furthermore, we share our design implications for tailoring the visual representation of different explanation methods for healthcare experts.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"134 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123219803","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
D. Weidele, S. Afzal, Abel N. Valente, Cole Makuch, Owen Cornec, Long Vu, D. Subramanian, Werner Geyer, Rahul Nair, Inge Vejsbjerg, Radu Marinescu, Paulito Palmes, Elizabeth M. Daly, Loraine Franke, D. Haehn
We present AutoDOViz, an interactive user interface for automated decision optimization (AutoDO) using reinforcement learning (RL). Decision optimization (DO) has classically being practiced by dedicated DO researchers [43] where experts need to spend long periods of time fine tuning a solution through trial-and-error. AutoML pipeline search has sought to make it easier for a data scientist to find the best machine learning pipeline by leveraging automation to search and tune the solution. More recently, these advances have been applied to the domain of AutoDO [36], with a similar goal to find the best reinforcement learning pipeline through algorithm selection and parameter tuning. However, Decision Optimization requires significantly more complex problem specification when compared to an ML problem. AutoDOViz seeks to lower the barrier of entry for data scientists in problem specification for reinforcement learning problems, leverage the benefits of AutoDO algorithms for RL pipeline search and finally, create visualizations and policy insights in order to facilitate the typical interactive nature when communicating problem formulation and solution proposals between DO experts and domain experts. In this paper, we report our findings from semi-structured expert interviews with DO practitioners as well as business consultants, leading to design requirements for human-centered automation for DO with RL. We evaluate a system implementation with data scientists and find that they are significantly more open to engage in DO after using our proposed solution. AutoDOViz further increases trust in RL agent models and makes the automated training and evaluation process more comprehensible. As shown for other automation in ML tasks [33, 59], we also conclude automation of RL for DO can benefit from user and vice-versa when the interface promotes human-in-the-loop.
{"title":"AutoDOViz: Human-Centered Automation for Decision Optimization","authors":"D. Weidele, S. Afzal, Abel N. Valente, Cole Makuch, Owen Cornec, Long Vu, D. Subramanian, Werner Geyer, Rahul Nair, Inge Vejsbjerg, Radu Marinescu, Paulito Palmes, Elizabeth M. Daly, Loraine Franke, D. Haehn","doi":"10.1145/3581641.3584094","DOIUrl":"https://doi.org/10.1145/3581641.3584094","url":null,"abstract":"We present AutoDOViz, an interactive user interface for automated decision optimization (AutoDO) using reinforcement learning (RL). Decision optimization (DO) has classically being practiced by dedicated DO researchers [43] where experts need to spend long periods of time fine tuning a solution through trial-and-error. AutoML pipeline search has sought to make it easier for a data scientist to find the best machine learning pipeline by leveraging automation to search and tune the solution. More recently, these advances have been applied to the domain of AutoDO [36], with a similar goal to find the best reinforcement learning pipeline through algorithm selection and parameter tuning. However, Decision Optimization requires significantly more complex problem specification when compared to an ML problem. AutoDOViz seeks to lower the barrier of entry for data scientists in problem specification for reinforcement learning problems, leverage the benefits of AutoDO algorithms for RL pipeline search and finally, create visualizations and policy insights in order to facilitate the typical interactive nature when communicating problem formulation and solution proposals between DO experts and domain experts. In this paper, we report our findings from semi-structured expert interviews with DO practitioners as well as business consultants, leading to design requirements for human-centered automation for DO with RL. We evaluate a system implementation with data scientists and find that they are significantly more open to engage in DO after using our proposed solution. AutoDOViz further increases trust in RL agent models and makes the automated training and evaluation process more comprehensible. As shown for other automation in ML tasks [33, 59], we also conclude automation of RL for DO can benefit from user and vice-versa when the interface promotes human-in-the-loop.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"388 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-02-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124810355","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}