Hyung-Kwon Ko, Gwanmo Park, Hyeon Jeon, Jaemin Jo, Juho Kim, Jinwook Seo
Large-scale Text-to-image Generation Models (LTGMs) (e.g., DALL-E), self-supervised deep learning models trained on a huge dataset, have demonstrated the capacity to generate high-quality open-domain images from multi-modal input. Although they can even produce anthropomorphized versions of objects and animals, combine unrelated concepts in reasonable ways, and produce variations of any user-provided image, we observed that such rapid technological advancement has left many visual artists unsure how to leverage LTGMs more actively in their creative works. Our goal in this work is to understand how visual artists would adopt LTGMs to support their creative works. To this end, we conducted an interview study as well as a systematic literature review of 72 system/application papers. A total of 28 visual artists covering 35 distinct visual art domains acknowledged LTGMs’ versatile roles and high usability in supporting creative works by automating the creation process (i.e., automation), expanding their ideas (i.e., exploration), and facilitating or arbitrating communication (i.e., mediation). We conclude by providing four design guidelines that future researchers can refer to when building intelligent user interfaces with LTGMs.
Large-scale Text-to-Image Generation Models for Visual Artists’ Creative Works. In Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641.3584078
Advait Bhat, Saaket Agashe, Niharika Mohile, Parth Oberoi, R. Jangir, Anirudha N. Joshi
Writing with next-phrase suggestions powered by large language models is becoming more pervasive by the day. However, research to understand writers’ interaction and decision-making processes while engaging with such systems is still emerging. We conducted a qualitative study to shed light on writers’ cognitive processes while writing with next-phrase suggestion systems. To do so, we recruited 14 amateur writers to write two movie reviews each, one without suggestions and one with suggestions. Additionally, we positively and negatively biased the suggestion system to obtain a diverse range of instances where writers’ opinions and the bias in the language model align or misalign to varying degrees. We found that writers interact with next-phrase suggestions in various complex ways: writers abstracted and extracted multiple parts of the suggestions and incorporated them within their writing, even when they disagreed with the suggestion as a whole, and they evaluated the suggestions on various criteria. The suggestion system also had various effects on the writing process, such as altering the writer’s usual writing plans and leading to higher levels of distraction. Based on our qualitative analysis using the cognitive process model of writing by Hayes [35] as a lens, we propose a theoretical model of ‘writer-suggestion interaction’ for writing with GPT-2 (and causal language models in general) for a movie review writing task, followed by directions for future research and design.
Interacting with Next-Phrase Suggestions: How Suggestion Systems Aid and Influence the Cognitive Processes of Writing. In Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641.3584060
Yi He, Xi Yang, Chia-Ming Chang, Haoran Xie, T. Igarashi
Attention guidance is used to address dataset bias in deep learning, where the model relies on incorrect features to make decisions. Focusing on image classification tasks, we propose an efficient human-in-the-loop system to interactively direct the attention of classifiers to regions specified by users, thereby reducing the effect of co-occurrence bias and improving the transferability and interpretability of a deep neural network (DNN). Previous approaches for attention guidance require the preparation of pixel-level annotations and are not designed as interactive systems. We herein present a new interactive method that allows users to annotate images via simple clicks. Additionally, we identify a novel active learning strategy that can significantly reduce the number of annotations. We conduct both numerical evaluations and a user study to evaluate the proposed system using multiple datasets. Compared with the existing non-active-learning approach, which typically relies on considerable amounts of polygon-based segmentation masks to fine-tune or train the DNNs, our system can obtain fine-tuned networks on biased datasets in a more time- and cost-efficient manner and offers a more user-friendly experience. Our experimental results show that the proposed system is efficient, reasonable, and reliable. Our code is publicly available at https://github.com/ultratykis/Guiding-DNNs-Attention.
Efficient Human-in-the-loop System for Guiding DNNs Attention. In Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641.3584074
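To make the idea of click-based attention guidance in the abstract above more concrete, here is a minimal sketch. It is not the authors' implementation (their code is at the repository linked in the abstract). The function names `clicks_to_mask` and `attention_penalty`, the Gaussian click radius `sigma`, and the penalty formula are all assumptions chosen for illustration. The sketch turns user clicks into a soft region mask and measures how much of a classifier's attention map falls outside that region; a training loop could add such a term to its loss.

```python
import numpy as np


def clicks_to_mask(shape, clicks, sigma=1.5):
    """Turn (row, col) clicks into a soft mask in [0, 1].

    Hypothetical helper: each click becomes a Gaussian bump, and the mask
    is the pixel-wise maximum over all bumps.
    """
    rows, cols = np.indices(shape)
    mask = np.zeros(shape, dtype=float)
    for r, c in clicks:
        dist_sq = (rows - r) ** 2 + (cols - c) ** 2
        mask = np.maximum(mask, np.exp(-dist_sq / (2.0 * sigma ** 2)))
    return mask


def attention_penalty(attention, mask):
    """Fraction of (non-negative) attention mass outside the marked region.

    Returns 0.0 when all attention sits exactly on a click and approaches
    1.0 when attention lies far from every click. Assumed formula, for
    illustration only.
    """
    attention = np.clip(attention, 0.0, None)
    total = attention.sum()
    if total == 0:
        return 0.0
    return float((attention * (1.0 - mask)).sum() / total)


# Usage: one click at (2, 2) on an 8x8 attention map.
mask = clicks_to_mask((8, 8), [(2, 2)], sigma=1.0)

on_target = np.zeros((8, 8))
on_target[2, 2] = 1.0   # attention exactly where the user clicked

off_target = np.zeros((8, 8))
off_target[7, 7] = 1.0  # attention on a far-away (e.g. co-occurring) region

penalty_on = attention_penalty(on_target, mask)
penalty_off = attention_penalty(off_target, mask)
```

In this toy setup the penalty is low when attention coincides with the clicked location and high when it sits on a distant region, which is the behavior an attention-guidance term is meant to reward and discourage, respectively.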
Raymond Fok, Hita Kambhamettu, Luca Soldaini, Jonathan Bragg, Kyle Lo, Andrew Head, Marti A. Hearst, Daniel S. Weld
Scholars need to keep up with an exponentially increasing flood of scientific papers. To help address this challenge, we introduce Scim, a novel intelligent interface that helps experienced researchers skim, or rapidly review, a paper to attain a cursory understanding of its contents. Scim supports the skimming process by highlighting salient paper contents in order to direct a reader’s attention. The system’s highlights are faceted by content type, evenly distributed across a paper, and have a density configurable by readers at both the global and local level. We evaluate Scim with both an in-lab usability study and a longitudinal diary study, revealing how its highlights facilitate more efficient construction of a conceptualization of a paper. We conclude by discussing design considerations and tensions for the design of future intelligent skimming tools.
Scim: Intelligent Skimming Support for Scientific Papers. In Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641.3584034
Intelligent decision support (IDS) systems leverage artificial intelligence techniques to generate recommendations that guide human users through the decision making phases of a task. However, a key challenge is that IDS systems are not perfect, and in complex real-world scenarios may produce suboptimal output or fail to work altogether. The field of explainable AI (XAI) has sought to develop techniques that improve the interpretability of black-box systems. While most XAI work has focused on single-classification tasks, the subfield of explainable AI planning (XAIP) has sought to develop techniques that make sequential decision making AI systems explainable to domain experts. Critically, prior work in applying XAIP techniques to IDS systems has assumed that the plan being proposed by the planner is always optimal, and therefore the action or plan being recommended as decision support to the user is always optimal. In this work, we examine novice user interactions with a non-robust IDS system – one that occasionally recommends suboptimal actions, and one that may become unavailable after users have become accustomed to its guidance. We introduce a new explanation type, subgoal-based explanations, for plan-based IDS systems, that supplements traditional IDS output with information about the subgoal toward which the recommended action would contribute. We demonstrate that subgoal-based explanations lead to improved user task performance in the presence of IDS recommendations, improve user ability to distinguish optimal and suboptimal IDS recommendations, and are preferred by users. Additionally, we demonstrate that subgoal-based explanations enable more robust user performance in the case of IDS failure, showing the significant benefit of training users for an underlying task with subgoal-based explanations.
Subgoal-Based Explanations for Unreliable Intelligent Decision Support Systems. Devleena Das, Been Kim, S. Chernova. In Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641.3584055
Hannah K. Bako, Alisha Varma, Anuoluwapo Faboro, Mahreen Haider, Favour Nerrise, B. Kenah, John P. Dickerson, L. Battle
Templates have emerged as an effective approach to simplifying the visualization design and programming process. For example, they enable users to quickly generate multiple visualization designs even when using complex toolkits like D3. However, these templates are often treated as rigid artifacts that respond poorly to changes made outside of the template’s established parameters, limiting user creativity. Preserving the user’s creative flow requires a more dynamic approach to template-based visualization design, where tools can respond gracefully to users’ edits when they modify templates in unexpected ways. In this paper, we leverage the structural similarities revealed by templates to design resilient support features for prototyping D3 visualizations: recommendations that suggest complementary interactions for a user’s D3 program, and code augmentation that implements recommended interactions with a single click, even when users deviate from pre-defined templates. We demonstrate the utility of these features in Mirny, a design-focused prototyping environment for D3. In a user study with 20 D3 users, we find that these automated features enable participants to prototype their design ideas with significantly fewer programming iterations. We also characterize key modification strategies used by participants to customize D3 templates. Informed by our findings and participants’ feedback, we discuss the key implications of the use of templates for interleaving visualization programming and design.
User-Driven Support for Visualization Prototyping in D3. In Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641.3584041
Proceedings of the 28th International Conference on Intelligent User Interfaces. https://doi.org/10.1145/3581641