Pearl: A Technology Probe for Machine-Assisted Reflection on Personal Data
Matthew Jörke, Yasaman S. Sefidgar, Talie Massachi, Jina Suh, Gonzalo A. Ramos
Reflection on one’s personal data can be an effective tool for supporting wellbeing. However, current wellbeing reflection support tools tend to offer a one-size-fits-all approach, ignoring the diversity of people’s wellbeing goals and their agency in the self-reflection process. In this work, we identify an opportunity to help people work toward their wellbeing goals by empowering them to reflect on their data on their own terms. Through a formative study, we inform the design and implementation of Pearl, a workplace wellbeing reflection support tool that allows users to explore their personal data in relation to their wellbeing goal. Pearl is a calendar-based interactive machine teaching system that allows users to visualize data sources and tag regions of interest on their calendar. In return, the system provides insights about these tags that can be saved to a reflection journal. We used Pearl as a technology probe with 12 participants without data science expertise and found that all participants successfully gained insights into their workplace wellbeing. In our analysis, we discuss how Pearl’s capabilities facilitate insights, the role of machine assistance in the self-reflection process, and the data sources that participants found most insightful. We conclude with design dimensions for intelligent reflection support systems as inspiration for future work.
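As an illustration of the tag-and-insight loop described above, the following minimal Python sketch computes a simple per-tag insight over an hourly wellbeing signal. It is not Pearl's implementation; the `Tag` structure, the hourly stress values, and the inside-vs-outside comparison are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from statistics import mean

# Illustrative only: Pearl's real pipeline is not described in this abstract.
# A "tag" marks a region of the calendar the user finds meaningful.
@dataclass
class Tag:
    label: str        # e.g. "afternoon meetings"
    start_hour: int   # hour of day the tagged region starts
    end_hour: int     # hour of day the tagged region ends

def insight_for_tag(tag, hourly_signal):
    """Compare a sensed signal (e.g. a stress score per hour) inside vs. outside a tag."""
    inside = [v for h, v in hourly_signal.items() if tag.start_hour <= h < tag.end_hour]
    outside = [v for h, v in hourly_signal.items() if not (tag.start_hour <= h < tag.end_hour)]
    return {"tag": tag.label, "mean_inside": mean(inside), "mean_outside": mean(outside)}

# Toy usage: stress readings for a workday, one value per hour from 9:00 to 17:00.
stress = {h: s for h, s in zip(range(9, 18), [2, 3, 5, 6, 4, 2, 2, 5, 6])}
print(insight_for_tag(Tag("afternoon meetings", 15, 18), stress))
```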
{"title":"Pearl: A Technology Probe for Machine-Assisted Reflection on Personal Data","authors":"Matthew Jörke, Yasaman S. Sefidgar, Talie Massachi, Jina Suh, Gonzalo A. Ramos","doi":"10.1145/3581641.3584054","DOIUrl":"https://doi.org/10.1145/3581641.3584054","url":null,"abstract":"Reflection on one’s personal data can be an effective tool for supporting wellbeing. However, current wellbeing reflection support tools tend to offer a one-size-fits-all approach, ignoring the diversity of people’s wellbeing goals and their agency in the self-reflection process. In this work, we identify an opportunity to help people work toward their wellbeing goals by empowering them to reflect on their data on their own terms. Through a formative study, we inform the design and implementation of Pearl, a workplace wellbeing reflection support tool that allows users to explore their personal data in relation to their wellbeing goal. Pearl is a calendar-based interactive machine teaching system that allows users to visualize data sources and tag regions of interest on their calendar. In return, the system provides insights about these tags that can be saved to a reflection journal. We used Pearl as a technology probe with 12 participants without data science expertise and found that all participants successfully gained insights into their workplace wellbeing. In our analysis, we discuss how Pearl’s capabilities facilitate insights, the role of machine assistance in the self-reflection process, and the data sources that participants found most insightful. We conclude with design dimensions for intelligent reflection support systems as inspiration for future work.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"106 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121100724","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
It Seems Smart, but It Acts Stupid: Development of Trust in AI Advice in a Repeated Legal Decision-Making Task
Patricia K. Kahr, G. Rooks, M. Willemsen, Chris C. P. Snijders
Humans increasingly interact with AI systems, and successful interactions rely on individuals trusting such systems (when appropriate). Considering that trust is fragile and often cannot be restored quickly, we focus on how trust develops over time in a human-AI interaction scenario. In a 2x2 between-subjects experiment, we test how model accuracy (high vs. low) and type of explanation (human-like vs. not) affect trust in AI over time. We study a complex decision-making task in which individuals estimate jail time for 20 criminal law cases with AI advice. Results show that trust is significantly higher for high-accuracy models. Moreover, behavioral trust does not decline, and subjective trust even increases significantly with high accuracy. Human-like explanations did not affect trust overall, but they boosted trust in high-accuracy models.
{"title":"It Seems Smart, but It Acts Stupid: Development of Trust in AI Advice in a Repeated Legal Decision-Making Task","authors":"Patricia K. Kahr, G. Rooks, M. Willemsen, Chris C. P. Snijders","doi":"10.1145/3581641.3584058","DOIUrl":"https://doi.org/10.1145/3581641.3584058","url":null,"abstract":"Humans increasingly interact with AI systems, and successful interactions rely on individuals trusting such systems (when appropriate). Considering that trust is fragile and often cannot be restored quickly, we focus on how trust develops over time in a human-AI-interaction scenario. In a 2x2 between-subject experiment, we test how model accuracy (high vs. low) and type of explanation (human-like vs. not) affect trust in AI over time. We study a complex decision-making task in which individuals estimate jail time for 20 criminal law cases with AI advice. Results show that trust is significantly higher for high-accuracy models. Also, behavioral trust does not decline, and subjective trust even increases significantly with high accuracy. Human-like explanations did not generally affect trust but boosted trust in high-accuracy models.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"32 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115988761","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Lessons Learned from Designing and Evaluating CLAICA: A Continuously Learning AI Cognitive Assistant
Samuel Kernan Freire, E. Niforatos, Chaofan Wang, Santiago Ruiz-Arenas, Mina Foosherian, S. Wellsandt, A. Bozzon
Learning to operate a complex system, such as an agile production line, can be a daunting task. The high variability in products and frequent reconfigurations make it difficult to keep documentation up-to-date and share new knowledge amongst factory workers. We introduce CLAICA, a Continuously Learning AI Cognitive Assistant that supports workers in the aforementioned scenario. CLAICA learns from (experienced) workers, formalizes new knowledge, stores it in a knowledge base, along with contextual information, and shares it when relevant. We conducted a user study with 83 participants who performed eight knowledge exchange tasks with CLAICA, completed a survey, and provided qualitative feedback. Our results provide a deeper understanding of how prior training, context expertise, and interaction modality affect the user experience of cognitive assistants. We draw on our results to elicit design and evaluation guidelines for cognitive assistants that support knowledge exchange in fast-paced and demanding environments, such as an agile production line.
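To make the learn-store-share loop concrete, here is a minimal, hypothetical sketch of a knowledge entry stored with contextual information and a context-based recall heuristic. The field names and the overlap-based `recall` rule are assumptions for illustration, not CLAICA's actual schema or retrieval logic.

```python
from dataclasses import dataclass, field

@dataclass
class KnowledgeEntry:
    text: str                      # the formalized knowledge, e.g. a repair tip
    machine: str                   # contextual information: which station or machine
    tags: set = field(default_factory=set)

class KnowledgeBase:
    def __init__(self):
        self.entries = []

    def learn(self, text, machine, tags=()):
        """Store knowledge contributed by an (experienced) worker, with its context."""
        self.entries.append(KnowledgeEntry(text, machine, set(tags)))

    def recall(self, machine, tags=()):
        """Share entries whose context overlaps the worker's current context."""
        tags = set(tags)
        return [e for e in self.entries if e.machine == machine or e.tags & tags]

kb = KnowledgeBase()
kb.learn("Torque the clamp to 12 Nm after a product changeover.",
         machine="press-3", tags={"changeover"})
print(kb.recall(machine="press-3"))
```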
{"title":"Lessons Learned from Designing and Evaluating CLAICA: A Continuously Learning AI Cognitive Assistant","authors":"Samuel Kernan Freire, E. Niforatos, Chaofan Wang, Santiago Ruiz-Arenas, Mina Foosherian, S. Wellsandt, A. Bozzon","doi":"10.1145/3581641.3584042","DOIUrl":"https://doi.org/10.1145/3581641.3584042","url":null,"abstract":"Learning to operate a complex system, such as an agile production line, can be a daunting task. The high variability in products and frequent reconfigurations make it difficult to keep documentation up-to-date and share new knowledge amongst factory workers. We introduce CLAICA, a Continuously Learning AI Cognitive Assistant that supports workers in the aforementioned scenario. CLAICA learns from (experienced) workers, formalizes new knowledge, stores it in a knowledge base, along with contextual information, and shares it when relevant. We conducted a user study with 83 participants who performed eight knowledge exchange tasks with CLAICA, completed a survey, and provided qualitative feedback. Our results provide a deeper understanding of how prior training, context expertise, and interaction modality affect the user experience of cognitive assistants. We draw on our results to elicit design and evaluation guidelines for cognitive assistants that support knowledge exchange in fast-paced and demanding environments, such as an agile production line.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116858357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pragmatic Communication with Embodied Agents
J. Chai
With the emergence of a new generation of embodied AI agents (e.g., cognitive robots), it has become increasingly important to empower these agents with the ability to learn and collaborate with humans through language communication. Despite recent advances, language communication in embodied AI still faces many challenges. Human language not only needs to ground to agents’ perception and action but also needs to facilitate collaboration between humans and agents. To address these challenges, I will introduce several efforts in my lab that study pragmatic communication with embodied agents. I will talk about how language use is shaped by shared experience and knowledge (i.e., common ground) and how collaborative effort is important to mediate perceptual differences and handle exceptions. I will discuss task learning by following language instructions and highlight the need for neuro-symbolic representations for situation awareness and transparency. I will further present explicit modeling of partners’ goals, beliefs, and abilities (i.e., theory of mind) and discuss its role in language communication for situated collaborative tasks.
{"title":"Pragmatic Communication with Embodied Agents","authors":"J. Chai","doi":"10.1145/3581641.3584101","DOIUrl":"https://doi.org/10.1145/3581641.3584101","url":null,"abstract":"With the emergence of a new generation of embodied AI agents (e.g., cognitive robots), it has become increasingly important to empower these agents with the ability to learn and collaborate with humans through language communication. Despite recent advances, language communication in embodied AI still faces many challenges. Human language not only needs to ground to agents’ perception and action but also needs to facilitate collaboration between humans and agents. To address these challenges, I will introduce several efforts in my lab that study pragmatic communication with embodied agents. I will talk about how language use is shaped by shared experience and knowledge (i.e., common ground) and how collaborative effort is important to mediate perceptual differences and handle exceptions. I will discuss task learning by following language instructions and highlight the need for neuro-symbolic representations for situation awareness and transparency. I will further present explicit modeling of partners’ goals, beliefs, and abilities (i.e., theory of mind) and discuss its role in language communication for situated collaborative tasks.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"13 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130806824","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Don’t fail me! The Level 5 Autonomous Driving Information Dilemma regarding Transparency and User Experience
Tobias Schneider, J. Hois, Alischa Rosenstein, Sandra Metzl, Ansgar R. S. Gerlicher, Sabiha Ghellal, Steve Love
Autonomous vehicles can behave unexpectedly: automated systems that rely on data-driven machine learning have been shown to produce false predictions or misclassifications, e.g., due to stickers on traffic signs, and can therefore fail in some situations. In critical situations, system designs must guarantee safety and reliability. In non-critical situations, however, the possibility of failures resulting in unexpected behaviour should be considered, as such failures negatively impact the passenger’s user experience and acceptance. We analyse whether an interactive conversational user interface can mitigate negative experiences when interacting with imperfect artificial intelligence systems. In our quantitative interactive online survey (N=113) and comparative qualitative Wizard of Oz study (N=8), users were able to interact with an autonomous SAE level 5 driving simulation. Our findings demonstrate that increased transparency improves user experience and acceptance. Furthermore, we show that additional information in failure scenarios can lead to an information dilemma and should be implemented carefully.
{"title":"Don’t fail me! The Level 5 Autonomous Driving Information Dilemma regarding Transparency and User Experience","authors":"Tobias Schneider, J. Hois, Alischa Rosenstein, Sandra Metzl, Ansgar R. S. Gerlicher, Sabiha Ghellal, Steve Love","doi":"10.1145/3581641.3584085","DOIUrl":"https://doi.org/10.1145/3581641.3584085","url":null,"abstract":"Autonomous vehicles can behave unexpectedly, as automated systems that rely on data-driven machine learning have shown to infer false predictions or misclassifications, e.g., due to stickers on traffic signs, and thus fail in some situations. In critical situations, system designs must guarantee safety and reliability. However, in non-critical situations, the possibility of failures resulting in unexpected behaviour should be considered, as they negatively impact the passenger’s user experience and acceptance. We analyse if an interactive conversational user interface can mitigate negative experiences when interacting with imperfect artificial intelligence systems. In our quantitative interactive online survey (N=113) and comparative qualitative Wizard of Oz study (N=8), users were able to interact with an autonomous SAE level 5 driving simulation. Our findings demonstrate that increased transparency improves user experience and acceptance. Furthermore, we show that additional information in failure scenarios can lead to an information dilemma and should be implemented carefully.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115591543","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TimToShape: Supporting Practice of Musical Instruments by Visualizing Timbre with 2D Shapes based on Crossmodal Correspondences
Kota Arai, Yutaro Hirao, Takuji Narumi, Tomohiko Nakamura, Shinnosuke Takamichi, Shigeo Yoshida
Timbre is high-dimensional and sensuous, which makes it difficult for musical-instrument learners to improve their timbre. Although some systems exist to support timbre improvement, they require expert labeling for timbre evaluation; conversely, simply visualizing the results of unsupervised learning yields feedback that is not intuitive, because human perception is not taken into account. We therefore employ crossmodal correspondences for intuitive visualization of timbre. We designed TimToShape, a system that visualizes timbre with 2D shapes based on the user’s input of timbre–shape correspondences. TimToShape generates a shape morphed by linear interpolation according to the timbre’s position in a latent space obtained by unsupervised learning with a variational autoencoder (VAE). We confirmed that people perceived shapes generated by TimToShape as corresponding more closely to timbre than randomly generated shapes. Furthermore, a user study with six violin players revealed that TimToShape was well received in terms of visual clarity and interpretability.
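The shape-morphing idea can be sketched as follows: given the latent position of the current timbre, blend user-provided anchor shapes by their proximity in latent space. This is only an illustrative sketch; the inverse-distance weighting, the two anchor shapes, and the 2-D latent are assumptions, and the paper's actual interpolation over the user's timbre–shape correspondences may differ.

```python
import numpy as np

# Assumes some trained VAE encoder maps timbre features to a latent vector `z`,
# and the user has paired anchor timbres (latent positions) with anchor shapes.
def morph_shape(latent, anchor_latents, anchor_shapes, eps=1e-8):
    """Linearly blend anchor shapes (N x 2 vertex arrays) by proximity in latent space."""
    dists = np.array([np.linalg.norm(latent - a) for a in anchor_latents])
    weights = 1.0 / (dists + eps)           # closer anchors contribute more
    weights /= weights.sum()
    return sum(w * s for w, s in zip(weights, anchor_shapes))

# Toy usage with a 2-D latent and two 8-vertex anchor shapes (spiky vs. round).
spiky = np.array([[0, 1], [0.2, 0.2], [1, 0], [0.2, -0.2],
                  [0, -1], [-0.2, -0.2], [-1, 0], [-0.2, 0.2]])
round_ = np.array([[0, 1], [0.7, 0.7], [1, 0], [0.7, -0.7],
                   [0, -1], [-0.7, -0.7], [-1, 0], [-0.7, 0.7]])
z = np.array([0.3, 0.8])                    # latent position of the current timbre
anchors = [np.array([0.0, 1.0]), np.array([1.0, 0.0])]
print(morph_shape(z, anchors, [spiky, round_]))
```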
{"title":"TimToShape: Supporting Practice of Musical Instruments by Visualizing Timbre with 2D Shapes based on Crossmodal Correspondences","authors":"Kota Arai, Yutaro Hirao, Takuji Narumi, Tomohiko Nakamura, Shinnosuke Takamichi, Shigeo Yoshida","doi":"10.1145/3581641.3584053","DOIUrl":"https://doi.org/10.1145/3581641.3584053","url":null,"abstract":"Timbre is high-dimensional and sensuous, making it difficult for musical-instrument learners to improve their timbre. Although some systems exist to improve timbre, they require expert labeling for timbre evaluation; however, solely visualizing the results of unsupervised learning lacks the intuitiveness of feedback because human perception is not considered. Therefore, we employ crossmodal correspondences for intuitive visualization of the timbre. We designed TimToShape, a system that visualizes timbre with 2D shapes based on the user’s input of timbre–shape correspondences. TimToShape generates a shape morphed by linear interpolation according to the timbre’s position in the latent space, which is obtained by unsupervised learning with a variational autoencoder (VAE). We confirmed that people perceived shapes generated by TimToShape to correspond more to timbre than randomly generated shapes. Furthermore, a user study of six violin players revealed that TimToShape was well-received in terms of visual clarity and interpretability.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"9 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123995533","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Follow the Successful Herd: Towards Explanations for Improved Use and Mental Models of Natural Language Systems
Michelle Brachman, Qian Pan, H. Do, Casey Dugan, Arunima Chaudhary, James M. Johnson, Priyanshu Rai, T. Chakraborti, T. Gschwind, Jim Laredo, Christoph Miksovic, P. Scotton, Kartik Talamadupula, Gegi Thomas
While natural language systems continue improving, they are still imperfect. If a user has a better understanding of how a system works, they may be able to better accomplish their goals even in imperfect systems. We explored whether explanations can support effective authoring of natural language utterances and how those explanations impact users’ mental models in the context of a natural language system that generates small programs. Through an online study (n=252), we compared two main types of explanations: 1) system-focused, which provide information about how the system processes utterances and matches terms to a knowledge base, and 2) social, which provide information about how other users have successfully interacted with the system. Our results indicate that providing social suggestions of terms to add to an utterance helped users to repair and generate correct flows more than system-focused explanations or social recommendations of words to modify. We also found that participants commonly understood some mechanisms of the natural language system, such as the matching of terms to a knowledge base, but they often lacked other critical knowledge, such as how the system handled structuring and ordering. Based on these findings, we make design recommendations for supporting interactions with and understanding of natural language systems.
{"title":"Follow the Successful Herd: Towards Explanations for Improved Use and Mental Models of Natural Language Systems","authors":"Michelle Brachman, Qian Pan, H. Do, Casey Dugan, Arunima Chaudhary, James M. Johnson, Priyanshu Rai, T. Chakraborti, T. Gschwind, Jim Laredo, Christoph Miksovic, P. Scotton, Kartik Talamadupula, Gegi Thomas","doi":"10.1145/3581641.3584088","DOIUrl":"https://doi.org/10.1145/3581641.3584088","url":null,"abstract":"While natural language systems continue improving, they are still imperfect. If a user has a better understanding of how a system works, they may be able to better accomplish their goals even in imperfect systems. We explored whether explanations can support effective authoring of natural language utterances and how those explanations impact users’ mental models in the context of a natural language system that generates small programs. Through an online study (n=252), we compared two main types of explanations: 1) system-focused, which provide information about how the system processes utterances and matches terms to a knowledge base, and 2) social, which provide information about how other users have successfully interacted with the system. Our results indicate that providing social suggestions of terms to add to an utterance helped users to repair and generate correct flows more than system-focused explanations or social recommendations of words to modify. We also found that participants commonly understood some mechanisms of the natural language system, such as the matching of terms to a knowledge base, but they often lacked other critical knowledge, such as how the system handled structuring and ordering. Based on these findings, we make design recommendations for supporting interactions with and understanding of natural language systems.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124765141","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Categorical and Continuous Features in Counterfactual Explanations of AI Systems
Greta Warren, R. Byrne, Mark T. Keane
Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The proponents of these algorithms claim they meet users’ requirements for counterfactual explanations. For instance, many claim that the output of their algorithms works as an explanation because it prioritises "plausible", "actionable", or "causally important" features in the generated counterfactuals. However, very few of these claims have been tested in controlled psychological studies, and we know very little about which aspects of counterfactual explanations help users to understand AI system decisions. Furthermore, we do not know whether counterfactual explanations are an advance on more traditional causal explanations, which have a much longer history in AI (in explaining expert systems and decision trees). Accordingly, we carried out two user studies to (i) test a fundamental distinction between feature types, namely categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, and user responses were measured objectively (users’ predictive accuracy) and subjectively (users’ satisfaction and trust judgments). Study 1 (N=127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher predictive accuracy than no-explanation control descriptions but no higher accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust judgments than causal explanations. Study 2 (N=211) found that users were more accurate for categorically transformed features than for continuous ones, and it replicated the results of Study 1. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.
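For readers unfamiliar with counterfactual explanation methods, the sketch below generates a counterfactual in the generic minimal-change style: perturb the input until the decision flips, preferring the smallest change. The toy Widmark-style BAC model, the 0.05 limit, and the search over drinks and waiting hours are assumptions for illustration only; this is neither the study's app nor any specific algorithm from the paper.

```python
from itertools import product

def bac(drinks, hours, weight_kg, r=0.68):
    """Toy Widmark-style blood alcohol estimate (illustrative only)."""
    alcohol_g = drinks * 14.0                               # ~14 g ethanol per standard drink
    return max(0.0, alcohol_g / (weight_kg * 1000 * r) * 100 - 0.015 * hours)

def counterfactual(drinks, hours, weight_kg, limit=0.05):
    """Smallest change to (drinks, hours) that brings the prediction under the limit."""
    if bac(drinks, hours, weight_kg) <= limit:
        return None                                         # already classified as safe to drive
    candidates = [(d, h) for d, h in product(range(drinks + 1), range(hours, hours + 13))
                  if bac(d, h, weight_kg) <= limit]
    # prefer the counterfactual closest to the original input
    return min(candidates, key=lambda c: abs(c[0] - drinks) + abs(c[1] - hours))

print(counterfactual(drinks=4, hours=1, weight_kg=75))
# -> (2, 1): "If you had drunk 2 drinks instead of 4, you would be under the limit."
```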
{"title":"Categorical and Continuous Features in Counterfactual Explanations of AI Systems","authors":"Greta Warren, R. Byrne, Markt. Keane","doi":"10.1145/3581641.3584090","DOIUrl":"https://doi.org/10.1145/3581641.3584090","url":null,"abstract":"Recently, eXplainable AI (XAI) research has focused on the use of counterfactual explanations to address interpretability, algorithmic recourse, and bias in AI system decision-making. The proponents of these algorithms claim they meet users’ requirements for counterfactual explanations. For instance, many claim that the output of their algorithms work as explanations because they prioritise \"plausible\", \"actionable\" or \"causally important\" features in their generated counterfactuals. However, very few of these claims have been tested in controlled psychological studies, and we know very little about which aspects of counterfactual explanations help users to understand AI system decisions. Furthermore, we do not know whether counterfactual explanations are an advance on more traditional causal explanations that have a much longer history in AI (in explaining expert systems and decision trees). Accordingly, we carried out two user studies to (i) test a fundamental distinction in feature-types, between categorical and continuous features, and (ii) compare the relative effectiveness of counterfactual and causal explanations. The studies used a simulated, automated decision-making app that determined safe driving limits after drinking alcohol, based on predicted blood alcohol content, and user responses were measured objectively (users’ predictive accuracy) and subjectively (users’ satisfaction and trust judgments). Study 1 (N=127) showed that users understand explanations referring to categorical features more readily than those referring to continuous features. It also discovered a dissociation between objective and subjective measures: counterfactual explanations elicited higher accuracy of predictions than no-explanation control descriptions but no higher accuracy than causal explanations, yet counterfactual explanations elicited greater satisfaction and trust judgments than causal explanations. Study 2 (N=211) found that users were more accurate for categorically-transformed features compared to continuous ones, and also replicated the results of Study 1. The findings delineate important boundary conditions for current and future counterfactual explanation methods in XAI.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"58 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128663372","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
TransASL: A Smart Glass based Comprehensive ASL Recognizer in Daily Life
Yincheng Jin, Seokmin Choi, Yang Gao, Jiyang Li, Zhengxiong Li, Zhanpeng Jin
Sign language is a primary language used by deaf and hard-of-hearing (DHH) communities. However, existing sign language translation solutions primarily focus on recognizing manual markers. Non-manual markers, such as negative head shaking, question markers, and mouthing, are critical grammatical and semantic components of sign language, and recognizing them improves usability and generalizability. Considering the significant role of non-manual markers, we propose TransASL, a real-time, end-to-end system for sign language recognition and translation. TransASL extracts features from both manual and non-manual markers via a customized eyeglasses-style wearable device with two parallel sensing modalities. Manual marker information is collected by two pairs of outward-facing microphones and speakers mounted on the legs of the eyeglasses, whereas non-manual marker information is acquired from a pair of inward-facing microphones and speakers attached to the eyeglasses. Both manual and non-manual marker features pass through a multi-modal, multi-channel fusion network and are eventually recognized as comprehensible ASL content. We evaluate the recognition performance of various sign language expressions at both the word and sentence levels. On 80 frequently used ASL words and 40 meaningful sentences consisting of manual and non-manual markers, TransASL achieves word error rates (WERs) of 8.3% and 7.1%, respectively. Our work reveals great potential for convenient ASL recognition in daily communication between ASL signers and hearing people.
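The reported results use word error rate (WER). For reference, a standard edit-distance-based WER computation looks like the following; this is generic metric code, not the authors' evaluation script.

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words and first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

print(wer("where is the bathroom", "where is bathroom"))  # 0.25: one deleted word
```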
{"title":"TransASL: A Smart Glass based Comprehensive ASL Recognizer in Daily Life","authors":"Yincheng Jin, Seokmin Choi, Yang Gao, Jiyang Li, Zhengxiong Li, Zhanpeng Jin","doi":"10.1145/3581641.3584071","DOIUrl":"https://doi.org/10.1145/3581641.3584071","url":null,"abstract":"Sign language is a primary language used by deaf and hard-of-hearing (DHH) communities. However, existing sign language translation solutions primarily focus on recognizing manual markers. The non-manual markers, such as negative head shaking, question markers, and mouthing, are critical grammatical and semantic components of sign language for better usability and generalizability. Considering the significant role of non-manual markers, we propose the TransASL, a real-time, end-to-end system for sign language recognition and translation. TransASL extracts feature from both manual markers and non-manual markers via a customized eyeglasses-style wearable device with two parallel sensing modalities. Manual marker information is collected by two pairs of outward-facing microphones and speakers mounted to the legs of the eyeglasses. In contrast, non-manual marker information is acquired from a pair of inward-facing microphones and speakers connected to the eyeglasses. Both manual and non-manual marker features undergo a multi-modal, multi-channel fusion network and are eventually recognized as comprehensible ASL content. We evaluate the recognition performance of various sign language expressions at both the word and sentence levels. Given 80 frequently used ASL words and 40 meaningful sentences consisting of manual and non-manual markers, TransASL can achieve the WER of 8.3% and 7.1%, respectively. Our proposed work reveals a great potential for convenient ASL recognition in daily communications between ASL signers and hearing people.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131668538","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
SlideSpecs: Automatic and Interactive Presentation Feedback Collation
Jeremy Warner, Amy Pavel, Tonya Nguyen, Maneesh Agrawala, Bjoern Hartmann
Presenters often collect audience feedback through practice talks to refine their presentations. In formative interviews, we find that although text feedback and verbal discussions allow presenters to receive feedback, organizing that feedback into actionable presentation revisions remains challenging. Feedback may lack context, be redundant, and be spread across various emails, notes, and conversations. To collate and contextualize both text and verbal feedback, we present SlideSpecs. SlideSpecs lets audience members provide text feedback (e.g., ‘font too small’) while attaching an automatically detected context, including relevant slides (e.g., ‘Slide 7’) or content tags (e.g., ‘slide design’). SlideSpecs also records and transcribes spoken group discussions that commonly occur after practice talks and facilitates linking text critiques to relevant discussion segments. Finally, presenters can use SlideSpecs to review all text and spoken feedback in a single contextually rich interface (e.g., relevant slides, topics, and follow-up discussions). We demonstrate the effectiveness of SlideSpecs by deploying it in eight practice talks with a range of topics and purposes and reporting our findings.
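Below is a minimal sketch of the kind of context-tagged feedback record and critique-to-discussion linking described above. The `Feedback` fields and the word-overlap ranking heuristic are assumptions for illustration; the abstract does not specify SlideSpecs' actual matching method.

```python
from dataclasses import dataclass

@dataclass
class Feedback:
    text: str          # e.g. "font too small"
    slide: int         # auto-detected slide the audience member was viewing
    tags: tuple        # auto-detected content tags, e.g. ("slide design",)

def link_to_discussion(feedback, transcript_segments):
    """Rank transcribed discussion segments by word overlap with a text critique."""
    fb_words = set(feedback.text.lower().split())
    scored = [(len(fb_words & set(seg.lower().split())), seg) for seg in transcript_segments]
    return [seg for score, seg in sorted(scored, reverse=True) if score > 0]

fb = Feedback("font too small on the results table", slide=7, tags=("slide design",))
segments = ["I agree the font on slide seven is hard to read",
            "the motivation section felt long"]
print(link_to_discussion(fb, segments))
```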
{"title":"SlideSpecs: Automatic and Interactive Presentation Feedback Collation","authors":"Jeremy Warner, Amy Pavel, Tonya Nguyen, Maneesh Agrawala, Bjoern Hartmann","doi":"10.1145/3581641.3584035","DOIUrl":"https://doi.org/10.1145/3581641.3584035","url":null,"abstract":"Presenters often collect audience feedback through practice talks to refine their presentations. In formative interviews, we find that although text feedback and verbal discussions allow presenters to receive feedback, organizing that feedback into actionable presentation revisions remains challenging. Feedback may lack context, be redundant, and be spread across various emails, notes, and conversations. To collate and contextualize both text and verbal feedback, we present SlideSpecs. SlideSpecs lets audience members provide text feedback (e.g., ‘font too small’) while attaching an automatically detected context, including relevant slides (e.g., ‘Slide 7’) or content tags (e.g., ‘slide design’). SlideSpecs also records and transcribes spoken group discussions that commonly occur after practice talks and facilitates linking text critiques to relevant discussion segments. Finally, presenters can use SlideSpecs to review all text and spoken feedback in a single contextually rich interface (e.g., relevant slides, topics, and follow-up discussions). We demonstrate the effectiveness of SlideSpecs by deploying it in eight practice talks with a range of topics and purposes and reporting our findings.","PeriodicalId":118159,"journal":{"name":"Proceedings of the 28th International Conference on Intelligent User Interfaces","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133959713","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}