ASAP: Endowing Adaptation Capability to Agent in Human-Agent Interaction
Jieyeon Woo, C. Pelachaud, C. Achard
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584081
Socially Interactive Agents (SIAs) offer users interactive face-to-face conversations. They can take the role of a speaker and communicate their intentions and emotional states verbally and nonverbally, but they should also act as active listeners and interactive partners. In human-human interaction, interlocutors adapt their behaviors reciprocally and dynamically. Endowing SIAs with such an adaptation capability can allow them to show social and engaging behaviors. In this paper, we focus on modeling reciprocal adaptation to generate SIA behaviors for both conversational roles, speaker and listener. We propose the Augmented Self-Attention Pruning (ASAP) neural network model. ASAP combines a recurrent neural network, the attention mechanism of transformers, and a pruning technique to learn reciprocal adaptation from multimodal social signals. We evaluate our work objectively, via several metrics, and subjectively, through a user perception study in which the SIA behaviors generated by ASAP are compared with those of other state-of-the-art models. Our results demonstrate that ASAP significantly outperforms the state-of-the-art models and thus shows the importance of reciprocal adaptation modeling.
Physiologically Attentive User Interface for Improved Robot Teleoperation
António Tavares, J. L. Silva, R. Ventura
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584084
User interfaces (UIs) are shifting from being attention-hungry to being attentive to users' needs during interaction. Interfaces developed for robot teleoperation can be particularly complex, often displaying large amounts of information, which can increase the cognitive load that impairs operator performance. This paper presents the development of a Physiologically Attentive User Interface (PAUI) prototype, preliminarily evaluated with six participants. A case study on Urban Search and Rescue (USAR) operations involving robot teleoperation was used, although the proposed approach aims to be generic. The robot considered provides an overly complex Graphical User Interface (GUI) whose source code is not accessible. This represents a recurring and challenging scenario in which robots remain in use but technical updates are no longer offered, which usually leads to their abandonment. A major contribution of the approach is the possibility of recycling old systems while improving the UI made available to end users, taking their physiological data as input. The proposed PAUI analyses physiological data, facial expressions, and eye movements to classify three mental states (rest, workload, and stress). An Attentive User Interface (AUI) is then assembled by recycling a pre-existing GUI, which is dynamically modified according to the predicted mental state to improve the user's focus during mentally demanding situations. In addition to the novelty of PAUIs that take advantage of pre-existing GUIs, this work also contributes the design of a user experiment comprising mental state induction tasks that successfully trigger high and low cognitive load states. Results from the preliminary user evaluation revealed a tendency toward improvement in the usefulness and ease of use of the PAUI, although without statistical significance due to the small number of subjects.
SoundToons: Exemplar-Based Authoring of Interactive Audio-Driven Animation Sprites
T. Chong, Hijung Valentina Shin, Deepali Aneja, T. Igarashi
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584047
Animations can come to life when they are synchronized with relevant sounds. Yet synchronizing animations to audio requires tedious key-framing or programming, which is difficult for novice creators. Existing tools support audio-driven live animation, but they focus primarily on speech and have little or no support for non-speech sounds. We present SoundToons, an exemplar-based authoring tool for interactive, audio-driven animation focused on non-speech sounds. Our tool enables novice creators to author live animations for a wide variety of non-speech sounds, such as clapping and instrumental music. We support two types of audio interactions: (1) discrete interaction, which triggers animations when a discrete sound event is detected, and (2) continuous interaction, which synchronizes an animation to continuous audio parameters. By employing an exemplar-based iterative authoring approach, we empower novice creators to design and quickly refine interactive animations. User evaluations demonstrate that novice users can author and perform live audio-driven animation intuitively. Moreover, compared to other input modalities such as trackpads or foot pedals, users preferred audio as an intuitive way to drive animation.
Supporting Requesters in Writing Clear Crowdsourcing Task Descriptions Through Computational Flaw Assessment
Zahra Nouri, N. Prakash, U. Gadiraju, Henning Wachsmuth
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584039
Quality control is an, if not the, essential challenge in crowdsourcing. Unsatisfactory responses from crowd workers have been found to result particularly from ambiguous and incomplete task descriptions, often written by inexperienced task requesters. However, creating clear task descriptions with sufficient information is a complex process for requesters in crowdsourcing marketplaces. In this paper, we investigate the extent to which requesters can be supported effectively in this process through computational techniques. To this end, we developed a tool that enables requesters to iteratively identify and correct eight common clarity flaws in their task descriptions before deployment on the platform. The tool can be used to write task descriptions from scratch or to assess and improve the clarity of prepared descriptions. It employs machine learning-based natural language processing models, trained on real-world task descriptions, that score a given task description for the eight clarity flaws. On this basis, the requester can iteratively revise and reassess the task description until it reaches a sufficient level of clarity. In a first user study, we let requesters create task descriptions using the tool and then rate different aspects of the tool's helpfulness. We then carried out a second user study with crowd workers, as those who are confronted with such descriptions in practice, to rate the clarity of the created task descriptions. According to our results, 65% of the requesters rated the helpfulness of the information provided by the tool as high or very high (only 12% as low or very low). The requesters saw some room for improvement, though, for example concerning the display of bad examples. Nevertheless, 76% of the crowd workers believe that the overall clarity of the task descriptions created by the requesters using the tool improves over the initial version. In line with this, the automatically computed clarity scores of the edited task descriptions were generally higher than those of the initial descriptions, indicating that the tool reliably predicts the overall clarity of task descriptions.
Evaluating Descriptive Quality of AI-Generated Audio Using Image-Schemas
Purnima Kamath, Zhuoyao Li, Chitralekha Gupta, Kokil Jaidka, Suranga Nanayakkara, L. Wyse
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584083
Novel AI-generated audio samples are evaluated for descriptive qualities, such as the smoothness of a morph, using crowdsourced human listening tests. However, the methods for designing interfaces for such experiments and for effectively articulating the descriptive audio quality under test receive very little attention in the evaluation metrics literature. In this paper, we explore the use of visual metaphors based on image-schemas to design interfaces for evaluating AI-generated audio. Furthermore, we highlight the importance of framing and contextualizing the descriptive audio quality under measurement using such constructs. Using both pitched sounds and textures, we conduct two sets of experiments to investigate how the quality of responses varies with audio and task complexity. Our results show that, in both cases, using image-schemas improves the quality and consensus of AI-generated audio evaluations. Our findings reinforce the importance of interface design for listening tests and of stationary visual constructs for communicating temporal qualities of AI-generated audio samples, especially to naïve listeners on crowdsourcing platforms.
Taming Entangled Accessibility Forum Threads for Efficient Screen Reading
Anand Ravi Aiyer, I. Ramakrishnan, V. Ashok
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584073
Accessibility forums enable individuals with visual impairments to connect and collaboratively seek solutions to technical issues, as well as share reviews, best practices, and the latest news. However, these forums are presently built on legacy systems that were primarily designed for sighted users and are difficult to navigate with non-visual assistive technologies like screen readers. Accessibility forum threads are “entangled”, with multiple sub-conversations interleaved with each other. This does not gel with the predominantly linear navigation of screen readers. Screen-reader users often listen to reams of irrelevant posts while foraging for nuggets of interest. To address this and improve non-visual interaction efficiency, we present TASER, a browser extension that leverages a state-of-the-art conversation disentanglement algorithm to automatically identify and separate sub-conversations in a forum thread, and then presents these sub-conversations to the user via a custom interface specifically tailored for efficient and usable screen-reader interaction. In a user study with 11 screen-reader users, we observed that TASER significantly reduced the average number of user input actions and the interaction times, along with a significant drop in cognitive load (lower NASA-TLX scores), compared to the status quo while performing representative information foraging tasks on accessibility forums.
An Investigation into an Always Listening Interface to Support Data Exploration
Roderick S. Tabalba, Nurit Kirshenbaum, J. Leigh, Abari Bhattacharya, Veronica Grosso, Barbara Di Eugenio, Andrew E. Johnson, Moira Zellner
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584079
Natural Language Interfaces that facilitate data exploration tasks are rapidly gaining interest in the research community because they enable users to focus their attention on the task of inquiry rather than the mechanics of chart construction. Yet current systems rely solely on processing the user's explicit commands to generate the intended chart. These commands can be ambiguous due to natural language tendencies such as speech disfluency and underspecification. In this paper, we developed an always listening interface and studied how it can help contextualize imprecise queries. Our study revealed that an always listening interface is able to use the ongoing conversation to fill in missing properties for imprecise commands, disambiguate inaccurate commands without asking the user for clarification, and even generate charts without being explicitly asked.
AlphaDAPR: An AI-based Explainable Expert Support System for Art Therapy
Jiwon Kim, Jiwon Kang, Taeeun Kim, Hayeon Song, Jinyoung Han
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584087
Sketch-based drawing assessments in art therapy are widely used to understand individuals' cognitive and psychological states, such as cognitive impairment or mental disorders. Alongside questionnaire-based self-report measures, psychological drawing assessments can augment information about an individual's psychological state. However, interpreting drawing assessments requires much time and effort, especially in large-scale groups such as schools or companies, and depends on the experience of the art therapist. To address this issue, we propose an AI-based expert support system, AlphaDAPR, to support art therapists and psychologists in conducting large-scale automatic drawing assessments. Our survey of 64 art therapists showed that 64.06% of the participants indicated a willingness to use the proposed system. The results of structural equation modeling highlighted the importance of explainable AI embedded in the interface design, which affects perceived usefulness, trust, satisfaction, and, eventually, intention to use. The interview results revealed that most of the art therapists show a high level of intention to use the proposed system, while also expressing some concerns about AI's possible limitations and threats. Discussion and implications are provided, stressing the importance of clear communication about the collaborative roles of AI and users.
D-Touch: Recognizing and Predicting Fine-grained Hand-face Touching Activities Using a Neck-mounted Wearable
Hyunchul Lim, Ruidong Zhang, Samhita Pendyal, J. Jo, Cheng Zhang
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584063
This paper presents D-Touch, a neck-mounted wearable sensing system that can recognize and predict how a hand touches the face. It uses a neck-mounted infrared (IR) camera, which takes pictures of the head from the neck. These IR camera images are processed and used to train a deep-learning model to recognize and predict touch time and positions. The study showed that D-Touch distinguished 17 facial-related activities (FrA), including 11 face-touch positions and 6 other activities, with over 92.1% accuracy, and predicted hand touches to the T-zone, as opposed to other FrA, with an accuracy of 82.12% within 150 ms after the hand appeared in the camera view. A study with 10 participants, conducted in their homes without any constraints on the participants, showed that D-Touch can predict hand touches to the T-zone, as opposed to other FrA, with an accuracy of 72.3% within 150 ms after the camera saw the hand. Based on the study results, we further discuss the opportunities and challenges of deploying D-Touch in real-world scenarios.
Resilience Through Appropriation: Pilots’ View on Complex Decision Support
Z. Zhang, Cara Storath, Yuanting Liu, A. Butz
Proceedings of the 28th International Conference on Intelligent User Interfaces, 2023. DOI: 10.1145/3581641.3584056
Intelligent decision support tools (DSTs) hold the promise of improving the quality of human decision-making in challenging situations such as diversions in aviation. To achieve these improvements, a common goal in DST design is to calibrate decision makers’ trust in the system. However, this perspective is mostly informed by controlled studies and might not fully reflect the real-world complexity of diversions. To understand how DSTs can be beneficial in the view of those who best understand the complexity of diversions, we interviewed professional pilots. To facilitate discussions, we built two low-fidelity prototypes, each representing a different role a DST could assume: (a) actively suggesting and ranking airports based on pilot-specified criteria, and (b) unobtrusively hinting at data points the pilot should be aware of. We find that while pilots would not blindly trust a DST, they at the same time reject deliberate trust calibration at the moment of the decision. We revisit appropriation as a lens to understand this seeming contradiction, as well as a range of means to enable appropriation. Aside from the commonly considered need for transparency, these include directability and continuous support throughout the entire decision process. Based on our design exploration, we encourage expanding the view on DST design beyond trust calibration at the point of the actual decision.