
Proceedings of the 2020 International Conference on Multimodal Interaction: Latest Publications

Job Interviewer Android with Elaborate Follow-up Question Generation
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418839
K. Inoue, Kohei Hara, Divesh Lala, Kenta Yamamoto, Shizuka Nakamura, K. Takanashi, Tatsuya Kawahara
A job interview is a domain that takes advantage of an android robot's human-like appearance and behaviors. In this work, our goal is to implement a system in which an android plays the role of an interviewer so that users may practice for a real job interview. Our proposed system generates elaborate follow-up questions based on responses from the interviewee. We conducted an interactive experiment to compare the proposed system against a baseline system that asked only fixed-form questions. We found that this system was significantly better than the baseline system with respect to the impression of the interview and the quality of the questions, and that the presence of the android interviewer was enhanced by the follow-up questions. We also found a similar result when using a virtual agent interviewer, except that presence was not enhanced.
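The paper does not include an implementation, but the core idea of choosing between a fixed-form question and a follow-up generated from the interviewee's previous answer can be illustrated with a minimal sketch. All template strings and function names below are hypothetical assumptions based on a simple keyword-plus-template approach, not the authors' actual generation method.

```python
# Minimal sketch of follow-up question generation, assuming a naive
# keyword-extraction + template approach (all names are hypothetical;
# the paper's actual system is more elaborate).

FOLLOW_UP_TEMPLATES = [
    "Could you tell me more about {topic}?",
    "What was the most difficult part of {topic}?",
    "How did {topic} change the way you work?",
]

FIXED_QUESTIONS = [
    "Please introduce yourself.",
    "Why did you apply for this position?",
]

def extract_topic(response: str) -> str:
    """Very naive topic extraction: pick the longest word as a stand-in
    for a proper keyphrase extractor."""
    words = [w.strip(".,!?") for w in response.split()]
    return max(words, key=len) if words else ""

def next_question(response: str, turn: int) -> str:
    """Return an elaborated follow-up when a topic is found, otherwise
    fall back to a fixed-form question (the baseline condition)."""
    topic = extract_topic(response)
    if topic:
        template = FOLLOW_UP_TEMPLATES[turn % len(FOLLOW_UP_TEMPLATES)]
        return template.format(topic=topic)
    return FIXED_QUESTIONS[turn % len(FIXED_QUESTIONS)]

if __name__ == "__main__":
    print(next_question("I led a project on speech recognition at my internship.", 0))
```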
{"title":"Job Interviewer Android with Elaborate Follow-up Question Generation","authors":"K. Inoue, Kohei Hara, Divesh Lala, Kenta Yamamoto, Shizuka Nakamura, K. Takanashi, Tatsuya Kawahara","doi":"10.1145/3382507.3418839","DOIUrl":"https://doi.org/10.1145/3382507.3418839","url":null,"abstract":"A job interview is a domain that takes advantage of an android robot's human-like appearance and behaviors. In this work, our goal is to implement a system in which an android plays the role of an interviewer so that users may practice for a real job interview. Our proposed system generates elaborate follow-up questions based on responses from the interviewee. We conducted an interactive experiment to compare the proposed system against a baseline system that asked only fixed-form questions. We found that this system was significantly better than the baseline system with respect to the impression of the interview and the quality of the questions, and that the presence of the android interviewer was enhanced by the follow-up questions. We also found a similar result when using a virtual agent interviewer, except that presence was not enhanced.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124107507","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Workshop on Interdisciplinary Insights into Group and Team Dynamics
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3419748
H. Hung, Gabriel Murray, G. Varni, N. Lehmann-Willenbrock, Fabiola H. Gerpott, Catharine Oertel
The study of group behavior in multimodal multiparty interactions has been gathering momentum over the last 10 years. While many works in the computer science community focus on the analysis of individual or dyadic interactions, we believe that the study of groups adds an additional layer of complexity with respect to how humans cooperate and what outcomes can be achieved in these settings. Moreover, the development of technologies that can help to interpret and enhance group behaviours dynamically is still an emerging field. Social theories that accompany the study of group dynamics are in their infancy, and there is a need for more interdisciplinary dialogue between computer scientists and social scientists on this topic. This workshop has been organised to facilitate those discussions and strengthen the bonds between these overlapping research communities.
{"title":"Workshop on Interdisciplinary Insights into Group and Team Dynamics","authors":"H. Hung, Gabriel Murray, G. Varni, N. Lehmann-Willenbrock, Fabiola H. Gerpott, Catharine Oertel","doi":"10.1145/3382507.3419748","DOIUrl":"https://doi.org/10.1145/3382507.3419748","url":null,"abstract":"There has been gathering momentum over the last 10 years in the study of group behavior in multimodal multiparty interactions. While many works in the computer science community focus on the analysis of individual or dyadic interactions, we believe that the study of groups adds an additional layer of complexity with respect to how humans cooperate and what outcomes can be achieved in these settings. Moreover, the development of technologies that can help to interpret and enhance group behaviours dynamically is still an emerging field. Social theories that accompany the study of groups dynamics are in their infancy and there is a need for more interdisciplinary dialogue between computer scientists and social scientists on this topic. This workshop has been organised to facilitate those discussions and strengthen the bonds between these overlapping research communities","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"204 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115468705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
The First International Workshop on Multi-Scale Movement Technologies
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3420060
Eleonora Ceccaldi, B. Bardy, N. Bianchi-Berthouze, L. Fadiga, G. Volpe, A. Camurri
Multimodal interfaces pose the challenge of dealing with the multiple interactive time-scales characterizing human behavior. To do this, innovative models and time-adaptive technologies are needed, operating at multiple time-scales and adopting a multi-layered approach. The first International Workshop on Multi-Scale Movement Technologies, hosted virtually during the 22nd ACM International Conference on Multimodal Interaction, is aimed at providing researchers from different areas with the opportunity to discuss this topic. This paper summarizes the activities of the workshop and the accepted papers.
{"title":"The First International Workshop on Multi-Scale Movement Technologies","authors":"Eleonora Ceccaldi, B. Bardy, N. Bianchi-Berthouze, L. Fadiga, G. Volpe, A. Camurri","doi":"10.1145/3382507.3420060","DOIUrl":"https://doi.org/10.1145/3382507.3420060","url":null,"abstract":"Multimodal interfaces pose the challenge of dealing with the multi-ple interactive time-scales characterizing human behavior. To dothis, innovative models and time-adaptive technologies are needed,operating at multiple time-scales and adopting a multi-layered ap-proach. The first International Workshop on Multi-Scale MovementTechnologies, hosted virtually during the 22nd ACM InternationalConference on Multimodal Interaction, is aimed at providing re-searchers from different areas with the opportunity to discuss thistopic. This paper summarizes the activities of the workshop andthe accepted papers","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131009184","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Social Affective Multimodal Interaction for Health
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3420059
Hiroki Tanaka, Satoshi Nakamura, Jean-Claude Martin, C. Pelachaud
This workshop discusses how interactive, multimodal technology such as virtual agents can be used in social skills training for measuring and training social-affective interactions. Sensing technology now enables analyzing user's behaviors and physiological signals. Various signal processing and machine learning methods can be used for such prediction tasks. Such social signal processing and tools can be applied to measure and reduce social stress in everyday situations, including public speaking at schools and workplaces.
{"title":"Social Affective Multimodal Interaction for Health","authors":"Hiroki Tanaka, Satoshi Nakamura, Jean-Claude Martin, C. Pelachaud","doi":"10.1145/3382507.3420059","DOIUrl":"https://doi.org/10.1145/3382507.3420059","url":null,"abstract":"This workshop discusses how interactive, multimodal technology such as virtual agents can be used in social skills training for measuring and training social-affective interactions. Sensing technology now enables analyzing user's behaviors and physiological signals. Various signal processing and machine learning methods can be used for such prediction tasks. Such social signal processing and tools can be applied to measure and reduce social stress in everyday situations, including public speaking at schools and workplaces.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"28 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128121459","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Estimating the Intensity of Facial Expressions Accompanying Feedback Responses in Multiparty Video-Mediated Communication
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418878
Ryosuke Ueno, Y. Nakano, Jie Zeng, Fumio Nihei
Providing feedback to a speaker is an essential communication signal for maintaining a conversation. In specific feedback, which indicates the listener's reaction to the speaker's utterances, facial expression is an effective modality for conveying the listener's reactions. Moreover, not only the type of facial expressions, but also the degree of intensity of the expressions, may influence the meaning of the specific feedback. In this study, we propose a multimodal deep neural network model that predicts the intensity of facial expressions co-occurring with feedback responses. We focus on multiparty video-mediated communication. In video-mediated communication, close-up frontal face images of each participant are continuously presented on the display; the attention of the participants is more likely to be drawn to the facial expressions. We assume that in such communication, the importance of facial expression in the listeners' feedback responses increases. We collected 33 video-mediated conversations by groups of three people and obtained audio and speech data for each participant. Using the corpus collected as a dataset, we created a deep neural network model that predicts the intensity of 17 types of action units (AUs) co-occurring with the feedback responses. The proposed method employed a GRU-based model with an attention mechanism for the audio, visual, and language modalities. A decoder was trained to produce the intensity values for the 17 AUs frame by frame. In the experiment, unimodal and multimodal models were compared in terms of their performance in predicting salient AUs that characterize facial expression in feedback responses. The results suggest that well-performing models differ depending on the AU categories; audio information was useful for predicting AUs that express happiness, and visual and language information contributed to predicting AUs expressing sadness and disgust.
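As a rough illustration of the architecture described above (per-modality GRU encoders, attention-based fusion of audio, visual, and language streams, and a decoder that emits frame-level intensities for 17 AUs), the following sketch is one plausible reading of the abstract; layer sizes, feature dimensions, and the fusion scheme are assumptions, not the authors' configuration.

```python
# Sketch of a GRU-based multimodal model with per-frame attention over
# three modalities, producing intensities for 17 AUs frame by frame.
import torch
import torch.nn as nn

class AUIntensityModel(nn.Module):
    def __init__(self, audio_dim=40, visual_dim=128, text_dim=300,
                 hidden=128, n_aus=17):
        super().__init__()
        self.audio_gru = nn.GRU(audio_dim, hidden, batch_first=True)
        self.visual_gru = nn.GRU(visual_dim, hidden, batch_first=True)
        self.text_gru = nn.GRU(text_dim, hidden, batch_first=True)
        # Attention weights over the three modalities, computed per frame.
        self.attn = nn.Linear(3 * hidden, 3)
        self.decoder = nn.GRU(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, n_aus)

    def forward(self, audio, visual, text):
        a, _ = self.audio_gru(audio)     # (B, T, H)
        v, _ = self.visual_gru(visual)
        t, _ = self.text_gru(text)
        stacked = torch.stack([a, v, t], dim=2)                 # (B, T, 3, H)
        weights = torch.softmax(
            self.attn(torch.cat([a, v, t], dim=-1)), dim=-1)    # (B, T, 3)
        fused = (weights.unsqueeze(-1) * stacked).sum(dim=2)    # (B, T, H)
        dec, _ = self.decoder(fused)
        return self.out(dec)             # (B, T, 17) AU intensities per frame

model = AUIntensityModel()
y = model(torch.randn(2, 50, 40), torch.randn(2, 50, 128), torch.randn(2, 50, 300))
print(y.shape)  # torch.Size([2, 50, 17])
```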
{"title":"Estimating the Intensity of Facial Expressions Accompanying Feedback Responses in Multiparty Video-Mediated Communication","authors":"Ryosuke Ueno, Y. Nakano, Jie Zeng, Fumio Nihei","doi":"10.1145/3382507.3418878","DOIUrl":"https://doi.org/10.1145/3382507.3418878","url":null,"abstract":"Providing feedback to a speaker is an essential communication signal for maintaining a conversation. In specific feedback, which indicates the listener's reaction to the speaker?s utterances, the facial expression is an effective modality for conveying the listener's reactions. Moreover, not only the type of facial expressions, but also the degree of intensity of the expressions, may influence the meaning of the specific feedback. In this study, we propose a multimodal deep neural network model that predicts the intensity of facial expressions co-occurring with feedback responses. We focus on multiparty video-mediated communication. In video-mediated communication, close-up frontal face images of each participant are continuously presented on the display; the attention of the participants is more likely to be drawn to the facial expressions. We assume that in such communication, the importance of facial expression in the listeners? feedback responses increases. We collected 33 video-mediated conversations by groups of three people and obtained audio and speech data for each participant. Using the corpus collected as a dataset, we created a deep neural network model that predicts the intensity of 17 types of action units (AUs) co-occurring with the feedback responses. The proposed method employed GRU-based model with attention mechanism for audio, visual, and language modalities. A decoder was trained to produce the intensity values for the 17 AUs frame by frame. In the experiment, unimodal and multimodal models were compared in terms of their performance in predicting salient AUs that characterize facial expression in feedback responses. The results suggest that well-performing models differ depending on the AU categories; audio information was useful for predicting AUs that express happiness, and visual and language information contributes to predicting AUs expressing sadness and disgust.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116892857","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
A Multi-modal System to Assess Cognition in Children from their Physical Movements
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418829
Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Ashish Jaiswal, Alexis Lueckenhoff, Maria Kyrarini, F. Makedon
In recent years, computer and game-based cognitive tests have become popular with the advancement in mobile technology. However, these tests require very little body movement and do not consider the influence that physical motion has on cognitive development. Our work mainly focuses on assessing cognition in children through their physical movements. Hence, an assessment test, "Ball-Drop-to-the-Beat," that is both physically and cognitively demanding has been used, in which the child is expected to perform certain actions based on commands. The task is specifically designed to measure attention, response inhibition, and coordination in children. A dataset has been created with 25 children performing this test. To automate the scoring, a computer vision-based assessment system has been developed. The vision system employs an attention-based fusion mechanism to combine multiple modalities such as optical flow, human poses, and objects in the scene to predict a child's action. The proposed method outperforms other state-of-the-art approaches, achieving an average accuracy of 89.8 percent on predicting the actions and an average accuracy of 88.5 percent on predicting the rhythm on the Ball-Drop-to-the-Beat dataset.
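The attention-based fusion of optical-flow, pose, and object features can be sketched as a weighted combination of modality embeddings followed by a linear action classifier; the dimensions, weights, and classifier below are illustrative assumptions, not the paper's implementation.

```python
# Illustrative sketch of attention-based late fusion over three modality
# embeddings (optical flow, pose, object features) for action prediction.
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def fuse_and_classify(flow_emb, pose_emb, obj_emb, attn_vec, clf_weights):
    """Weight each modality embedding by an attention score, sum them,
    and apply a linear classifier over action classes."""
    mods = np.stack([flow_emb, pose_emb, obj_emb])          # (3, D)
    scores = mods @ attn_vec                                 # (3,)
    weights = softmax(scores)
    fused = (weights[:, None] * mods).sum(axis=0)            # (D,)
    logits = clf_weights @ fused                             # (n_actions,)
    return int(np.argmax(logits)), weights

rng = np.random.default_rng(0)
D, n_actions = 64, 5
action, w = fuse_and_classify(rng.normal(size=D), rng.normal(size=D),
                              rng.normal(size=D), rng.normal(size=D),
                              rng.normal(size=(n_actions, D)))
print(action, w.round(2))  # predicted action index and modality weights
```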
{"title":"A Multi-modal System to Assess Cognition in Children from their Physical Movements","authors":"Ashwin Ramesh Babu, Mohammad Zaki Zadeh, Ashish Jaiswal, Alexis Lueckenhoff, Maria Kyrarini, F. Makedon","doi":"10.1145/3382507.3418829","DOIUrl":"https://doi.org/10.1145/3382507.3418829","url":null,"abstract":"In recent years, computer and game-based cognitive tests have become popular with the advancement in mobile technology. However, these tests require very little body movements and do not consider the influence that physical motion has on cognitive development. Our work mainly focus on assessing cognition in children through their physical movements. Hence, an assessment test \"Ball-Drop-to-the-Beat\" that is both physically and cognitively demanding has been used where the child is expected to perform certain actions based on the commands. The task is specifically designed to measure attention, response inhibition, and coordination in children. A dataset has been created with 25 children performing this test. To automate the scoring, a computer vision-based assessment system has been developed. The vision system employs an attention-based fusion mechanism to combine multiple modalities such as optical flow, human poses, and objects in the scene to predict a child's action. The proposed method outperforms other state-of-the-art approaches by achieving an average accuracy of 89.8 percent on predicting the actions and an average accuracy of 88.5 percent on predicting the rhythm on the Ball-Drop-to-the-Beat dataset.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"8 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123951935","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
Multimodal, Multiparty Modeling of Collaborative Problem Solving Performance
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418877
Shree Krishna Subburaj, Angela E. B. Stewart, A. Rao, S. D’Mello
Modeling team phenomena from multiparty interactions inherently requires combining signals from multiple teammates, often by weighting strategies. Here, we explored the hypothesis that strategic weighting of signals from individual teammates would outperform an equal-weighting baseline. Accordingly, we explored role-, trait-, and behavior-based weighting of behavioral signals across team members. We analyzed data from 101 triads engaged in computer-mediated collaborative problem solving (CPS) in an educational physics game. We investigated the accuracy of machine-learned models trained on facial expressions, acoustic-prosodics, eye gaze, and task context information, computed one minute prior to the end of a game level, at predicting success at solving that level. AUROCs for unimodal models that equally weighted features from the three teammates ranged from .54 to .67, whereas a combination of gaze, face, and task context features achieved an AUROC of .73. The various multiparty weighting strategies did not outperform an equal-weighting baseline. However, our best nonverbal model (AUROC = .73) outperformed a language-based model (AUROC = .67), and there were some advantages to combining the two (AUROC = .75). Finally, models aimed at prospectively predicting performance on a minute-by-minute basis from the start of the level achieved a lower, but still above-chance, AUROC of .60. We discuss implications for multiparty modeling of team performance and other team constructs.
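A minimal sketch of the weighting comparison, combining the three teammates' feature blocks with either equal or strategic weights before training a classifier and scoring it with AUROC, is shown below on simulated data; the features, weights, and classifier are placeholders, not the study's pipeline.

```python
# Sketch of equal vs. strategic weighting of per-teammate feature blocks
# before classification, evaluated with AUROC on simulated triad data.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(1)
n_teams, n_feats = 101, 8
# Simulated per-teammate feature blocks (e.g., gaze/face/task features).
X_team = rng.normal(size=(n_teams, 3, n_feats))
y = rng.integers(0, 2, size=n_teams)          # level solved or not

def combine(X, weights):
    """Weighted sum of the three teammates' feature blocks."""
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    return (w[None, :, None] * X).sum(axis=1)

for name, w in [("equal", [1, 1, 1]), ("behavior-weighted", [2, 1, 0.5])]:
    X = combine(X_team, w)
    Xtr, Xte, ytr, yte = train_test_split(X, y, test_size=0.3, random_state=0)
    clf = LogisticRegression(max_iter=1000).fit(Xtr, ytr)
    print(name, round(roc_auc_score(yte, clf.predict_proba(Xte)[:, 1]), 3))
```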
{"title":"Multimodal, Multiparty Modeling of Collaborative Problem Solving Performance","authors":"Shree Krishna Subburaj, Angela E. B. Stewart, A. Rao, S. D’Mello","doi":"10.1145/3382507.3418877","DOIUrl":"https://doi.org/10.1145/3382507.3418877","url":null,"abstract":"Modeling team phenomena from multiparty interactions inherently requires combining signals from multiple teammates, often by weighting strategies. Here, we explored the hypothesis that strategic weighting signals from individual teammates would outperform an equal weighting baseline. Accordingly, we explored role-, trait-, and behavior-based weighting of behavioral signals across team members. We analyzed data from 101 triads engaged in computer-mediated collaborative problem solving (CPS) in an educational physics game. We investigated the accuracy of machine-learned models trained on facial expressions, acoustic-prosodics, eye gaze, and task context information, computed one-minute prior to the end of a game level, at predicting success at solving that level. AUROCs for unimodal models that equally weighted features from the three teammates ranged from .54 to .67, whereas a combination of gaze, face, and task context features, achieved an AUROC of .73. The various multiparty weighting strategies did not outperform an equal-weighting baseline. However, our best nonverbal model (AUROC = .73) outperformed a language-based model (AUROC = .67), and there were some advantages to combining the two (AUROC = .75). Finally, models aimed at prospectively predicting performance on a minute-by-minute basis from the start of the level achieved a lower, but still above-chance, AUROC of .60. We discuss implications for multiparty modeling of team performance and other team constructs.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124422340","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
How to Complement Learning Analytics with Smartwatches?: Fusing Physical Activities, Environmental Context, and Learning Activities
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3421151
George-Petru Ciordas-Hertel
To obtain a holistic perspective on learning, a multimodal technical infrastructure for Learning Analytics (LA) can be beneficial. Recent studies have investigated various aspects of technical LA infrastructure. However, it has not yet been explored how LA indicators can be complemented with Smartwatch sensor data to detect physical activity and the environmental context. Sensor data, such as accelerometer readings, are often used in related work to infer a specific behavior and environmental context, thus triggering interventions on a just-in-time basis. In this dissertation project, we plan to use Smartwatch sensor data to explore further indicators for learning from blended learning sessions conducted in-the-wild, e.g., at home. Such indicators could be used within learning sessions to suggest breaks, or afterward to support learners in reflection processes. We plan to investigate the following three research questions: (RQ1) How can multimodal learning analytics infrastructure be designed to support real-time data acquisition and processing effectively? (RQ2) How can smartwatch sensor data be used to infer environmental context and physical activities to complement learning analytics indicators for blended learning sessions? (RQ3) How can we align the extracted multimodal indicators with pedagogical interventions? RQ1 was investigated through a structured literature review and eleven semi-structured interviews with LA infrastructure developers. To address RQ2, we are currently designing and implementing a multimodal learning analytics infrastructure to collect and process sensor and experience data from Smartwatches. Finally, to address RQ3, an exploratory field study will be conducted to extract multimodal learning indicators and examine them with learners and pedagogical experts to develop effective interventions. Researchers, educators, and learners can use and adapt our contributions to gain new insights into learners' time and learning tactics, and into physical learning spaces, from learning sessions taking place in-the-wild.
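As one illustration of how Smartwatch accelerometer data might yield a simple physical-activity indicator to complement LA indicators, the sketch below computes per-window movement intensity; the sampling rate, window length, threshold, and labels are assumptions made for illustration only.

```python
# Illustrative sketch: derive a coarse activity indicator from smartwatch
# accelerometer windows using the standard deviation of the magnitude.
import numpy as np

def activity_indicator(acc_xyz: np.ndarray, fs: int = 50, window_s: int = 10):
    """acc_xyz: (n_samples, 3) accelerometer readings in m/s^2.
    Returns a per-window label based on movement intensity."""
    mag = np.linalg.norm(acc_xyz, axis=1)
    win = fs * window_s
    labels = []
    for start in range(0, len(mag) - win + 1, win):
        intensity = mag[start:start + win].std()
        labels.append("active" if intensity > 1.0 else "sedentary")
    return labels

# Simulated near-still wrist data: gravity plus small noise.
acc = np.random.default_rng(2).normal(0, 0.3, size=(3000, 3)) + [0, 0, 9.81]
print(activity_indicator(acc)[:3])  # mostly "sedentary" for this simulation
```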
{"title":"How to Complement Learning Analytics with Smartwatches?: Fusing Physical Activities, Environmental Context, and Learning Activities","authors":"George-Petru Ciordas-Hertel","doi":"10.1145/3382507.3421151","DOIUrl":"https://doi.org/10.1145/3382507.3421151","url":null,"abstract":"To obtain a holistic perspective on learning, a multimodal technical infrastructure for Learning Analytics (LA) can be beneficial. Recent studies have investigated various aspects of technical LA infrastructure. However, it has not yet been explored how LA indicators can be complemented with Smartwatch sensor data to detect physical activity and the environmental context. Sensor data, such as the accelerometer, are often used in related work to infer a specific behavior and environmental context, thus triggering interventions on a just-in-time basis. In this dissertation project, we plan to use Smartwatch sensor data to explore further indicators for learning from blended learning sessions conducted in-the-wild, e.g., at home. Such indicators could be used within learning sessions to suggest breaks, or afterward to support learners in reflection processes. We plan to investigate the following three research questions: (RQ1) How can multimodal learning analytics infrastructure be designed to support real-time data acquisition and processing effectively?; (RQ2) how to use smartwatch sensor data to infer environmental context and physical activities to complement learning analytics indicators for blended learning sessions; and (RQ3) how can we align the extracted multimodal indicators with pedagogical interventions. RQ1 was investigated by a structured literature review and by conducting eleven semi-structured interviews with LA infrastructure developers. According to RQ2, we are currently designing and implementing a multimodal learning analytics infrastructure to collect and process sensor and experience data from Smartwatches. Finally, according to RQ3, an exploratory field study will be conducted to extract multimodal learning indicators and examine them with learners and pedagogical experts to develop effective interventions. Researchers, educators, and learners can use and adapt our contributions to gain new insights into learners' time and learning tactics, and physical learning spaces from learning sessions taking place in-the-wild.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"45 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121458245","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
FeetBack: Augmenting Robotic Telepresence with Haptic Feedback on the Feet
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418820
Brennan Jones, Jens Maiero, Alireza Mogharrab, I. A. Aguilar, Ashu Adhikari, B. Riecke, E. Kruijff, Carman Neustaedter, R. Lindeman
Telepresence robots allow people to participate in remote spaces, yet they can be difficult to manoeuvre with people and obstacles around. We designed a haptic-feedback system called "FeetBack," in which users place their feet when driving a telepresence robot. When the robot approaches people or obstacles, haptic proximity and collision feedback are provided on the respective sides of the feet, helping inform users about events that are hard to notice through the robot's camera views. We conducted two studies: one to explore the usage of FeetBack in virtual environments, the other focused on real environments. We found that FeetBack can increase spatial presence in simple virtual environments. Users valued the feedback for adjusting their behaviour in both types of environments, though it was sometimes too frequent or unneeded in certain situations after a period of time. These results point to the value of foot-based haptic feedback for telepresence robot systems, while also highlighting the need to design context-sensitive haptic feedback.
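A minimal sketch of the proximity-to-haptics mapping implied by this description, with one distance reading per side, a linear ramp between near and far thresholds, and full intensity on collision, is given below; the thresholds and the per-foot output format are hypothetical, not the FeetBack implementation.

```python
# Minimal sketch of mapping proximity readings to per-foot haptic intensity;
# thresholds and the actuator interface are hypothetical.
def haptic_intensity(distance_m: float,
                     near: float = 0.3, far: float = 1.5) -> float:
    """Return 0..1 vibration intensity: 0 beyond `far`, 1 at or inside `near`,
    linearly ramping in between."""
    if distance_m <= near:
        return 1.0
    if distance_m >= far:
        return 0.0
    return (far - distance_m) / (far - near)

def feedback_for_sides(left_dist: float, right_dist: float,
                       collision_left: bool, collision_right: bool):
    """Collision overrides proximity with full-intensity feedback."""
    left = 1.0 if collision_left else haptic_intensity(left_dist)
    right = 1.0 if collision_right else haptic_intensity(right_dist)
    return {"left_foot": left, "right_foot": right}

print(feedback_for_sides(0.8, 2.0, False, False))
# left foot ramps up (about 0.58), right foot stays silent (0.0)
```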
{"title":"FeetBack: Augmenting Robotic Telepresence with Haptic Feedback on the Feet","authors":"Brennan Jones, Jens Maiero, Alireza Mogharrab, I. A. Aguilar, Ashu Adhikari, B. Riecke, E. Kruijff, Carman Neustaedter, R. Lindeman","doi":"10.1145/3382507.3418820","DOIUrl":"https://doi.org/10.1145/3382507.3418820","url":null,"abstract":"Telepresence robots allow people to participate in remote spaces, yet they can be difficult to manoeuvre with people and obstacles around. We designed a haptic-feedback system called \"FeetBack,\" which users place their feet in when driving a telepresence robot. When the robot approaches people or obstacles, haptic proximity and collision feedback are provided on the respective sides of the feet, helping inform users about events that are hard to notice through the robot's camera views. We conducted two studies: one to explore the usage of FeetBack in virtual environments, another focused on real environments. We found that FeetBack can increase spatial presence in simple virtual environments. Users valued the feedback to adjust their behaviour in both types of environments, though it was sometimes too frequent or unneeded for certain situations after a period of time. These results point to the value of foot-based haptic feedback for telepresence robot systems, while also the need to design context-sensitive haptic feedback.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121537174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
Enhancing Affect Detection in Game-Based Learning Environments with Multimodal Conditional Generative Modeling
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418892
Nathan L. Henderson, Wookhee Min, Jonathan P. Rowe, James C. Lester
Accurately detecting and responding to student affect is a critical capability for adaptive learning environments. Recent years have seen growing interest in modeling student affect with multimodal sensor data. A key challenge in multimodal affect detection is dealing with data loss due to noisy, missing, or invalid multimodal features. Because multimodal affect detection often requires large quantities of data, data loss can have a strong, adverse impact on affect detector performance. To address this issue, we present a multimodal data imputation framework that utilizes conditional generative models to automatically impute posture and interaction log data from student interactions with a game-based learning environment for emergency medical training. We investigate two generative models, a Conditional Generative Adversarial Network (C-GAN) and a Conditional Variational Autoencoder (C-VAE), that are trained using a modality that has undergone varying levels of artificial data masking. The generative models are conditioned on the corresponding intact modality, enabling the data imputation process to capture the interaction between the concurrent modalities. We examine the effectiveness of the conditional generative models on imputation accuracy and its impact on the performance of affect detection. Each imputation model is evaluated using varying amounts of artificial data masking to determine how the data missingness impacts the performance of each imputation method. Results based on the modalities captured from students' interactions with the game-based learning environment indicate that deep conditional generative models within a multimodal data imputation framework yield significant benefits compared to baseline imputation techniques in terms of both imputation accuracy and affective detector performance.
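A conditional VAE imputer of the kind described, which generates a masked modality conditioned on the intact one, can be sketched as follows; the dimensions, loss, and inference-time sampling are assumptions rather than the authors' exact C-VAE.

```python
# Sketch of a conditional VAE that imputes a masked modality (e.g., posture
# features) conditioned on an intact modality (e.g., interaction-log features).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CVAEImputer(nn.Module):
    def __init__(self, target_dim=32, cond_dim=16, latent_dim=8, hidden=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(target_dim + cond_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, latent_dim)
        self.logvar = nn.Linear(hidden, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim + cond_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, target_dim))

    def forward(self, x_target, x_cond):
        h = self.enc(torch.cat([x_target, x_cond], dim=-1))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick.
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        recon = self.dec(torch.cat([z, x_cond], dim=-1))
        return recon, mu, logvar

    def impute(self, x_cond):
        # At test time the masked modality is sampled from the prior,
        # conditioned only on the intact modality.
        z = torch.randn(x_cond.size(0), self.mu.out_features)
        return self.dec(torch.cat([z, x_cond], dim=-1))

def vae_loss(recon, target, mu, logvar):
    rec = F.mse_loss(recon, target)
    kld = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

model = CVAEImputer()
x_t, x_c = torch.randn(4, 32), torch.randn(4, 16)
recon, mu, logvar = model(x_t, x_c)
print(vae_loss(recon, x_t, mu, logvar).item(), model.impute(x_c).shape)
```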
{"title":"Enhancing Affect Detection in Game-Based Learning Environments with Multimodal Conditional Generative Modeling","authors":"Nathan L. Henderson, Wookhee Min, Jonathan P. Rowe, James C. Lester","doi":"10.1145/3382507.3418892","DOIUrl":"https://doi.org/10.1145/3382507.3418892","url":null,"abstract":"Accurately detecting and responding to student affect is a critical capability for adaptive learning environments. Recent years have seen growing interest in modeling student affect with multimodal sensor data. A key challenge in multimodal affect detection is dealing with data loss due to noisy, missing, or invalid multimodal features. Because multimodal affect detection often requires large quantities of data, data loss can have a strong, adverse impact on affect detector performance. To address this issue, we present a multimodal data imputation framework that utilizes conditional generative models to automatically impute posture and interaction log data from student interactions with a game-based learning environment for emergency medical training. We investigate two generative models, a Conditional Generative Adversarial Network (C-GAN) and a Conditional Variational Autoencoder (C-VAE), that are trained using a modality that has undergone varying levels of artificial data masking. The generative models are conditioned on the corresponding intact modality, enabling the data imputation process to capture the interaction between the concurrent modalities. We examine the effectiveness of the conditional generative models on imputation accuracy and its impact on the performance of affect detection. Each imputation model is evaluated using varying amounts of artificial data masking to determine how the data missingness impacts the performance of each imputation method. Results based on the modalities captured from students? interactions with the game-based learning environment indicate that deep conditional generative models within a multimodal data imputation framework yield significant benefits compared to baseline imputation techniques in terms of both imputation accuracy and affective detector performance.","PeriodicalId":402394,"journal":{"name":"Proceedings of the 2020 International Conference on Multimodal Interaction","volume":"17 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-10-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116793511","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2