
Companion Publication of the 2020 International Conference on Multimodal Interaction: Latest Publications

4th ICMI Workshop on Bridging Social Sciences and AI for Understanding Child Behaviour
Heysem Kaya, Anouk Neerincx, Maryam Najafian, Saeid Safavi
Analysing and understanding child behaviour is a topic of great scientific interest across a wide range of disciplines, including the social sciences and artificial intelligence (AI). Knowledge in these diverse fields is not yet integrated to its full potential, and the aim of this workshop is to bring researchers from these fields together. The first three editions of the workshop had a significant impact. In this edition, we discussed topics such as the use of AI techniques to better examine and model interactions and children’s emotional development, and the analysis of head movement patterns with respect to child age. The 2023 edition of the workshop is a successful new step towards the objective of bridging social sciences and AI, attracting contributions on child behaviour analysis from various academic fields. We see that atypical child development holds an important place in child behaviour research: while gaze and joint attention are widely studied in the visual domain, the speech and physiological signals of atypically developing children are shown to provide valuable cues that motivate future work. This document summarizes the WoCBU’23 workshop, including the review process, the keynote talks, and the accepted papers.
Citations: 0
A New Theory of Data Processing: Applying Artificial Intelligence to Cognition and Humanity
Jingwei Liu
Traditional data processing uses the machine as a passive feature detector or classifier for a given fixed dataset. However, we contend that this is not how humans understand and process data from the real world. Based on active inference, we propose a neural network model that actively processes incoming data using predictive processing and actively samples the inputs from the environment that conform to its internal representations. The model we adopt is the Helmholtz machine, a close parallel to the hierarchical model of the brain and the forward-backward connections of the cortex; it thus provides a biologically plausible implementation of brain functions such as predictive processing, hierarchical message passing, and predictive coding in a machine-learning context. In addition, active sampling can be incorporated into the model via the generative end as an interaction of the agent with the external world. Active sampling of the environment directly draws on environmental salience and cultural niche construction. By studying a coupled multi-agent model that constructs a “desire path” as part of a cultural niche, we find a plausible way of explaining and simulating various problems involving group flow, social interactions, shared cultural practices, and thinking through other minds.
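The abstract describes the architecture only in prose. As a concrete illustration, a minimal one-hidden-layer Helmholtz machine trained with the classic wake-sleep algorithm might look like the sketch below; the layer sizes, learning rate, and binary toy data are assumptions for illustration, not details from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def sample(p):
    # Draw binary states from element-wise Bernoulli probabilities.
    return (rng.random(p.shape) < p).astype(float)

# Hypothetical sizes: 8 latent (hidden) units, 16 visible units.
n_h, n_v = 8, 16
R = np.zeros((n_h, n_v + 1))   # recognition weights (bottom-up), +1 for bias
G = np.zeros((n_v, n_h + 1))   # generative weights (top-down), +1 for bias
g_b = np.zeros(n_h)            # generative bias of the latent layer
lr = 0.05

def wake_sleep_step(v):
    # Wake phase: recognise the datum, then train the generative weights
    # to reproduce it from the inferred latent state.
    h = sample(sigmoid(R @ np.append(v, 1.0)))
    p_v = sigmoid(G @ np.append(h, 1.0))
    G[:] += lr * np.outer(v - p_v, np.append(h, 1.0))
    g_b[:] += lr * (h - sigmoid(g_b))
    # Sleep phase: dream a fantasy from the generative model, then train
    # the recognition weights to recover the latent cause of the dream.
    h_s = sample(sigmoid(g_b))
    v_s = sample(sigmoid(G @ np.append(h_s, 1.0)))
    p_h = sigmoid(R @ np.append(v_s, 1.0))
    R[:] += lr * np.outer(h_s - p_h, np.append(v_s, 1.0))

# Usage: fit random binary patterns (stand-ins for sensory data).
data = sample(np.full((100, n_v), 0.3))
for epoch in range(50):
    for v in data:
        wake_sleep_step(v)
```

The wake phase trains the generative (top-down) pathway on recognised causes, and the sleep phase trains the recognition (bottom-up) pathway on self-generated fantasies, which is what makes the model a candidate account of forward-backward cortical processing.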
Citations: 0
The FineMotion entry to the GENEA Challenge 2023: DeepPhase for conversational gestures generation
Vladislav Korzun, Anna Beloborodova, Arkady Ilin
This paper describes FineMotion’s entry to the GENEA Challenge 2023. We explore the potential of DeepPhase embeddings by adapting neural motion controllers to conversational gesture generation, which is achieved by introducing a recurrent encoder for control features. We additionally use VQ-VAE codebook encoding of gestures to support the dyadic setup. The resulting system generates stable, realistic motion controllable by audio, text, and the interlocutor’s motion.
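The entry’s exact network is not given here, but the VQ-VAE codebook encoding it mentions can be illustrated with a minimal vector-quantization layer: nearest-code lookup, the standard codebook and commitment losses, and a straight-through gradient. The codebook size, feature dimensionality, and input shapes below are assumptions, not the authors’ settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VectorQuantizer(nn.Module):
    """Minimal VQ-VAE codebook lookup with a straight-through gradient."""
    def __init__(self, num_codes=512, code_dim=64, beta=0.25):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, code_dim)
        self.codebook.weight.data.uniform_(-1 / num_codes, 1 / num_codes)
        self.beta = beta

    def forward(self, z):                      # z: (batch, time, code_dim)
        flat = z.reshape(-1, z.shape[-1])
        # Squared distance to every code, then nearest-neighbour lookup.
        d = (flat.pow(2).sum(1, keepdim=True)
             - 2 * flat @ self.codebook.weight.t()
             + self.codebook.weight.pow(2).sum(1))
        idx = d.argmin(dim=1)
        q = self.codebook(idx).view_as(z)
        # Codebook and commitment terms of the VQ-VAE objective.
        loss = F.mse_loss(q, z.detach()) + self.beta * F.mse_loss(z, q.detach())
        # Straight-through estimator: copy gradients from q back to z.
        q = z + (q - z).detach()
        return q, idx.view(z.shape[:-1]), loss

# Usage on a hypothetical batch of per-frame gesture features.
vq = VectorQuantizer()
frames = torch.randn(2, 100, 64)   # (batch, frames, features), made-up shape
quantized, codes, vq_loss = vq(frames)
```

Quantizing gestures to a discrete code sequence is what makes it straightforward to condition generation on a second speaker in a dyadic setup, since the interlocutor’s motion becomes just another token stream.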
Citations: 1
From Natural to Non-Natural Interaction: Embracing Interaction Design Beyond the Accepted Convention of Natural
Radu-Daniel Vatavu
Natural interactions feel intuitive, familiar, and well matched to the task, the user’s abilities, and the context. Consequently, a wealth of scientific research has been conducted on natural interaction with computer systems. Contrary to the conventional mainstream, we advocate for “non-natural interaction design” as a transformative, creative process that results in highly usable and effective interactions by deliberately deviating from users’ expectations and experience of engaging with the physical world. The non-natural approach to interaction design provokes a departure from the established notion of the “natural,” all the while prioritizing usability, albeit against a backdrop of the unconventional, unexpected, and intriguing.
Citations: 0
Evaluating the Potential of Caption Activation to Mitigate Confusion Inferred from Facial Gestures in Virtual Meetings
Melanie Heck, Jinhee Jeong, Christian Becker
Following the COVID-19 pandemic, virtual meetings have not only become an integral part of collaboration, but are now also a popular tool for disseminating information to large audiences through webinars, online lectures, and the like. Ideally, meeting participants should understand the discussed topics as smoothly as in physical encounters. However, many experience confusion but are hesitant to express their doubts. In this paper, we present the results of a user study with 45 Google Meet users that investigates how auto-generated captions can be used to improve comprehension. The results show that captions can help overcome confusion caused by language barriers, but not when it results from distorted words. To mitigate negative side effects, such as occlusion of important visual information when captions are not strictly needed, we propose to activate them dynamically, only when a user actually experiences confusion. To determine which instances require captioning, we test whether subliminal cues from facial gestures can be used to detect confusion. We confirm that confusion activates six facial action units (AU4, AU6, AU7, AU10, AU17, and AU23).
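As a hedged illustration of the proposed dynamic activation, the sketch below toggles captions when several of the six confusion-linked action units fire together over a short window. The intensity source, thresholds, voting rule, and smoothing window are assumptions, not the authors’ pipeline; intensities could come from a facial-analysis tool that reports AUs on a 0–5 scale.

```python
from collections import deque

# The six action units the study links to confusion.
CONFUSION_AUS = ("AU04", "AU06", "AU07", "AU10", "AU17", "AU23")
INTENSITY_THRESHOLD = 1.5   # hypothetical per-AU activation cutoff
MIN_ACTIVE_AUS = 3          # hypothetical vote: how many AUs must fire
WINDOW = 15                 # hypothetical smoothing window, in frames

recent = deque(maxlen=WINDOW)

def update_captions(au_intensities: dict) -> bool:
    """Return True if captions should be shown for this frame."""
    fired = sum(au_intensities.get(au, 0.0) >= INTENSITY_THRESHOLD
                for au in CONFUSION_AUS)
    recent.append(fired >= MIN_ACTIVE_AUS)
    # Show captions only if confusion persists over most of the window,
    # which avoids flicker from single-frame false positives.
    return sum(recent) > len(recent) // 2

# Usage with one made-up frame of AU intensities:
frame = {"AU04": 2.1, "AU06": 1.8, "AU07": 1.7, "AU10": 0.4}
show = update_captions(frame)
```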
Citations: 0
Interpreting Sign Language Recognition using Transformers and MediaPipe Landmarks
Cristina Luna-Jiménez, Manuel Gil-Martín, Ricardo Kleinlein, Rubén San-Segundo, Fernando Fernández-Martínez
Sign Language Recognition (SLR) is a challenging task that aims to bridge the communication gap between the deaf and hearing communities. In recent years, deep learning-based approaches have shown promising results in SLR. However, the lack of interpretability remains a significant challenge. In this paper, we seek to understand which hand and pose MediaPipe landmarks a Transformer model deems most important for prediction. We propose to embed into the model a learnable array of parameters that performs an element-wise multiplication of the inputs. This learned array highlights the most informative input features that contribute to solving the recognition task, resulting in a human-interpretable vector that lets us interpret the model’s predictions. We evaluate our approach on the public WLASL100 (SLR) and IPNHand (gesture recognition) datasets. We believe that the insights gained in this way could be exploited for the development of more efficient SLR pipelines.
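The interpretability mechanism described above, a learnable array multiplied element-wise with the inputs, can be sketched in a few lines of PyTorch. The surrounding Transformer, the landmark count, and all dimensions below are illustrative assumptions rather than the paper’s configuration.

```python
import torch
import torch.nn as nn

class LandmarkGate(nn.Module):
    """Learnable per-feature weights multiplied element-wise with the input,
    so the trained values can be read off as feature importances."""
    def __init__(self, num_features: int):
        super().__init__()
        self.gate = nn.Parameter(torch.ones(num_features))

    def forward(self, x):            # x: (batch, frames, num_features)
        return x * self.gate

# Hypothetical wiring: gate flattened MediaPipe hand landmarks (21 points
# with x, y, z each) before a small Transformer encoder.
num_features = 21 * 3
gate = LandmarkGate(num_features)
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=num_features, nhead=7,
                               batch_first=True),
    num_layers=2,
)

x = torch.randn(4, 32, num_features)      # (batch, frames, features)
out = encoder(gate(x))

# After training, the magnitude of each gate entry indicates how much the
# corresponding landmark coordinate contributed to the predictions.
importances = gate.gate.detach().abs()
```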
Citations: 0
Estimation of Violin Bow Pressure Using Photo-Reflective Sensors
Yurina Mizuho, Riku Kitamura, Yuta Sugiura
The violin is one of the most popular instruments, but it is hard to learn. The bowing of the right hand is a crucial factor in determining tone quality, but it is highly complex to master, teach, and reproduce. Therefore, many studies have attempted to measure and analyze violin bowing to help record performances and support practice. This work aims to measure bow pressure, one of the parameters of the bowing motion.
Citations: 0
Can empathy affect the attribution of mental states to robots?
Cristina Gena, Francesca Manini, Antonio Lieto, Alberto Lillo, Fabiana Vernero
This paper presents an experimental study showing that the humanoid robot NAO, in a condition already validated with regard to its capacity to trigger situational empathy in humans, is able to stimulate the attribution of mental states towards itself. Indeed, the results show that participants not only experienced empathy towards NAO when the robot was afraid of losing its memory due to a malfunction, but also attributed higher scores to the robot’s emotional intelligence in the Attribution of Mental State Questionnaire, in comparison with users in the control condition. This result suggests a possible correlation between empathy toward the robot and humans’ attribution of mental states to it.
Citations: 0
SHAP-based Prediction of Mother's History of Depression to Understand the Influence on Child Behavior
Maneesh Bilalpur, Saurabh Hinduja, Laura Cariola, Lisa Sheeber, Nicholas Allen, Louis-Philippe Morency, Jeffrey F. Cohn
Depression strongly impacts parents’ behavior. Does parents’ depression strongly affect the behavior of their children as well? To investigate this question, we compared dyadic interactions between 73 depressed and 75 non-depressed mothers and their adolescent children. Families were of low income, and 84% were white. Child behavior was measured from audio-video recordings using manual annotation of verbal and nonverbal behavior by expert coders, and using multimodal computational measures of facial expression, face and head dynamics, prosody, speech behavior, and linguistics. For both sets of measures, we used Support Vector Machines. For the computational measures, we investigated the relative contribution of single versus multiple modalities using a novel approach to SHapley Additive exPlanations (SHAP). Computational measures outperformed manual ratings by human experts. Among the individual computational measures, prosody was the most informative. SHAP reduction resulted in a four-fold decrease in the number of features and the highest performance (77% accuracy; positive and negative agreement at 75% and 76%, respectively). These findings suggest that maternal depression strongly impacts the behavior of adolescent children; differences are most evident in prosody; and multimodal features together with SHAP reduction are most powerful.
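As an illustration of SHAP-based feature reduction with an SVM, the sketch below ranks features by mean absolute SHAP value and retrains on the top slice. The synthetic data, RBF kernel, background-sample size, and the cut mirroring the reported four-fold reduction are all assumptions, not the authors’ code.

```python
import numpy as np
import shap
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in for the multimodal feature matrix.
X, y = make_classification(n_samples=300, n_features=40, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

svm = SVC(kernel="rbf").fit(X_tr, y_tr)

# KernelExplainer is model-agnostic, so it works with an SVM; a small
# background sample keeps it tractable. decision_function gives a single
# output, so shap_values returns one (samples, features) array.
background = shap.sample(X_tr, 50, random_state=0)
explainer = shap.KernelExplainer(svm.decision_function, background)
shap_values = explainer.shap_values(X_te[:20])

# Rank features by mean absolute SHAP value and keep the top slice
# (40 -> 10, echoing the four-fold reduction reported above).
importance = np.abs(shap_values).mean(axis=0)
top = np.argsort(importance)[::-1][:10]

svm_reduced = SVC(kernel="rbf").fit(X_tr[:, top], y_tr)
print("reduced-feature accuracy:", svm_reduced.score(X_te[:, top], y_te))
```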
Citations: 0
Towards Autonomous Physiological Signal Extraction From Thermal Videos Using Deep Learning
Kapotaksha Das, Mohamed Abouelenien, Mihai G. Burzo, John Elson, Kwaku Prakah-Asante, Clay Maranville
Using the thermal modality to extract physiological signals as a noncontact means of remote monitoring is gaining traction in applications such as healthcare monitoring. However, existing methods rely heavily on traditional tracking and mostly unsupervised signal processing methods, which can be significantly affected by noise and subjects’ movements. Using a novel deep learning architecture based on convolutional long short-term memory networks on a diverse dataset of 36 subjects, we present a personalized approach to extract multimodal signals, including heart rate, respiration rate, and body temperature, from thermal videos. We perform multimodal signal extraction for subjects in states of both active speaking and silence, requiring no parameter tuning in an end-to-end deep learning approach with automatic feature extraction. We experiment with different data sampling methods for training our deep learning models, as well as with different network designs. Our results indicate the effectiveness and improved efficiency of the proposed models, which reach more than 90% accuracy given proper training data for each subject.
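The paper’s convolutional LSTM is not specified here; as a simplified stand-in, the sketch below runs a per-frame CNN encoder followed by an LSTM over time, with a three-way regression head for heart rate, respiration rate, and body temperature. All layer sizes, the clip shape, and the frame rate are assumptions.

```python
import torch
import torch.nn as nn

class ThermalVitalsNet(nn.Module):
    """Simplified stand-in for a convolutional LSTM: a per-frame CNN
    encoder followed by an LSTM over time, with one regression output per
    signal (heart rate, respiration rate, body temperature)."""
    def __init__(self, hidden=128):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(16, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),      # -> 32*4*4 = 512
        )
        self.lstm = nn.LSTM(512, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 3)   # HR, RR, body temperature

    def forward(self, clip):               # clip: (batch, frames, 1, H, W)
        b, t = clip.shape[:2]
        feats = self.cnn(clip.flatten(0, 1)).view(b, t, -1)
        out, _ = self.lstm(feats)
        return self.head(out[:, -1])       # predict from the last time step

# Usage on a made-up 5-second, 8 fps thermal clip at 64x64 resolution.
model = ThermalVitalsNet()
clip = torch.randn(2, 40, 1, 64, 64)
hr_rr_temp = model(clip)                    # (2, 3)
```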
Citations: 0