
Proceedings of the 2020 International Conference on Multimodal Interaction: Latest Publications

Going with our Guts: Potentials of Wearable Electrogastrography (EGG) for Affect Detection
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418882
Angela Vujic, S. Tong, Rosalind W. Picard, P. Maes
A hard challenge for wearable systems is to measure differences in emotional valence, i.e. positive and negative affect via physiology. However, the stomach or gastric signal is an unexplored modality that could offer new affective information. We created a wearable device and software to record gastric signals, known as electrogastrography (EGG). An in-laboratory study was conducted to compare EGG with electrodermal activity (EDA) in 33 individuals viewing affective stimuli. We found that negative stimuli attenuate EGG's indicators of parasympathetic activation, or "rest and digest" activity. We compare EGG to the remaining physiological signals and describe implications for affect detection. Further, we introduce how wearable EGG may support future applications in areas as diverse as reducing nausea in virtual reality and helping treat emotion-related eating disorders.
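The abstract does not spell out how the gastric signal is quantified, but EGG analysis commonly looks at power in the normogastric band (roughly 2-4 cycles per minute). The Python sketch below is a rough illustration of that idea rather than the authors' pipeline; the sampling rate, filter order, and band edges are assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt, welch

def normogastric_power(egg, fs, low_cpm=2.0, high_cpm=4.0):
    """Power of a raw EGG trace in the normogastric band.

    egg: 1-D array of EGG samples; fs: sampling rate in Hz.
    The 2-4 cycles-per-minute band and the filter order are
    illustrative assumptions, not the paper's settings.
    """
    low_hz, high_hz = low_cpm / 60.0, high_cpm / 60.0
    b, a = butter(2, [low_hz, high_hz], btype="bandpass", fs=fs)
    filtered = filtfilt(b, a, egg)
    freqs, psd = welch(filtered, fs=fs, nperseg=min(len(filtered), 1024))
    band = (freqs >= low_hz) & (freqs <= high_hz)
    return float(np.sum(psd[band]) * (freqs[1] - freqs[0]))

# Toy example: ten minutes of synthetic EGG at 2 Hz with a 3 cpm rhythm
fs = 2.0
t = np.arange(0, 600, 1.0 / fs)
egg = 0.5 * np.sin(2 * np.pi * (3.0 / 60.0) * t) + 0.1 * np.random.randn(len(t))
print(normogastric_power(egg, fs))
```

A drop in such band power after negative stimuli would be consistent with the attenuated "rest and digest" indicators the paper reports.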
Citations: 11
Automated Time Synchronization of Cough Events from Multimodal Sensors in Mobile Devices
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418855
Tousif Ahmed, M. Y. Ahmed, Md. Mahbubur Rahman, Ebrahim Nemati, Bashima Islam, K. Vatanparvar, Viswam Nathan, Daniel McCaffrey, Jilong Kuang, J. Gao
Tracking the type and frequency of cough events is critical for monitoring respiratory diseases. Coughs are one of the most common symptoms of respiratory and infectious diseases like COVID-19, and a cough monitoring system could have been vital for remote monitoring during a pandemic like COVID-19. While the existing solutions for cough monitoring use unimodal (e.g., audio) approaches for detecting coughs, a fusion of multimodal sensors (e.g., audio and accelerometer) from multiple devices (e.g., phone and watch) is likely to reveal additional insights and can help to track the exacerbation of respiratory conditions. However, such multimodal and multidevice fusion requires accurate time synchronization, which could be challenging for coughs as coughs are usually brief events (0.3-0.7 seconds). In this paper, we first demonstrate the time synchronization challenges for cough events based on the cough data collected from two studies. Then we highlight the performance of a cross-correlation based time synchronization algorithm on the alignment of cough events. Our algorithm can synchronize 98.9% of cough events from two devices with an average synchronization error of 0.046 s.
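The paper's alignment step is cross-correlation based; the sketch below shows the core idea of estimating a lag between two activity envelopes, say short-time audio energy from the phone and accelerometer magnitude from the watch, from the peak of their normalized cross-correlation. The signal names and the assumption that both streams share a common sampling rate are mine, and this is not the authors' implementation.

```python
import numpy as np

def estimate_offset(sig_a, sig_b, fs):
    """Estimate the time offset (seconds) between two 1-D envelopes sampled
    at the same rate fs (Hz), via the peak of their full cross-correlation.
    A positive value means the event occurs later in sig_a than in sig_b."""
    a = (sig_a - sig_a.mean()) / (sig_a.std() + 1e-9)
    b = (sig_b - sig_b.mean()) / (sig_b.std() + 1e-9)
    xcorr = np.correlate(a, b, mode="full")
    lag = np.argmax(xcorr) - (len(b) - 1)   # lag in samples
    return lag / fs

# Toy example: the same burst, delayed by 0.05 s in the first signal (100 Hz)
fs = 100
t = np.arange(0, 2, 1 / fs)
burst = np.exp(-((t - 1.0) ** 2) / 0.001)
delayed = np.roll(burst, int(0.05 * fs))
print(estimate_offset(delayed, burst, fs))  # ~ +0.05
```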
Citations: 15
Towards Real-Time Multimodal Emotion Recognition among Couples
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3421154
George Boateng
Researchers are interested in understanding the emotions of couples as it relates to relationship quality and dyadic management of chronic diseases. Currently, the process of assessing emotions is manual, time-intensive, and costly. Despite the existence of works on emotion recognition among couples, there exists no ubiquitous system that recognizes the emotions of couples in everyday life while addressing the complexity of dyadic interactions such as turn-taking in couples' conversations. In this work, we seek to develop a smartwatch-based system that leverages multimodal sensor data to recognize each partner's emotions in daily life. We are collecting data from couples in the lab and in the field and we plan to use the data to develop multimodal machine learning models for emotion recognition. Then, we plan to implement the best models in a smartwatch app and evaluate its performance in real-time and everyday life through another field study. Such a system could enable research both in the lab (e.g. couple therapy) and in daily life (assessment of chronic disease management or relationship quality) and enable interventions to improve the emotional well-being, relationship quality, and chronic disease management of couples.
Citations: 8
LASO: Exploiting Locomotive and Acoustic Signatures over the Edge to Annotate IMU Data for Human Activity Recognition
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418826
S. Chatterjee, Avijoy Chakma, A. Gangopadhyay, Nirmalya Roy, Bivas Mitra, Sandip Chakraborty
Annotated IMU sensor data from smart devices and wearables are essential for developing supervised models for fine-grained human activity recognition, although generating sufficient annotated data for diverse human activities under different environments is challenging. Existing approaches primarily use human-in-the-loop based techniques, including active learning; however, they are tedious, costly, and time-consuming. Leveraging the availability of acoustic data from embedded microphones on the data collection devices, in this paper, we propose LASO, a multimodal approach for automated data annotation from acoustic and locomotive information. LASO works on the edge device itself, ensuring that only the annotated IMU data is collected while the acoustic data is discarded on the device, hence preserving the audio-privacy of the user. In the absence of any pre-existing labeling information, such auto-annotation is challenging as the IMU data needs to be sessionized for different time-scaled activities in a completely unsupervised manner. We use a change-point detection technique while synchronizing the locomotive information from the IMU data with the acoustic data, and then use pre-trained audio-based activity recognition models for labeling the IMU data while handling the acoustic noise. LASO efficiently annotates IMU data, without any explicit human intervention, with a mean accuracy of 0.93 (±0.04) and 0.78 (±0.05) for two different real-life datasets from workshop and kitchen environments, respectively.
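The sessionization step rests on change-point detection over the IMU stream. The sketch below shows one simple mean-shift detector on an accelerometer-magnitude signal; the window size, threshold, and detector choice are illustrative assumptions standing in for whatever detector LASO actually uses.

```python
import numpy as np

def change_points(signal, window=50, threshold=1.5):
    """Flag indices where the mean of the signal shifts between two adjacent
    windows by more than `threshold` pooled standard deviations. A simple
    stand-in for the change-point step described in the paper; the window
    size and threshold are illustrative assumptions."""
    points = []
    for i in range(window, len(signal) - window):
        left = signal[i - window:i]
        right = signal[i:i + window]
        pooled_std = np.sqrt((left.var() + right.var()) / 2) + 1e-9
        if abs(right.mean() - left.mean()) > threshold * pooled_std:
            # keep only the first index of each run of detections
            if not points or i - points[-1] > window:
                points.append(i)
    return points

# Toy accelerometer-magnitude stream: rest, activity, rest
rng = np.random.default_rng(0)
stream = np.concatenate([rng.normal(1.0, 0.05, 300),
                         rng.normal(2.5, 0.30, 300),
                         rng.normal(1.0, 0.05, 300)])
print(change_points(stream))  # detections near the transitions around 300 and 600
```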
Citations: 8
Is She Truly Enjoying the Conversation?: Analysis of Physiological Signals toward Adaptive Dialogue Systems
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418844
Shun Katada, S. Okada, Yuki Hirano, Kazunori Komatani
In human-agent interactions, it is necessary for the systems to identify the current emotional state of the user to adapt their dialogue strategies. Nevertheless, this task is challenging because the current emotional states are not always expressed in a natural setting and change dynamically. Recent accumulated evidence has indicated the usefulness of physiological modalities to realize emotion recognition. However, the contribution of the time series physiological signals in human-agent interaction during a dialogue has not been extensively investigated. This paper presents a machine learning model based on physiological signals to estimate a user's sentiment at every exchange during a dialogue. Using a wearable sensing device, the time series physiological data including the electrodermal activity (EDA) and heart rate in addition to acoustic and visual information during a dialogue were collected. The sentiment labels were annotated by the participants themselves and by external human coders for each exchange consisting of a pair of system and participant utterances. The experimental results showed that a multimodal deep neural network (DNN) model combined with the EDA and visual features achieved an accuracy of 63.2%. In general, this task is challenging, as indicated by the accuracy of 63.0% attained by the external coders. The analysis of the sentiment estimation results for each individual indicated that the human coders often wrongly estimated the negative sentiment labels, and in this case, the performance of the DNN model was higher than that of the human coders. These results indicate that physiological signals can help in detecting the implicit aspects of negative sentiments, which are acoustically/visually indistinguishable.
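The reported model fuses EDA-derived and visual features in a multimodal DNN. Below is a minimal late-fusion sketch in PyTorch; the feature dimensions, layer widths, and three-class output are assumptions, not the architecture evaluated in the paper.

```python
import torch
import torch.nn as nn

class LateFusionSentimentNet(nn.Module):
    """Two-branch network: one branch encodes per-exchange EDA features, the
    other visual features; embeddings are concatenated and classified into
    negative/neutral/positive sentiment. Sizes are illustrative assumptions."""
    def __init__(self, eda_dim=16, visual_dim=64, hidden=32, n_classes=3):
        super().__init__()
        self.eda_branch = nn.Sequential(nn.Linear(eda_dim, hidden), nn.ReLU())
        self.visual_branch = nn.Sequential(nn.Linear(visual_dim, hidden), nn.ReLU())
        self.classifier = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, n_classes))

    def forward(self, eda_feats, visual_feats):
        fused = torch.cat([self.eda_branch(eda_feats),
                           self.visual_branch(visual_feats)], dim=-1)
        return self.classifier(fused)

model = LateFusionSentimentNet()
logits = model(torch.randn(8, 16), torch.randn(8, 64))  # batch of 8 exchanges
print(logits.shape)  # torch.Size([8, 3])
```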
Citations: 13
International Workshop on Deep Video Understanding
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3419746
Keith Curtis, G. Awad, Shahzad Rajput, I. Soboroff
This is the introduction paper to the International Workshop on Deep Video Understanding, organized at the 22nd ACM International Conference on Multimodal Interaction. In recent years, a growing trend towards understanding videos (in particular movies) at a deeper level has started to motivate researchers working in multimedia and computer vision to present new approaches and datasets to tackle this problem. This is a challenging research area which aims to develop a deep understanding of the relations which exist between different individuals and entities in movies using all available modalities such as video, audio, text and metadata. The aim of this workshop is to foster innovative research in this new direction and to provide benchmarking evaluations to advance technologies in the deep video understanding community.
Citations: 2
Fifty Shades of Green: Towards a Robust Measure of Inter-annotator Agreement for Continuous Signals
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418860
Brandon M. Booth, Shrikanth S. Narayanan
Continuous human annotations of complex human experiences are essential for enabling psychological and machine-learned inquiry into the human mind, but establishing a reliable set of annotations for analysis and ground truth generation is difficult. Measures of consensus or agreement are often used to establish the reliability of a collection of annotations and thereby purport their suitability for further research and analysis. This work examines many of the commonly used agreement metrics for continuous-scale and continuous-time human annotations and demonstrates their shortcomings, especially in measuring agreement in general annotation shape and structure. Annotation quality is carefully examined in a controlled study where the true target signal is known and evidence is presented suggesting that annotators' perceptual distortions can be modeled using monotonic functions. A novel measure of agreement is proposed which is agnostic to these perceptual differences between annotators and provides unique information when assessing agreement. We illustrate how this measure complements existing agreement metrics and can serve as a tool for curating a reliable collection of human annotations based on differential consensus.
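The point about monotonic perceptual distortions can be made concrete with a small experiment: a rank-based statistic such as Spearman correlation is unchanged by any strictly increasing warp of an annotation, whereas Pearson correlation is not. The sketch below only illustrates this contrast; it does not reproduce the agreement measure proposed in the paper, and the warping functions are arbitrary choices.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

rng = np.random.default_rng(1)
target = np.cumsum(rng.normal(size=500))                 # smooth latent "true" signal
target = (target - target.min()) / (target.max() - target.min())

# Two annotators who track the same shape but warp it monotonically
annot_a = target ** 3 + rng.normal(0, 0.01, 500)         # convex perceptual warp
annot_b = target ** (1 / 3) + rng.normal(0, 0.01, 500)   # concave perceptual warp

# Expect Spearman close to 1.0 (rank order is preserved under monotone warps)
# while Pearson is noticeably lower.
print("Pearson :", round(pearsonr(annot_a, annot_b)[0], 3))
print("Spearman:", round(spearmanr(annot_a, annot_b)[0], 3))
```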
Citations: 4
Detection of Listener Uncertainty in Robot-Led Second Language Conversation Practice
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418873
Ronald Cumbal, José Lopes, Olov Engwall
Uncertainty is a frequently occurring affective state that learners experience during the acquisition of a second language. This state can constitute both a learning opportunity and a source of learner frustration. An appropriate detection could therefore benefit the learning process by reducing cognitive instability. In this study, we use a dyadic practice conversation between an adult second-language learner and a social robot to elicit events of uncertainty through the manipulation of the robot's spoken utterances (increased lexical complexity or prosody modifications). The characteristics of these events are then used to analyze multi-party practice conversations between a robot and two learners. Classification models are trained with multimodal features from annotated events of listener (un)certainty. We report the performance of our models on different settings, (sub)turn segments and multimodal inputs.
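The classification setup, multimodal features per (sub)turn segment fed to a supervised model, can be sketched as follows. The feature groups, their dimensionalities, and the random-forest classifier are placeholders chosen for illustration, not the models or features evaluated in the paper.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Stand-in multimodal features per (sub)turn segment: e.g. prosodic statistics,
# facial action unit activations, and gaze proportions. Shapes and labels are
# synthetic and serve only to show the training pipeline.
rng = np.random.default_rng(0)
n_segments = 200
prosody = rng.normal(size=(n_segments, 12))
face = rng.normal(size=(n_segments, 17))
gaze = rng.normal(size=(n_segments, 4))
X = np.hstack([prosody, face, gaze])          # early fusion by concatenation
y = rng.integers(0, 2, size=n_segments)       # 1 = uncertain, 0 = certain

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # chance-level on random data
```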
Citations: 5
The WoNoWa Dataset: Investigating the Transactive Memory System in Small Group Interactions
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3418843
Béatrice Biancardi, Lou Maisonnave-Couterou, Pierrick Renault, Brian Ravenet, M. Mancini, G. Varni
We present WoNoWa, a novel multi-modal dataset of small group interactions in collaborative tasks. The dataset is explicitly designed to elicit and to study over time a Transactive Memory System (TMS), a group's emergent state characterizing the group's meta-knowledge about "who knows what". A rich set of automatic features and manual annotations, extracted from the collected audio-visual data, is available on request for research purposes. Features include individual descriptors (e.g., position, Quantity of Motion, speech activity) and group descriptors (e.g., F-formations). Additionally, participants' self-assessments are available. Preliminary results from exploratory analyses show that the WoNoWa design allowed groups to develop a TMS that increased across the tasks. These results encourage the use of the WoNoWa dataset for a better understanding of the relationship between behavioural patterns and TMS, that in turn could help to improve group performance.
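Among the listed descriptors, Quantity of Motion is typically computed from frame-to-frame pixel change. The sketch below uses a plain pixel-difference variant; the threshold and this particular definition are assumptions for illustration, not the dataset's exact feature extraction.

```python
import numpy as np

def quantity_of_motion(frames, diff_threshold=15):
    """Per-frame Quantity of Motion as the fraction of pixels whose grayscale
    value changes by more than `diff_threshold` between consecutive frames.
    A common simple variant; the threshold and the pixel-difference definition
    (rather than a silhouette-based one) are illustrative assumptions."""
    frames = np.asarray(frames, dtype=np.int16)   # avoid uint8 wrap-around
    diffs = np.abs(np.diff(frames, axis=0))       # shape (T-1, H, W)
    moving = diffs > diff_threshold
    return moving.reshape(moving.shape[0], -1).mean(axis=1)

# Toy clip: 10 static frames, then 10 frames with a moving bright block
clip = np.zeros((20, 64, 64), dtype=np.uint8)
for t in range(10, 20):
    clip[t, 10 + t:20 + t, 10:20] = 255
print(quantity_of_motion(clip).round(3))
```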
Citations: 6
Multimodal Affect and Aesthetic Experience
Pub Date : 2020-10-21 DOI: 10.1145/3382507.3420055
Theodoros Kostoulas, Michal Muszynski, Theodora Chaspari, Panos Amelidis
The term 'aesthetic experience' corresponds to the inner state of a person exposed to form and content of artistic objects. Exploring certain aesthetic values of artistic objects, as well as interpreting the aesthetic experience of people when exposed to art can contribute towards understanding (a) art and (b) people's affective reactions to artwork. Focusing on different types of artistic content, such as movies, music, urban art and other artwork, the goal of this workshop is to enhance the interdisciplinary collaboration between affective computing and aesthetics researchers.
Citations: 4