
Proceedings of the 2020 International Conference on Multimodal Interaction: Latest Publications

Multimodal Groups' Analysis for Automated Cohesion Estimation
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3421153
Lucien Maman
Groups are attracting more and more scholars' attention. With the rise of Social Signal Processing (SSP), many studies grounded in Social Sciences and Psychology findings have focused on detecting and classifying groups' dynamics. Cohesion plays an important role in these dynamics and is one of the most studied emergent states, involving both group motions and goals. This PhD project aims to provide a computational model that addresses the multidimensionality of cohesion and captures its subtle dynamics. It will offer new opportunities to develop applications that enhance interactions among humans as well as between humans and machines.
Citations: 0
Advanced Multi-Instance Learning Method with Multi-features Engineering and Conservative Optimization for Engagement Intensity Prediction
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3417959
Jianming Wu, Bo Yang, Yanan Wang, Gen Hattori
This paper proposes an advanced multi-instance learning method with multi-features engineering and conservative optimization for engagement intensity prediction. It was applied to the EmotiW Challenge 2020 and the results demonstrated the proposed method's good performance. The task is to predict the engagement level when a subject-student is watching an educational video under a range of conditions and in various environments. As engagement intensity has a strong correlation with facial movements, upper-body posture movements and overall environmental movements in a given time interval, we extract and incorporate these motion features into a deep regression model consisting of layers with a combination of long short-term memory (LSTM), gated recurrent unit (GRU) and a fully connected layer. In order to precisely and robustly predict the engagement level in long videos with varied conditions such as darkness and complex backgrounds, a multi-features engineering function is used to extract synchronized multi-model features over a given period of time by considering both short-term and long-term dependencies. Based on these well-processed engineered multi-features, in the 1st training stage, we train and generate the best models covering all the model configurations to maximize validation accuracy. Furthermore, in the 2nd training stage, to avoid the overfitting problem attributable to the extremely small engagement dataset, we conduct conservative optimization by applying a single Bi-LSTM layer with only 16 units to minimize overfitting, and split the engagement dataset (train + validation) with 5-fold cross validation (stratified k-fold) to train a conservative model. The proposed method, using a decision-level ensemble of the two training stages' models, finally won second place in the challenge (MSE: 0.061110 on the test set).
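As an illustration of the two-stage setup described in this abstract (a sketch, not the authors' released code), the following Python/Keras snippet assumes pre-extracted per-window motion features of shape (samples, timesteps, features); apart from the stated single 16-unit Bi-LSTM layer and the 5-fold stratified split, all layer widths and training hyperparameters are placeholder assumptions.

```python
# Hypothetical sketch of the two-stage engagement-regression setup described above.
# Assumes X has shape (n_samples, n_timesteps, n_features) of pre-extracted motion
# features and y holds engagement-intensity labels; layer widths are illustrative.
import numpy as np
import tensorflow as tf
from sklearn.model_selection import StratifiedKFold


def stage1_model(n_timesteps, n_features):
    # Stage 1: LSTM + GRU + fully connected regression head, tuned for validation MSE.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_timesteps, n_features)),
        tf.keras.layers.LSTM(64, return_sequences=True),
        tf.keras.layers.GRU(32),
        tf.keras.layers.Dense(1, activation="linear"),
    ])


def stage2_model(n_timesteps, n_features):
    # Stage 2: conservative model, a single Bi-LSTM layer with only 16 units
    # to limit overfitting on the small engagement dataset.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_timesteps, n_features)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(16)),
        tf.keras.layers.Dense(1, activation="linear"),
    ])


def train_conservative(X, y, y_bins, n_timesteps, n_features):
    # 5-fold stratified CV on train + validation; y_bins are discretized labels
    # used only to stratify the folds.
    fold_models = []
    for train_idx, val_idx in StratifiedKFold(n_splits=5, shuffle=True,
                                              random_state=0).split(X, y_bins):
        model = stage2_model(n_timesteps, n_features)
        model.compile(optimizer="adam", loss="mse")
        model.fit(X[train_idx], y[train_idx],
                  validation_data=(X[val_idx], y[val_idx]),
                  epochs=30, batch_size=16, verbose=0)
        fold_models.append(model)
    return fold_models  # decision-level ensemble: average the folds' predictions
```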
Citations: 22
Detecting Depression in Less Than 10 Seconds: Impact of Speaking Time on Depression Detection Sensitivity
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3418875
Nujud Aloshban, A. Esposito, A. Vinciarelli
This article investigates whether it is possible to detect depression using less than 10 seconds of speech. The experiments involved 59 participants (including 29 who had been diagnosed with depression by a professional psychiatrist) and are based on a multimodal approach that jointly models linguistic (what people say) and acoustic (how people say it) aspects of speech using four different strategies for the fusion of multiple data streams. On average, every interview lasted 242.2 seconds, but the results show that 10 seconds or less are sufficient to achieve the same level of recall (roughly 70%) observed after using the entire interview of every participant. In other words, it is possible to maintain the same level of sensitivity (as recall is called in clinical settings) while reducing, on average, the amount of time required to collect the necessary data by 95%.
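Since the abstract does not specify the four fusion strategies, the sketch below illustrates just one plausible option, late (decision-level) fusion of the two modalities, assuming fixed-length linguistic and acoustic feature vectors already extracted from the first 10 seconds of each interview; the classifier choice and the split are arbitrary.

```python
# Illustrative late-fusion sketch (one possible fusion strategy; not the authors'
# implementation). X_ling and X_acou are NumPy feature matrices from the first
# 10 seconds of speech; y is a NumPy array of 0/1 depression labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split


def late_fusion_recall(X_ling, X_acou, y, seed=0):
    idx = np.arange(len(y))
    tr, te = train_test_split(idx, test_size=0.3, stratify=y, random_state=seed)

    # One classifier per modality, trained independently.
    clf_ling = LogisticRegression(max_iter=1000).fit(X_ling[tr], y[tr])
    clf_acou = LogisticRegression(max_iter=1000).fit(X_acou[tr], y[tr])

    # Late fusion: average the per-modality depression probabilities.
    p = (clf_ling.predict_proba(X_ling[te])[:, 1] +
         clf_acou.predict_proba(X_acou[te])[:, 1]) / 2.0
    y_pred = (p >= 0.5).astype(int)

    # Sensitivity = recall on the depressed (positive) class.
    return recall_score(y[te], y_pred)
```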
Citations: 4
Gesture Enhanced Comprehension of Ambiguous Human-to-Robot Instructions
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3418863
Dulanga Weerakoon, Vigneshwaran Subbaraju, Nipuni Karumpulli, Tuan Tran, Qianli Xu, U-Xuan Tan, Joo-Hwee Lim, Archan Misra
This work demonstrates the feasibility and benefits of using pointing gestures, a naturally-generated additional input modality, to improve the multi-modal comprehension accuracy of human instructions to robotic agents for collaborative tasks. We present M2Gestic, a system that combines neural-based text parsing with a novel knowledge-graph traversal mechanism, over a multi-modal input of vision, natural language text and pointing. Via multiple studies related to a benchmark table-top manipulation task, we show that (a) M2Gestic can achieve close-to-human performance in reasoning over unambiguous verbal instructions, and (b) incorporating pointing input (even with its inherent location uncertainty) in M2Gestic results in a significant (30%) accuracy improvement when verbal instructions are ambiguous.
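As a purely hypothetical illustration of how pointing can resolve an ambiguous verbal reference (not the M2Gestic pipeline, which uses neural parsing and knowledge-graph traversal), the toy sketch below scores description-matching candidates by their angular distance from an estimated pointing ray, with a Gaussian model of the pointing uncertainty.

```python
# Toy example: disambiguating "the red block" between two red blocks using a
# pointing direction; all names, positions and the 15-degree uncertainty are made up.
import math
from dataclasses import dataclass


@dataclass
class Obj:
    name: str
    color: str
    position: tuple  # (x, y) on the table plane


def angle_to(origin, direction_deg, target):
    # Smallest angular difference between the pointing ray and the object bearing.
    dx, dy = target[0] - origin[0], target[1] - origin[1]
    return abs((math.degrees(math.atan2(dy, dx)) - direction_deg + 180) % 360 - 180)


def resolve(candidates, parsed_color, point_origin, point_dir_deg, sigma_deg=15.0):
    # Keep objects matching the (possibly ambiguous) verbal description...
    matches = [o for o in candidates if parsed_color is None or o.color == parsed_color]
    # ...then weight them by agreement with the pointing gesture.
    scored = [(math.exp(-0.5 * (angle_to(point_origin, point_dir_deg, o.position)
                                / sigma_deg) ** 2), o) for o in matches]
    return max(scored, key=lambda s: s[0])[1] if scored else None


objects = [Obj("block_a", "red", (1.0, 0.2)), Obj("block_b", "red", (0.2, 1.0))]
print(resolve(objects, "red", (0.0, 0.0), 10.0).name)  # pointing breaks the tie -> block_a
```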
Citations: 6
Personalised Human Device Interaction through Context aware Augmented Reality
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3421157
Madhawa Perera
Human-device interactions in smart environments are shifting prominently towards naturalistic user interactions such as gaze and gesture. However, ambiguities arise when users have to switch interactions as contexts change. This can confuse users who are accustomed to a set of conventional controls, leading to system inefficiencies. My research explores how to reduce interaction ambiguity by semantically modelling user-specific interactions with context, enabling personalised interactions through AR. Sensory data captured from an AR device is utilised to interpret user interactions and context, which is then modelled in an extendable knowledge graph along with the user's interaction preferences using semantic web standards. These representations are utilised to bring semantics about the user's intent to interact with a particular device affordance into AR applications. Therefore, this research aims to bring semantic modelling of personalised gesture interactions to AR/VR applications for smart/immersive environments.
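A minimal sketch of the kind of modelling this abstract describes, using the rdflib library and a made-up ex: vocabulary (the actual ontology and graph structure are not given in the abstract): a user's context-specific gesture preference is stored as RDF triples and retrieved with SPARQL.

```python
# Illustrative sketch of representing a context-specific gesture preference in an
# RDF knowledge graph and querying it with semantic web standards. The vocabulary
# (ex:...) and the example individuals are hypothetical.
from rdflib import Graph, Namespace

EX = Namespace("http://example.org/interaction#")
g = Graph()
g.bind("ex", EX)

# "In the living-room context, Alice turns the light switch on with a pinch gesture."
g.add((EX.alice, EX.hasPreference, EX.pref1))
g.add((EX.pref1, EX.inContext, EX.livingRoom))
g.add((EX.pref1, EX.targetAffordance, EX.lightSwitch))
g.add((EX.pref1, EX.usesGesture, EX.pinch))

# An AR client can then ask: which gesture does Alice use for the light switch here?
q = """
SELECT ?gesture WHERE {
  ex:alice ex:hasPreference ?p .
  ?p ex:inContext ex:livingRoom ;
     ex:targetAffordance ex:lightSwitch ;
     ex:usesGesture ?gesture .
}
"""
for row in g.query(q, initNs={"ex": EX}):
    print(row.gesture)  # -> http://example.org/interaction#pinch
```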
Citations: 1
SmellControl: The Study of Sense of Agency in Smell
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3418810
Patricia Ivette Cornelio Martinez, E. Maggioni, Giada Brianza, S. Subramanian, Marianna Obrist
The Sense of Agency (SoA) is crucial in interaction with technology: it refers to the feeling of 'I did that', as opposed to 'the system did that', supporting a feeling of being in control. Research in human-computer interaction has recently studied agency in visual, auditory and haptic interfaces; however, the role of smell in agency remains unknown. Our sense of smell is quite powerful in eliciting emotions, memories and awareness of the environment, which has been exploited to enhance user experiences (e.g., in VR and driving scenarios). In light of increased interest in designing multimodal interfaces including smell, and its close link with emotions, we investigated, for the first time, the effect of smell-induced emotions on the SoA. We conducted a study using the Intentional Binding (IB) paradigm, used to measure SoA, while participants were exposed to three scents with different valence (pleasant, unpleasant, neutral). Our results show that participants' SoA increased with a pleasant scent compared to neutral and unpleasant scents. We discuss how our results can inform the design of multimodal and future olfactory interfaces.
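For readers unfamiliar with the paradigm, the snippet below shows one common way an Intentional Binding effect is quantified, as the compression of judged action-outcome intervals relative to the real intervals; the numbers are hypothetical and this is not the authors' analysis pipeline.

```python
# Minimal sketch of an Intentional Binding (IB) score: a more negative mean
# estimation error in the voluntary condition indicates a stronger Sense of Agency.
import numpy as np


def binding_score(judged_ms, actual_ms):
    # Mean interval-estimation error; negative = perceived compression (binding).
    return float(np.mean(np.asarray(judged_ms) - np.asarray(actual_ms)))


# Hypothetical per-trial interval judgements (ms) for one participant and one scent.
actual = [250, 400, 550, 250, 400, 550]
voluntary = [190, 330, 480, 200, 340, 470]   # self-initiated key press -> tone
baseline = [245, 405, 545, 255, 395, 560]    # involuntary / passive control trials

print(binding_score(voluntary, actual))  # about -65 ms: strong binding
print(binding_score(baseline, actual))   # close to 0 ms: little binding
```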
Citations: 7
Force9: Force-assisted Miniature Keyboard on Smart Wearables
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3418827
Lik-Hang Lee, Ngo Yan Yeung, Tristan Braud, Tong Li, Xiang Su, Pan Hui
Smartwatches and other wearables are characterized by small-scale touchscreens that complicate the interaction with content. In this paper, we present Force9, the first optimized miniature keyboard leveraging force-sensitive touchscreens on wrist-worn computers. Force9 enables character selection in an ambiguous layout by analyzing the trade-off between interaction space and the easiness of force-assisted interaction. We argue that dividing the screen's pressure range into three contiguous force levels is sufficient to differentiate characters for fast and accurate text input. Our pilot study captures and calibrates the ability of users to perform force-assisted touches on miniature-sized keys on touchscreen devices. We then optimize the keyboard layout considering the goodness of character pairs (with regards to the selected English corpus) under the force-based configuration and the users' familiarity with the QWERTY layout. We finally evaluate the performance of the trimetric optimized Force9 layout, and achieve an average of 10.18 WPM by the end of the final session. Compared to the other state-of-the-art approaches, Force9 allows for single-gesture character selection without addendum sensors.
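The core force-level idea can be sketched as follows; the three pressure boundaries and the ambiguous character grouping shown here are illustrative assumptions, not the optimized Force9 layout derived in the paper.

```python
# Hypothetical sketch: one ambiguous key hosts up to three characters, and the
# normalised pressure range [0, 1] is split into three contiguous force levels
# that select among them. Thresholds and layout are made-up calibration values.
FORCE_LEVELS = [(0.00, 0.33), (0.33, 0.66), (0.66, 1.01)]  # light, medium, firm

# One possible ambiguous grouping (NOT the corpus-optimized Force9 layout).
KEY_LAYOUT = {
    0: "abc", 1: "def", 2: "ghi",
    3: "jkl", 4: "mno", 5: "pqr",
    6: "stu", 7: "vwx", 8: "yz ",
}


def force_level(pressure: float) -> int:
    for level, (lo, hi) in enumerate(FORCE_LEVELS):
        if lo <= pressure < hi:
            return level
    raise ValueError("pressure must be in [0, 1]")


def select_char(key_id: int, pressure: float) -> str:
    chars = KEY_LAYOUT[key_id]
    return chars[min(force_level(pressure), len(chars) - 1)]


print(select_char(4, 0.20))  # light press on key 4  -> 'm'
print(select_char(4, 0.50))  # medium press on key 4 -> 'n'
print(select_char(4, 0.90))  # firm press on key 4   -> 'o'
```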
Citations: 3
Multisensory Approaches to Human-Food Interaction
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3419749
Carlos Velasco, A. Nijholt, C. Spence, Takuji Narumi, Kosuke Motoki, Gijs Huisman, Marianna Obrist
Here, we present the outcome of the 4th workshop on Multisensory Approaches to Human-Food Interaction (MHFI), developed in collaboration with ICMI 2020 in Utrecht, The Netherlands. Capitalizing on the increasing interest in multisensory aspects of human-food interaction and the unique contribution that our community offers, we developed a space to discuss ideas ranging from mechanisms of multisensory food perception, through multisensory technologies, to new applications of systems in the context of MHFI. In all, the workshop involved 11 contributions, which will hopefully further help shape the basis of a field of inquiry that grows as we see progress in our understanding of the senses and the development of new technologies in the context of food.
Citations: 3
Recognizing Emotion in the Wild using Multimodal Data
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3417970
Shivam Srivastava, Saandeep Aathreya Sidhapur Lakshminarayan, Saurabh Hinduja, Sk Rahatul Jannat, Hamza Elhamdadi, Shaun J. Canavan
In this work, we present our approach for all four tracks of the eighth Emotion Recognition in the Wild Challenge (EmotiW 2020). The four tasks are group emotion recognition, driver gaze prediction, predicting engagement in the wild, and emotion recognition using physiological signals. We explore multiple approaches, including classical machine learning tools such as random forests, state-of-the-art deep neural networks, and multiple fusion and ensemble-based approaches. We also show that similar approaches can be used across tracks, as many of the features generalize well to the different problems (e.g. facial features). We detail evaluation results that are either comparable to or outperform the baseline results for both validation and testing on most of the tracks.
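As a rough sketch of the kind of fusion-plus-ensemble pipeline mentioned above (illustrative, not the authors' exact configuration), per-modality feature vectors could be concatenated and fed to a soft-voting ensemble of a random forest and a small neural network:

```python
# Minimal feature-fusion + ensemble sketch; feature names, widths and the voting
# scheme are placeholder assumptions for illustration only.
import numpy as np
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def early_fusion(feats_by_modality):
    # Early (feature-level) fusion: concatenate per-modality feature matrices.
    return np.concatenate(feats_by_modality, axis=1)


def build_ensemble():
    rf = RandomForestClassifier(n_estimators=200, random_state=0)
    mlp = make_pipeline(StandardScaler(),
                        MLPClassifier(hidden_layer_sizes=(64,), max_iter=500,
                                      random_state=0))
    # Soft voting averages the two models' class probabilities.
    return VotingClassifier(estimators=[("rf", rf), ("mlp", mlp)], voting="soft")


# Usage with hypothetical pre-extracted features:
# X = early_fusion([face_feats, physio_feats]); clf = build_ensemble().fit(X, y)
```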
Citations: 5
Preserving Privacy in Image-based Emotion Recognition through User Anonymization
Pub Date: 2020-10-21 DOI: 10.1145/3382507.3418833
Vansh Narula, Kexin Feng, Theodora Chaspari
The large amount of data captured by ambulatory sensing devices can afford us insights into longitudinal behavioral patterns, which can be linked to emotional, psychological, and cognitive outcomes. Yet, the sensitivity of behavioral data, which regularly involve speech signals and facial images, can cause strong privacy concerns, such as the leaking of the user identity. We examine the interplay between emotion-specific and user identity-specific information in image-based emotion recognition systems. We further study a user anonymization approach that preserves emotion-specific information but eliminates user-dependent information from the convolutional kernel of convolutional neural networks (CNN), therefore reducing user re-identification risks. We formulate an adversarial learning problem, implemented with a multitask CNN, that minimizes the emotion classification loss and maximizes the user identification loss. The proposed system is evaluated on three datasets, achieving moderate-to-high emotion recognition and poor user identity recognition performance. The resulting image transformation obtained by the convolutional layer is visually inspected, attesting to the efficacy of the proposed system in preserving emotion-specific information. Implications from this study can inform the design of privacy-aware emotion recognition systems that preserve facets of human behavior while concealing the identity of the user, and can be used in ambulatory monitoring applications related to health, well-being, and education.
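A hedged sketch of this kind of adversarial multitask setup (not the authors' architecture): a shared convolutional encoder feeds an emotion head trained normally and a user-identification head behind a gradient-reversal layer, so the shared representation is pushed to maximize the identification loss while the emotion loss is minimized. Input is assumed to be single-channel face crops; all layer sizes are placeholders.

```python
# Sketch of an adversarial multitask CNN with a gradient-reversal layer.
import torch
import torch.nn as nn


class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Flip (and scale) the gradient flowing back into the shared encoder.
        return -ctx.lamb * grad_output, None


class AnonymizingEmotionNet(nn.Module):
    def __init__(self, n_emotions, n_users, lamb=1.0):
        super().__init__()
        self.lamb = lamb
        self.encoder = nn.Sequential(              # shared convolutional kernel
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
            nn.Flatten(),
        )
        self.emotion_head = nn.Linear(32 * 4 * 4, n_emotions)
        self.user_head = nn.Linear(32 * 4 * 4, n_users)

    def forward(self, x):
        z = self.encoder(x)
        emo = self.emotion_head(z)
        usr = self.user_head(GradReverse.apply(z, self.lamb))
        return emo, usr


# Training step (sketch): both heads use cross-entropy; the reversal layer makes the
# encoder maximize the user-identification loss while minimizing the emotion loss:
# loss = ce(emo, emotion_labels) + ce(usr, user_labels); loss.backward()
```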
Citations: 5