
Proceedings of the 2015 ACM on International Conference on Multimodal Interaction: Latest Publications

Multimodal Assessment of Teaching Behavior in Immersive Rehearsal Environment-TeachLivE
R. Barmaki
Nonverbal behaviors such as facial expressions, eye contact, gestures, and body movements strongly shape communicative interactions. Gestures in particular play an important role in interpersonal communication between student and teacher in the classroom. To assist teachers with exhibiting open and positive nonverbal signals in their actual classrooms, we have designed a multimodal teaching application that provides real-time feedback in coordination with our TeachLivE test-bed environment and its reflective application, ReflectLivE. Individuals walk into this virtual environment and interact with five virtual students shown on a large screen display. The current study is designed around two settings (7 minutes each). In each setting, participants are provided with lesson plans from which they teach. All participants take part in both settings, with half receiving automated real-time feedback about their body poses in the first session (group 1) and the other half receiving such feedback in the second session (group 2). Feedback takes the form of a visual indication each time the participant exhibits a closed stance. To create this automated feedback application, a closed-posture corpus was collected from existing TeachLivE teaching records and used for training. After each session, participants complete a post-questionnaire about their experience. We hypothesize that visual feedback improves positive body gestures for both groups during their feedback session, and that, for group 1, this improvement persists into their second, unaided session, whereas for group 2 improvements occur only during the second session.
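The abstract does not describe how a closed stance is detected; purely as a rough illustration, the sketch below flags one common closed pose (wrists crossed in front of the torso) from skeleton-tracker-style (e.g., Kinect) 3D joint positions. The joint layout, thresholds, and coordinate convention are assumptions for the example, not details from the paper.

```python
import numpy as np

def is_closed_stance(joints, margin=0.05):
    """Flag a 'closed' pose when both wrists cross the body midline in front
    of the torso. Joint names and thresholds are illustrative assumptions."""
    spine = np.asarray(joints["spine_mid"])
    left_wrist = np.asarray(joints["wrist_left"])
    right_wrist = np.asarray(joints["wrist_right"])

    # Wrists crossed: left wrist to the right of the spine, right wrist to the left.
    crossed = (left_wrist[0] > spine[0] + margin) and (right_wrist[0] < spine[0] - margin)
    # Both wrists held in front of the torso (smaller z = closer to the sensor here).
    in_front = (left_wrist[2] < spine[2]) and (right_wrist[2] < spine[2])
    return crossed and in_front

# Example frame with made-up coordinates (meters).
frame = {
    "spine_mid":   (0.00, 0.9, 2.0),
    "wrist_left":  (0.12, 1.0, 1.8),
    "wrist_right": (-0.10, 1.0, 1.8),
}
if is_closed_stance(frame):
    print("closed stance detected -> show visual feedback indicator")
```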
{"title":"Multimodal Assessment of Teaching Behavior in Immersive Rehearsal Environment-TeachLivE","authors":"R. Barmaki","doi":"10.1145/2818346.2823306","DOIUrl":"https://doi.org/10.1145/2818346.2823306","url":null,"abstract":"Nonverbal behaviors such as facial expressions, eye contact, gestures, and body movements in general have strong impacts on the process of communicative interactions. Gestures play an important role in interpersonal communication in the classroom between student and teacher. To assist teachers with exhibiting open and positive nonverbal signals in their actual classroom, we have designed a multimodal teaching application with provisions for real-time feedback in coordination with our TeachLivE test-bed environment and its reflective application; ReflectLivE. Individuals walk into this virtual environment and interact with five virtual students shown on a large screen display. The recent research study is designed to have two settings (7-minute long each). In each of the settings, the participants are provided lesson plans from which they teach. All the participants are asked to take part in both settings, with half receiving automated real-time feedback about their body poses in the first session (group 1) and the other half receiving such feedback in the second session (group 2). Feedback is in the form of a visual indication each time the participant exhibits a closed stance. To create this automated feedback application, a closed posture corpus was collected and trained based on the existing TeachLivE teaching records. After each session, the participants take a post-questionnaire about their experience. We hypothesize that visual feedback improves positive body gestures for both groups during the feedback session, and that, for group 2, this persists into their second unaided session but, for group 1, improvements occur only during the second session.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"13 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72720552","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
Micro-opinion Sentiment Intensity Analysis and Summarization in Online Videos
Amir Zadeh
There has been substantial progress in the field of text-based sentiment analysis, but little effort has been made to incorporate other modalities. Previous work in sentiment analysis has shown that using multimodal data yields more accurate models of sentiment. Efforts have been made towards expressing sentiment as a spectrum of intensity rather than just positive or negative. Such models are useful not only for detecting positivity or negativity, but also for scoring how positive or negative a statement is. Even in state-of-the-art studies of sentiment analysis, prediction of sentiment scores remains far from accurate, even on large datasets [27]. Another challenge in sentiment analysis is dealing with short segments, or micro-opinions, which carry less context than longer segments and are therefore harder to analyze. This paper presents a Ph.D. thesis aimed at a comprehensive study of multimodal micro-opinion sentiment intensity analysis.
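To make the idea of sentiment as an intensity score concrete, here is a minimal regression sketch over placeholder "fused multimodal" feature vectors; the features, labels, and model choice are illustrative assumptions and not the method proposed in the thesis.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_absolute_error

rng = np.random.default_rng(0)

# Placeholder fused features per opinion segment (standing in for text, acoustic,
# and facial statistics concatenated into one vector).
X = rng.normal(size=(200, 20))
# Placeholder sentiment intensity labels in [-3, 3], spectrum-style rather than binary.
y = np.clip(X[:, :3].sum(axis=1) + rng.normal(scale=0.5, size=200), -3, 3)

X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]
model = Ridge(alpha=1.0).fit(X_train, y_train)
print("MAE on held-out segments:", mean_absolute_error(y_test, model.predict(X_test)))
```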
{"title":"Micro-opinion Sentiment Intensity Analysis and Summarization in Online Videos","authors":"Amir Zadeh","doi":"10.1145/2818346.2823317","DOIUrl":"https://doi.org/10.1145/2818346.2823317","url":null,"abstract":"There has been substantial progress in the field of text based sentiment analysis but little effort has been made to incorporate other modalities. Previous work in sentiment analysis has shown that using multimodal data yields to more accurate models of sentiment. Efforts have been made towards expressing sentiment as a spectrum of intensity rather than just positive or negative. Such models are useful not only for detection of positivity or negativity, but also giving out a score of how positive or negative a statement is. Based on the state of the art studies in sentiment analysis, prediction in terms of sentiment score is still far from accurate, even in large datasets [27]. Another challenge in sentiment analysis is dealing with small segments or micro opinions as they carry less context than large segments thus making analysis of the sentiment harder. This paper presents a Ph.D. thesis shaped towards comprehensive studies in multimodal micro-opinion sentiment intensity analysis.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"43 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74035397","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 26
Exploiting Multimodal Affect and Semantics to Identify Politically Persuasive Web Videos
Behjat Siddiquie, Dave Chisholm, Ajay Divakaran
We introduce the task of automatically classifying politically persuasive web videos and propose a highly effective multimodal approach for this task. We extract audio, visual, and textual features that attempt to capture affect and semantics in the audio-visual content and sentiment in the viewers' comments. We demonstrate that each of the feature modalities can be used to classify politically persuasive content, and that fusing them leads to the best performance. We also perform experiments to examine human accuracy and inter-coder reliability for this task and show that our best automatic classifier slightly outperforms average human performance. Finally, we show that politically persuasive videos generate more strongly negative viewer comments than non-persuasive videos and analyze how affective content can be used to predict viewer reactions.
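The abstract reports that fusing the modalities works best but does not include code; the sketch below shows one generic late-fusion scheme (averaging per-modality classifier probabilities) on random placeholder features. The feature dimensions, fusion rule, and classifier are assumptions, not the authors' pipeline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
n = 300
# Placeholder per-video feature blocks; the real affect/semantic/comment-sentiment
# features would replace these random arrays.
feats = {
    "audio": rng.normal(size=(n, 32)),
    "visual": rng.normal(size=(n, 64)),
    "text": rng.normal(size=(n, 16)),
}
y = rng.integers(0, 2, size=n)          # 1 = politically persuasive, 0 = not
train, test = slice(0, 240), slice(240, n)

# Late fusion: one classifier per modality, average the predicted probabilities.
probs = []
for name, X in feats.items():
    clf = LogisticRegression(max_iter=1000).fit(X[train], y[train])
    probs.append(clf.predict_proba(X[test])[:, 1])
fused = np.mean(probs, axis=0)
accuracy = np.mean((fused > 0.5).astype(int) == y[test])
print(f"late-fusion accuracy on placeholder data: {accuracy:.2f}")
```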
{"title":"Exploiting Multimodal Affect and Semantics to Identify Politically Persuasive Web Videos","authors":"Behjat Siddiquie, Dave Chisholm, Ajay Divakaran","doi":"10.1145/2818346.2820732","DOIUrl":"https://doi.org/10.1145/2818346.2820732","url":null,"abstract":"We introduce the task of automatically classifying politically persuasive web videos and propose a highly effective multi-modal approach for this task. We extract audio, visual, and textual features that attempt to capture affect and semantics in the audio-visual content and sentiment in the viewers' comments. We demonstrate that each of the feature modalities can be used to classify politically persuasive content, and that fusing them leads to the best performance. We also perform experiments to examine human accuracy and inter-coder reliability for this task and show that our best automatic classifier slightly outperforms average human performance. Finally we show that politically persuasive videos generate more strongly negative viewer comments than non-persuasive videos and analyze how affective content can be used to predict viewer reactions.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"505 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77345738","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 37
A Distributed Architecture for Interacting with NAO
Fabien Badeig, Quentin Pelorson, S. Arias, Vincent Drouard, I. D. Gebru, Xiaofei Li, Georgios D. Evangelidis, R. Horaud
One of the main applications of the humanoid robot NAO - a small robot companion - is human-robot interaction (HRI). NAO is particularly well suited for HRI applications because of its design, hardware specifications, programming capabilities, and affordable cost. Indeed, NAO can stand up, walk, wander, dance, play soccer, sit down, recognize and grasp simple objects, detect and identify people, localize sounds, understand some spoken words, engage in simple and goal-directed dialogs, and synthesize speech. This is made possible by the robot's 24-degree-of-freedom articulated structure (body, legs, feet, arms, hands, head, etc.), its motors, cameras, and microphones, as well as its on-board computing hardware and embedded software, e.g., for robot motion control. Nevertheless, the current NAO configuration has two drawbacks that restrict the complexity of the interactive behaviors that could potentially be implemented. Firstly, the on-board computing resources are inherently limited, which makes it difficult to implement the sophisticated computer vision and audio signal analysis algorithms required by advanced interactive tasks. Secondly, programming new robot functionalities currently implies developing embedded software, a difficult task in its own right that requires specialized knowledge. The vast majority of HRI practitioners may not have this kind of expertise and hence cannot easily and quickly implement their ideas, carry out thorough experimental validations, and design proof-of-concept demonstrators. We have developed a distributed software architecture that attempts to overcome these two limitations. Broadly speaking, NAO's on-board computing resources are augmented with external computing resources. The latter is a computer platform with its own CPUs, GPUs, memory, operating system, libraries, software packages, internet access, etc. This configuration enables easy and fast development in Matlab, C, C++, or Python. Moreover, it allows the user to combine on-board libraries (motion control, face detection, etc.) with external toolboxes such as OpenCV.
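As a minimal sketch of the off-board pattern the abstract describes (heavy processing on an external workstation, commands sent to the robot over the network), the snippet below calls NAOqi proxies from a remote machine. The IP address is a placeholder, the task split is an assumption rather than the authors' architecture, and the proxy and method names should be checked against the installed NAOqi Python SDK version.

```python
# Sketch: run decision logic on an external workstation and send only commands
# to NAO through NAOqi proxies. IP/port are placeholders; method names should be
# verified against your NAOqi Python SDK.
from naoqi import ALProxy

NAO_IP, NAO_PORT = "192.168.1.10", 9559   # placeholder robot address

tts = ALProxy("ALTextToSpeech", NAO_IP, NAO_PORT)
motion = ALProxy("ALMotion", NAO_IP, NAO_PORT)

def greet_and_turn(angle_rad=0.5):
    """Hypothetical behavior: the expensive perception (vision/audio analysis)
    would run here on the workstation; only its outcome reaches the robot."""
    tts.say("Hello, I can see you.")
    motion.wakeUp()                      # stiffen joints before moving
    motion.moveTo(0.0, 0.0, angle_rad)   # turn in place toward the detected person

if __name__ == "__main__":
    greet_and_turn()
```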
{"title":"A Distributed Architecture for Interacting with NAO","authors":"Fabien Badeig, Quentin Pelorson, S. Arias, Vincent Drouard, I. D. Gebru, Xiaofei Li, Georgios D. Evangelidis, R. Horaud","doi":"10.1145/2818346.2823303","DOIUrl":"https://doi.org/10.1145/2818346.2823303","url":null,"abstract":"One of the main applications of the humanoid robot NAO - a small robot companion - is human-robot interaction (HRI). NAO is particularly well suited for HRI applications because of its design, hardware specifications, programming capabilities, and affordable cost. Indeed, NAO can stand up, walk, wander, dance, play soccer, sit down, recognize and grasp simple objects, detect and identify people, localize sounds, understand some spoken words, engage itself in simple and goal-directed dialogs, and synthesize speech. This is made possible due to the robot's 24 degree-of-freedom articulated structure (body, legs, feet, arms, hands, head, etc.), motors, cameras, microphones, etc., as well as to its on-board computing hardware and embedded software, e.g., robot motion control. Nevertheless, the current NAO configuration has two drawbacks that restrict the complexity of interactive behaviors that could potentially be implemented. Firstly, the on-board computing resources are inherently limited, which implies that it is difficult to implement sophisticated computer vision and audio signal analysis algorithms required by advanced interactive tasks. Secondly, programming new robot functionalities currently implies the development of embedded software, which is a difficult task in its own right necessitating specialized knowledge. The vast majority of HRI practitioners may not have this kind of expertise and hence they cannot easily and quickly implement their ideas, carry out thorough experimental validations, and design proof-of-concept demonstrators. We have developed a distributed software architecture that attempts to overcome these two limitations. Broadly speaking, NAO's on-board computing resources are augmented with external computing resources. The latter is a computer platform with its CPUs, GPUs, memory, operating system, libraries, software packages, internet access, etc. This configuration enables easy and fast development in Matlab, C, C++, or Python. Moreover, it allows the user to combine on-board libraries (motion control, face detection, etc.) with external toolboxes, e.g., OpenCv.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"74 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76388108","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Challenges in Deep Learning for Multimodal Applications
Sayan Ghosh
This consortium paper outlines a research plan for investigating deep learning techniques as applied to multimodal multi-task learning and multimodal fusion. We discuss our prior research results in this area and how these results motivate us to explore further in this direction. We also define concrete steps of enquiry we wish to undertake as a short-term goal, and outline some further challenges of multimodal learning with deep neural networks, such as inter- and intra-modality synchronization, robustness to noise in modality data acquisition, and data insufficiency.
{"title":"Challenges in Deep Learning for Multimodal Applications","authors":"Sayan Ghosh","doi":"10.1145/2818346.2823313","DOIUrl":"https://doi.org/10.1145/2818346.2823313","url":null,"abstract":"This consortium paper outlines a research plan for investigating deep learning techniques as applied to multimodal multi-task learning and multimodal fusion. We discuss our prior research results in this area, and how these results motivate us to explore more in this direction. We also define concrete steps of enquiry we wish to undertake as a short-term goal, and further outline some other challenges of multimodal learning using deep neural networks, such as inter and intra-modality synchronization, robustness to noise in modality data acquisition, and data insufficiency.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"36 6 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"77677328","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 7
Deciphering the Silent Participant: On the Use of Audio-Visual Cues for the Classification of Listener Categories in Group Discussions
Catharine Oertel, Kenneth Alberto Funes Mora, Joakim Gustafson, J. Odobez
Estimating a silent participant's degree of engagement and their role within a group discussion can be challenging, as no speech-related cues are available at the given time. Having this information, however, can provide important insights into the dynamics of the group as a whole. In this paper, we study the classification of listeners into several categories (attentive listener, side participant, and bystander). We devised a thin-sliced perception test in which subjects were asked to assess listener roles and engagement levels in 15-second video clips taken from a corpus of group interviews. Results show that humans are usually able to assess silent participant roles. Using these annotations together with a set of multimodal low-level features, such as past speaking activity, backchannels (both visual and verbal), and gaze patterns, we could identify the features that distinguish between the different listener categories. Moreover, the results show that many of the audio-visual effects observed on listeners in dyadic interactions also hold for multi-party interactions. A preliminary classifier achieves an accuracy of 64%.
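As a toy illustration of classifying listener categories from the kinds of low-level cues the abstract names, the sketch below trains a generic classifier on placeholder features; the feature definitions, labels, and model are assumptions, not the paper's preliminary classifier.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)
n = 240
# Placeholder stand-ins for the cues described in the paper: past speaking time,
# verbal/visual backchannel counts, and proportion of gaze directed at the speaker.
X = np.column_stack([
    rng.uniform(0, 1, n),    # fraction of prior time spent speaking
    rng.poisson(2, n),       # verbal backchannels in the window
    rng.poisson(3, n),       # visual backchannels (nods) in the window
    rng.uniform(0, 1, n),    # fraction of gaze directed at the speaker
])
# 0 = attentive listener, 1 = side participant, 2 = bystander (labels are illustrative)
y = rng.integers(0, 3, size=n)

clf = RandomForestClassifier(n_estimators=200, random_state=0)
print("cross-validated accuracy on placeholder data:",
      cross_val_score(clf, X, y, cv=5).mean())
```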
{"title":"Deciphering the Silent Participant: On the Use of Audio-Visual Cues for the Classification of Listener Categories in Group Discussions","authors":"Catharine Oertel, Kenneth Alberto Funes Mora, Joakim Gustafson, J. Odobez","doi":"10.1145/2818346.2820759","DOIUrl":"https://doi.org/10.1145/2818346.2820759","url":null,"abstract":"Estimating a silent participant's degree of engagement and his role within a group discussion can be challenging, as there are no speech related cues available at the given time. Having this information available, however, can provide important insights into the dynamics of the group as a whole. In this paper, we study the classification of listeners into several categories (attentive listener, side participant and bystander). We devised a thin-sliced perception test where subjects were asked to assess listener roles and engagement levels in 15-second video-clips taken from a corpus of group interviews. Results show that humans are usually able to assess silent participant roles. Using the annotation to identify from a set of multimodal low-level features, such as past speaking activity, backchannels (both visual and verbal), as well as gaze patterns, we could identify the features which are able to distinguish between different listener categories. Moreover, the results show that many of the audio-visual effects observed on listeners in dyadic interactions, also hold for multi-party interactions. A preliminary classifier achieves an accuracy of 64 %.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"17 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86895940","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 23
Social Touch Gesture Recognition using Random Forest and Boosting on Distinct Feature Sets
Y. F. A. Gaus, Temitayo A. Olugbade, Asim Jan, R. Qin, Jingxin Liu, Fan Zhang, H. Meng, N. Bianchi-Berthouze
Touch is a primary nonverbal communication channel used to convey emotions and other social messages. Despite its importance, this channel remains little explored in the affective computing field, where much more focus has been placed on the visual and aural channels. In this paper, we investigate the possibility of automatically discriminating between different social touch types. We propose five distinct feature sets for describing touch behaviours captured by a grid of pressure sensors. These features are then combined using the Random Forest and Boosting methods to categorize the touch gesture type. The proposed methods were evaluated on both the HAART (7 gesture types over different surfaces) and the CoST (14 gesture types over the same surface) datasets made available by the Social Touch Gesture Challenge 2015. Performance well above chance level was achieved, with accuracies of 67% on the HAART and 59% on the CoST testing datasets, respectively.
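The sketch below illustrates the general recipe the abstract names (hand-crafted statistics over pressure-sensor frames, fed to Random Forest and Boosting classifiers) on synthetic data; the specific statistics, grid size, and sequence length are stand-ins, far simpler than the paper's five feature sets.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(3)

def describe_sequence(frames):
    """Simple per-sequence statistics over an (n_frames, 8, 8) pressure grid.
    Illustrative only; the paper's feature sets are richer than this."""
    total = frames.sum(axis=(1, 2))               # overall pressure per frame
    active = (frames > 0.1).sum(axis=(1, 2))      # contact area per frame
    return np.array([total.mean(), total.std(), total.max(),
                     active.mean(), active.std(), frames.max()])

# Placeholder data: 280 sequences of 50 frames on an 8x8 sensor grid, 14 gesture classes.
X = np.array([describe_sequence(rng.random((50, 8, 8))) for _ in range(280)])
y = rng.integers(0, 14, size=280)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for model in (RandomForestClassifier(n_estimators=300, random_state=0),
              GradientBoostingClassifier(random_state=0)):
    acc = model.fit(X_tr, y_tr).score(X_te, y_te)
    print(type(model).__name__, "accuracy on placeholder data:", round(acc, 2))
```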
{"title":"Social Touch Gesture Recognition using Random Forest and Boosting on Distinct Feature Sets","authors":"Y. F. A. Gaus, Temitayo A. Olugbade, Asim Jan, R. Qin, Jingxin Liu, Fan Zhang, H. Meng, N. Bianchi-Berthouze","doi":"10.1145/2818346.2830599","DOIUrl":"https://doi.org/10.1145/2818346.2830599","url":null,"abstract":"Touch is a primary nonverbal communication channel used to communicate emotions or other social messages. Despite its importance, this channel is still very little explored in the affective computing field, as much more focus has been placed on visual and aural channels. In this paper, we investigate the possibility to automatically discriminate between different social touch types. We propose five distinct feature sets for describing touch behaviours captured by a grid of pressure sensors. These features are then combined together by using the Random Forest and Boosting methods for categorizing the touch gesture type. The proposed methods were evaluated on both the HAART (7 gesture types over different surfaces) and the CoST (14 gesture types over the same surface) datasets made available by the Social Touch Gesture Challenge 2015. Well above chance level performances were achieved with a 67% accuracy for the HAART and 59% for the CoST testing datasets respectively.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"93 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83207166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 27
Session details: Doctoral Consortium
C. Busso
{"title":"Session details: Doctoral Consortium","authors":"C. Busso","doi":"10.1145/3252454","DOIUrl":"https://doi.org/10.1145/3252454","url":null,"abstract":"","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"28 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"89199725","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Multimodal System for Public Speaking with Real Time Feedback
F. Dermody, Alistair Sutherland
We have developed a multimodal prototype for public speaking with real-time feedback using the Microsoft Kinect. Effective speaking involves the use of gesture, facial expression, posture, and voice, as well as the spoken word. These modalities combine to give the appearance of self-confidence in the speaker. This initial prototype detects body pose, facial expressions, and voice. Visual and text feedback is displayed to the user in real time using a video panel, an icon panel, and a text feedback panel. The user can also set and view the elapsed time during their speaking performance. Real-time feedback is displayed for gaze direction, body pose and gesture, vocal tonality, vocal dysfluencies, and speaking rate.
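The abstract lists the cues the prototype monitors but not its feedback rules; the loop below is a toy sketch of how per-window features might be mapped to on-screen prompts. The feature names, thresholds, and the fake feature stream are all invented for illustration and are not the prototype's logic.

```python
import time

# Illustrative thresholds only; the prototype's actual rules and Kinect
# integration are not given in the abstract.
def feedback_messages(features):
    msgs = []
    if features["words_per_min"] > 180:
        msgs.append("Slow down")
    if features["gaze_at_audience"] < 0.6:
        msgs.append("Look at the audience")
    if features["filled_pauses_per_min"] > 6:
        msgs.append("Watch the 'um's and 'uh's")
    return msgs

def run_session(feature_stream, duration_s=10):
    start = time.time()
    for features in feature_stream:          # one feature dict per analysis window
        for msg in feedback_messages(features):
            print(f"[{time.time() - start:5.1f}s] {msg}")
        if time.time() - start > duration_s:
            break

# Example with a fake feature stream standing in for live Kinect/audio analysis.
fake_stream = iter([{"words_per_min": 190, "gaze_at_audience": 0.4,
                     "filled_pauses_per_min": 7}] * 3)
run_session(fake_stream, duration_s=1)
```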
{"title":"A Multimodal System for Public Speaking with Real Time Feedback","authors":"F. Dermody, Alistair Sutherland","doi":"10.1145/2818346.2823295","DOIUrl":"https://doi.org/10.1145/2818346.2823295","url":null,"abstract":"We have developed a multimodal prototype for public speaking with real time feedback using the Microsoft Kinect. Effective speaking involves use of gesture, facial expression, posture, voice as well as the spoken word. These modalities combine to give the appearance of self-confidence in the speaker. This initial prototype detects body pose, facial expressions and voice. Visual and text feedback is displayed in real time to the user using a video panel, icon panel and text feedback panel. The user can also set and view elapsed time during their speaking performance. Real time feedback is displayed on gaze direction, body pose and gesture, vocal tonality, vocal dysfluencies and speaking rate.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"38 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85853786","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 12
MPHA: A Personal Hearing Doctor Based on Mobile Devices
Yu-Hao Wu, Jia Jia, Wai-Kim Leung, Yejun Liu, Lianhong Cai
As more and more people want to know the condition of their hearing, audiometry is becoming increasingly important. However, traditional audiometric methods require dedicated audiometers, which are very expensive, and the procedure is time-consuming. In this paper, we present the mobile personal hearing assessment (MPHA), a novel interactive mode for testing hearing level based on mobile devices. MPHA (1) provides a general method to calibrate sound intensity on mobile devices, guaranteeing the reliability and validity of the audiometry system, and (2) introduces an audiometric correction algorithm for realistic noisy audiometric environments. The experimental results show that MPHA is reliable and valid compared with conventional audiometric assessment.
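MPHA's calibration and correction algorithms are not detailed in the abstract; purely as background, the sketch below implements a simplified down-10/up-5 (Hughson-Westlake style) threshold search of the kind a mobile audiometry app might build on. The stopping rule and the `heard` callback are simplifying assumptions, and tone generation and device calibration are outside the sketch.

```python
def find_threshold(heard, freq_hz, start_db=40, floor_db=-10, ceil_db=100):
    """Rough down-10 / up-5 staircase for one frequency. `heard(freq_hz, level_db)`
    must return True if the listener responds; how the tone is produced and
    calibrated on the device is not covered here."""
    level = start_db
    responses_at = {}
    for _ in range(30):                       # safety cap on presentations
        if heard(freq_hz, level):
            responses_at[level] = responses_at.get(level, 0) + 1
            if responses_at[level] >= 2:      # accepted: heard twice at this level
                return level
            level = max(floor_db, level - 10)
        else:
            level = min(ceil_db, level + 5)
    return None

# Example with a simulated listener whose true threshold is 25 dB HL at 1 kHz.
simulated = lambda f, lvl: lvl >= 25
print("estimated threshold:", find_threshold(simulated, 1000), "dB HL")
```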
{"title":"MPHA: A Personal Hearing Doctor Based on Mobile Devices","authors":"Yu-Hao Wu, Jia Jia, Wai-Kim Leung, Yejun Liu, Lianhong Cai","doi":"10.1145/2818346.2820753","DOIUrl":"https://doi.org/10.1145/2818346.2820753","url":null,"abstract":"As more and more people inquire to know their hearing level condition, audiometry is becoming increasingly important. However, traditional audiometric method requires the involvement of audiometers, which are very expensive and time consuming. In this paper, we present mobile personal hearing assessment (MPHA), a novel interactive mode for testing hearing level based on mobile devices. MPHA, 1) provides a general method to calibrate sound intensity for mobile devices to guarantee the reliability and validity of the audiometry system; 2) designs an audiometric correction algorithm for the real noisy audiometric environment. The experimental results show that MPHA is reliable and valid compared with conventional audiometric assessment.","PeriodicalId":20486,"journal":{"name":"Proceedings of the 2015 ACM on International Conference on Multimodal Interaction","volume":"31 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2015-11-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79184101","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2