
2015 International Conference on Affective Computing and Intelligent Interaction (ACII): Latest Publications

An experimental study of speech emotion recognition based on deep convolutional neural networks
W. Zheng, Jian Yu, Yuexian Zou
Speech emotion recognition (SER) is a challenging task because it is unclear which features can reflect the characteristics of human emotion in speech, and traditional feature extraction methods perform inconsistently across different emotion recognition tasks. Spectrograms, however, clearly carry information that reflects different emotions. This paper proposes a systematic approach to implementing an effective emotion recognition system based on deep convolutional neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed, and principal component analysis (PCA) is used to reduce dimensionality and suppress interference. The PCA-whitened spectrogram is then split into non-overlapping segments, and a DCNN is trained to learn emotion representations from these segments using labeled training speech data. Our preliminary experiments show that the proposed DCNN-based system (containing two convolution and two pooling layers) achieves about 40% classification accuracy and outperforms SVM-based classification using hand-crafted acoustic features.
{"title":"An experimental study of speech emotion recognition based on deep convolutional neural networks","authors":"W. Zheng, Jian Yu, Yuexian Zou","doi":"10.1109/ACII.2015.7344669","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344669","url":null,"abstract":"Speech emotion recognition (SER) is a challenging task since it is unclear what kind of features are able to reflect the characteristics of human emotion from speech. However, traditional feature extractions perform inconsistently for different emotion recognition tasks. Obviously, different spectrogram provides information reflecting difference emotion. This paper proposes a systematical approach to implement an effectively emotion recognition system based on deep convolution neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed and the principle component analysis (PCA) technique is used to reduce the dimensionality and suppress the interferences. Then the PCA whitened spectrogram is split into non-overlapping segments. The DCNN is constructed to learn the representation of the emotion from the segments with labeled training speech data. Our preliminary experiments show the proposed emotion recognition system based on DCNNs (containing 2 convolution and 2 pooling layers) achieves about 40% classification accuracy. Moreover, it also outperforms the SVM based classification using the hand-crafted acoustic features.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"19 1","pages":"827-831"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84317516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 145
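
As a rough illustration of the pipeline just described (log-spectrogram, PCA whitening, non-overlapping segments, a CNN with two convolution and two pooling layers), the sketch below uses librosa, scikit-learn, and PyTorch. The sample rate, segment length, layer sizes, and all shapes are illustrative assumptions, not the paper's settings.

```python
# Sketch of the described SER pipeline; hyperparameters are assumptions.
import numpy as np
import librosa
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

def whitened_segments(wav_path, n_fft=512, hop=256, seg_frames=64, n_comp=64):
    """Log-spectrogram -> PCA whitening -> non-overlapping segments."""
    y, _ = librosa.load(wav_path, sr=16000)
    log_spec = np.log1p(np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop)))
    # Whiten across time frames, reducing the frequency axis to n_comp dims.
    white = PCA(n_components=n_comp, whiten=True).fit_transform(log_spec.T).T
    n_seg = white.shape[1] // seg_frames
    return np.stack([white[:, i * seg_frames:(i + 1) * seg_frames]
                     for i in range(n_seg)])      # (n_seg, n_comp, seg_frames)

class SmallDCNN(nn.Module):
    """Two convolution + two pooling layers, then a linear emotion classifier."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2))
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # for 64x64 input

    def forward(self, x):                          # x: (batch, 1, 64, 64)
        return self.classifier(self.features(x).flatten(1))
```
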
The Belfast storytelling database: A spontaneous social interaction database with laughter focused annotation
G. McKeown, W. Curran, J. Wagner, F. Lingenfelser, E. André
To support the endeavor of creating intelligent interfaces between computers and humans, the use of training materials based on realistic human-human interactions has been recognized as a crucial task. One effect of the creation of these databases is an increased realization of the importance of often overlooked social signals and behaviours in organizing and orchestrating our interactions. Laughter is one of these key social signals; its importance in maintaining the smooth flow of human interaction has only recently become apparent in the embodied conversational agent domain. In turn, these realizations require training data that focus on these key social signals. This paper presents a database that is well annotated and theoretically constructed with respect to understanding laughter as it is used within human social interaction. Its construction, motivation, annotation and availability are presented in detail.
{"title":"The Belfast storytelling database: A spontaneous social interaction database with laughter focused annotation","authors":"G. McKeown, W. Curran, J. Wagner, F. Lingenfelser, E. André","doi":"10.1109/ACII.2015.7344567","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344567","url":null,"abstract":"To support the endeavor of creating intelligent interfaces between computers and humans the use of training materials based on realistic human-human interactions has been recognized as a crucial task. One of the effects of the creation of these databases is an increased realization of the importance of often overlooked social signals and behaviours in organizing and orchestrating our interactions. Laughter is one of these key social signals; its importance in maintaining the smooth flow of human interaction has only recently become apparent in the embodied conversational agent domain. In turn, these realizations require training data that focus on these key social signals. This paper presents a database that is well annotated and theoretically constructed with respect to understanding laughter as it is used within human social interaction. Its construction, motivation, annotation and availability are presented in detail in this paper.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"30 1","pages":"166-172"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78146251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Estimate the intimacy of the characters based on their emotional states for application to non-task dialogue
Kazuyuki Matsumoto, Kyosuke Akita, Minoru Yoshida, K. Kita, F. Ren
Recently, portable digital devices equipped with voice guidance have become widely used, increasing the demand for usability-conscious dialogue systems. One problem with existing dialogue systems is their immature handling of non-task dialogue. Non-task-oriented dialogue requires schemes that enable smooth and flexible conversations with a user; for example, it would be possible to go beyond the closed relationship between the system and the user by considering the user's relationships with others in real life. In this paper, we focus on the dialogue between two characters in a drama scenario and try to express their relationship on a scale of “intimacy degree.” Various elements relate to the intimacy degree, such as the frequency of responses to utterances and a speaker's attitude during the dialogue. We focus on the emotional state of the speaker during each utterance and try to estimate intimacy with higher accuracy. In our evaluation, we achieved higher accuracy in intimacy estimation than the existing method based on speech roles.
{"title":"Estimate the intimacy of the characters based on their emotional states for application to non-task dialogue","authors":"Kazuyuki Matsumoto, Kyosuke Akita, Minoru Yoshida, K. Kita, F. Ren","doi":"10.1109/ACII.2015.7344591","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344591","url":null,"abstract":"Recently, a portable digital device equipped with voice guidance has been widely used with increasing the demand for the usability-conscious dialogue system. One of the problems with the existing dialogue system is its immature application to non-task dialogue. Non-task-oriented dialogue requires some schemes that enable smooth and flexible conversations with a user. For example, it would be possible to go beyond the closed relationship between the system and the user by considering the user's relationship with others in real life. In this paper, we focused on the dialogue made by the two characters in a drama scenario, and tried to express their relationship with a scale of “intimacy degree.” There will be such various elements related to the intimacy degree as the frequency of response to the utterance and the attitude of a speaker during the dialogue. We focused on the emotional state of the speaker during the utterance and tried to realize intimacy estimation with higher accuracy. As the evaluation result, we achieved higher accuracy in intimacy estimation than the existing method based on speech role.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"66 1","pages":"327-333"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85614794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
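
A minimal sketch of how intimacy estimation from dialogue cues might look, assuming a hand-rolled feature vector (response frequency plus emotion statistics) and an SVM regressor; the feature set, data format, and model here are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical feature extraction and regressor for "intimacy degree".
import numpy as np
from sklearn.svm import SVR

def pair_features(utterances):
    """utterances: list of dicts like
    {"speaker": "A", "replied_to": "B", "emotion": [valence, arousal]}."""
    n = len(utterances)
    replies = sum(1 for u in utterances if u["replied_to"] is not None)
    emotions = np.array([u["emotion"] for u in utterances])
    return np.concatenate(([replies / max(n, 1)],   # response frequency
                           emotions.mean(axis=0),   # average emotional state
                           emotions.std(axis=0)))   # emotional variability

# Toy usage with one annotated dialogue (a real setup would use many).
toy_dialogue = [
    {"speaker": "A", "replied_to": None, "emotion": [0.2, 0.5]},
    {"speaker": "B", "replied_to": "A",  "emotion": [0.4, 0.6]},
    {"speaker": "A", "replied_to": "B",  "emotion": [0.5, 0.4]},
]
X = np.array([pair_features(toy_dialogue)])
y = np.array([0.7])                                 # hypothetical intimacy label
model = SVR(kernel="rbf").fit(X, y)
print(model.predict(X))
```
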
Exploring dataset similarities using PCA-based feature selection
Ingo Siegert, Ronald Böck, A. Wendemuth, Bogdan Vlasenko
In emotion recognition from speech, several well-established corpora are used to date for the development of classification engines. The data are annotated differently, and the community uses a variety of feature extraction schemes. The aim of this paper is to investigate promising features for individual corpora and then compare the results, proposing optimal features across data sets and introducing a new ranking method. Further, this enables us to present a method for the automatic identification of groups of corpora with similar characteristics. This answers an urgent question in classifier development, namely whether data from different corpora are similar enough to be used jointly as training material, overcoming shortages of material in matching domains. We compare the results of this method with manual groupings of corpora. We consider the established emotional speech corpora AVIC, ABC, DES, EMO-DB, ENTERFACE, SAL, SMARTKOM, SUSAS and VAM; however, our approach is general.
{"title":"Exploring dataset similarities using PCA-based feature selection","authors":"Ingo Siegert, Ronald Böck, A. Wendemuth, Bogdan Vlasenko","doi":"10.1109/ACII.2015.7344600","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344600","url":null,"abstract":"In emotion recognition from speech, several well-established corpora are used to date for the development of classification engines. The data is annotated differently, and the community in the field uses a variety of feature extraction schemes. The aim of this paper is to investigate promising features for individual corpora and then compare the results for proposing optimal features across data sets, introducing a new ranking method. Further, this enables us to present a method for automatic identification of groups of corpora with similar characteristics. This answers an urgent question in classifier development, namely whether data from different corpora is similar enough to jointly be used as training material, overcoming shortage of material in matching domains. We compare the results of this method with manual groupings of corpora. We consider the established emotional speech corpora AVIC, ABC, DES, EMO-DB, ENTERFACE, SAL, SMARTKOM, SUSAS and VAM, however our approach is general.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"62 1","pages":"387-393"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90738063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
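
One way such a PCA-based ranking and grouping could be sketched is shown below: features are scored per corpus by their variance-weighted loadings, rankings are compared with Spearman correlation, and corpora are grouped by hierarchical clustering. This is a generic reconstruction under stated assumptions, not the paper's exact ranking method.

```python
# Generic sketch: PCA-based feature ranking per corpus, then corpus grouping.
import numpy as np
from scipy.stats import spearmanr
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA

def pca_feature_ranking(X, n_components=5):
    """Score each feature by its variance-weighted loading magnitude."""
    pca = PCA(n_components=n_components).fit(X)
    scores = (np.abs(pca.components_) *
              pca.explained_variance_ratio_[:, None]).sum(axis=0)
    return np.argsort(-scores)            # feature indices, most important first

def group_corpora(corpora, n_groups=2):
    """corpora: dict name -> (n_samples, n_features) array, shared feature set."""
    names = list(corpora)
    ranks = [np.argsort(pca_feature_ranking(corpora[n])) for n in names]
    n = len(names)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            rho, _ = spearmanr(ranks[i], ranks[j])   # agreement of rankings
            dist[i, j] = dist[j, i] = 1 - rho
    condensed = dist[np.triu_indices(n, k=1)]        # condensed distance vector
    labels = fcluster(linkage(condensed, method="average"), n_groups, "maxclust")
    return dict(zip(names, labels))

# Toy usage with random "corpora" sharing one feature set.
rng = np.random.default_rng(0)
corpora = {name: rng.normal(size=(100, 20)) for name in ["ABC", "DES", "VAM", "SAL"]}
print(group_corpora(corpora))
```
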
The enduring basis of emotional episodes: Towards a capacious overview
R. Cowie
It matters for affective computing to have a framework that brings key points about human emotion to mind in an orderly way. A natural option builds on the ancient view that overt emotion arises from interactions between rational awareness and systems of a different type whose functions are ongoing, but not obvious. Key ideas from modern research can be incorporated by assuming that the latter do five broad kinds of work: evaluating states of affairs; preparing us to act accordingly; learning from significant conjunctions; interrupting conscious processes if need be; and aligning us with other people. Multiple structures act as interfaces between those systems and rational awareness. Emotional feelings inform conscious awareness of what they are doing, and emotion words split the space of their activity into discrete regions. The picture is not ideal, but it offers a substantial organising device.
{"title":"The enduring basis of emotional episodes: Towards a capacious overview","authors":"R. Cowie","doi":"10.1109/ACII.2015.7344557","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344557","url":null,"abstract":"It matters for affective computing to have a framework that brings key points about human emotion to mind in an orderly way. A natural option builds on the ancient view that overt emotion arises from interactions between rational awareness and systems of a different type whose functions are ongoing, but not obvious. Key ideas from modern research can be incorporated by assuming that the latter do five broad kinds of work: evaluating states of affairs; preparing us to act accordingly; learning from significant conjunctions; interrupting conscious processes if need be; and aligning us with other people. Multiple structures act as interfaces between those systems and rational awareness. Emotional feelings inform conscious awareness of what they are doing, and emotion words split the space of their activity into discrete regions. The picture is not ideal, but it offers a substantial organising device.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"30 1","pages":"98-104"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85733448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Personality test based on eye tracking techniques
Yun Zhang, Wei Xin, D. Miao
This paper presents original research on an eye-tracking-based personality test. To deal with the unavoidable human deception and inaccurate self-assessment during subjective psychological tests, eye-tracking techniques are used to reveal the participant's cognitive process during the test. A non-intrusive, real-time eye-tracking-based questionnaire system was developed for the Chinese military recruitment personality test, and a pilot study was carried out on 12 qualified samples. Preliminary experimental results indicate a strong correlation between participants' fixation features and test results. This relationship can be developed into an assistive indicator or a predictive parameter for traditional psychological test results, greatly improving their reliability and validity in future applications.
{"title":"Personality test based on eye tracking techniques","authors":"Yun Zhang, Wei Xin, D. Miao","doi":"10.1109/ACII.2015.7344670","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344670","url":null,"abstract":"This paper presents an original research on eye tracking based personality test. To deal with the unavoidable human deception and inaccurate self-assessment during subjective psychological test, eye tracking techniques are utilized to reveal the participant's cognitive procedure during test. A non-intrusive real-time eye tracking based questionnaire system is developed for Chinese military recruitment personality test. A pilot study is carried out on 12 qualified samples. The preliminary result of experiment indicates a strong correlation between the participant's fixation features and test results. And such kind of relationship can be developed as an assistive indicator or a predictive parameter to traditional psychological test result to highly improve its reliability and validity in future applications.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"91 1","pages":"832-837"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86028647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
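
As a toy illustration of relating fixation features to questionnaire responses (not the study's protocol or data), one could correlate per-item fixation statistics with the recorded answers:

```python
# Toy example: Pearson correlation between per-item fixation features
# from an eye tracker and the item's questionnaire response.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-item measurements for one participant.
fixation_duration_ms = np.array([420, 310, 980, 510, 770, 290])  # dwell time
fixation_count = np.array([3, 2, 7, 4, 6, 2])                    # fixations
response_score = np.array([4, 5, 1, 3, 2, 5])                    # 1-5 Likert

for name, feat in [("duration", fixation_duration_ms),
                   ("count", fixation_count)]:
    r, p = pearsonr(feat, response_score)
    print(f"fixation {name} vs response: r={r:.2f}, p={p:.3f}")
```
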
Posed and spontaneous facial expression differentiation using deep Boltzmann machines
Quan Gan, Chongliang Wu, Shangfei Wang, Q. Ji
Current work on differentiating between posed and spontaneous facial expressions usually uses features that were handcrafted for expression category recognition; to date, no features have been designed specifically for differentiating between posed and spontaneous expressions. Recently, deep learning models have proven efficient for many challenging computer vision tasks, and therefore this paper proposes using a deep Boltzmann machine to learn representations of facial images and to differentiate between posed and spontaneous facial expressions. First, faces are located in the images. Then, a two-layer deep Boltzmann machine is trained to distinguish posed and spontaneous expressions. Experimental results on two benchmark datasets, the SPOS and USTC-NVIE datasets, demonstrate that the deep Boltzmann machine performs well on posed versus spontaneous expression differentiation tasks. Comparison results on both datasets show that our method has an advantage over the other methods.
{"title":"Posed and spontaneous facial expression differentiation using deep Boltzmann machines","authors":"Quan Gan, Chongliang Wu, Shangfei Wang, Q. Ji","doi":"10.1109/ACII.2015.7344637","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344637","url":null,"abstract":"Current works on differentiating between posed and spontaneous facial expressions usually use features that are handcrafted for expression category recognition. Till now, no features have been specifically designed for differentiating between posed and spontaneous facial expressions. Recently, deep learning models have been proven to be efficient for many challenging computer vision tasks, and therefore in this paper we propose using the deep Boltzmann machine to learn representations of facial images and to differentiate between posed and spontaneous facial expressions. First, faces are located from images. Then, a two-layer deep Boltzmann machine is trained to distinguish posed and spon-tanous expressions. Experimental results on two benchmark datasets, i.e. the SPOS and USTC-NVIE datasets, demonstrate that the deep Boltzmann machine performs well on posed and spontaneous expression differentiation tasks. Comparison results on both datasets show that our method has an advantage over the other methods.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"116 1","pages":"643-648"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86065151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
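
scikit-learn offers no true deep Boltzmann machine, so the sketch below greedily stacks two Bernoulli RBMs with a logistic classifier on top as a rough stand-in for the two-layer model described; the data, shapes, and hyperparameters are toy assumptions, not the paper's.

```python
# Rough stand-in for a two-layer DBM: two greedily stacked RBMs + classifier.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 32 * 32))   # hypothetical face crops, scaled to [0, 1]
y = rng.integers(0, 2, 200)      # 0 = posed, 1 = spontaneous (toy labels)

model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=10)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
print("train accuracy:", model.score(X, y))
```
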
A temporally piece-wise Fisher vector approach for depression analysis
Abhinav Dhall, Roland Göcke
Depression and other mood disorders are common, disabling disorders with a profound impact on individuals and families. In spite of its high prevalence, depression is easily missed during its early stages. Automatic depression analysis has become a very active field of research in the affective computing community in the past few years. This paper presents a framework for depression analysis based on unimodal visual cues. Temporally piece-wise Fisher vectors (FV) are computed on temporal segments. As low-level features, block-wise Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) descriptors are computed. Statistical aggregation techniques are analysed and compared for creating a discriminative representation of a video sample. The paper explores the strength of FV in representing temporal segments in spontaneous clinical data, which creates a meaningful representation of the facial dynamics in a temporal segment. The experiments are conducted on the Audio/Visual Emotion Challenge (AVEC) 2014 German-speaking depression database. The superior results of the proposed framework show the effectiveness of the technique compared to the current state of the art.
{"title":"A temporally piece-wise fisher vector approach for depression analysis","authors":"Abhinav Dhall, Roland Göcke","doi":"10.1109/ACII.2015.7344580","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344580","url":null,"abstract":"Depression and other mood disorders are common, disabling disorders with a profound impact on individuals and families. Inspite of its high prevalence, it is easily missed during the early stages. Automatic depression analysis has become a very active field of research in the affective computing community in the past few years. This paper presents a framework for depression analysis based on unimodal visual cues. Temporally piece-wise Fisher Vectors (FV) are computed on temporal segments. As a low-level feature, block-wise Local Binary Pattern-Three Orthogonal Planes descriptors are computed. Statistical aggregation techniques are analysed and compared for creating a discriminative representative for a video sample. The paper explores the strength of FV in representing temporal segments in a spontaneous clinical data. This creates a meaningful representation of the facial dynamics in a temporal segment. The experiments are conducted on the Audio Video Emotion Challenge (AVEC) 2014 German speaking depression database. The superior results of the proposed framework show the effectiveness of the technique as compared to the current state-of-art.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"23 1","pages":"255-259"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83298186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 48
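
The following sketch shows one standard way to compute a temporally piece-wise Fisher vector with a diagonal-covariance GMM, assuming frame-level descriptors (such as LBP-TOP features) have already been extracted; the descriptor dimensionality, GMM size, and segmentation here are illustrative, not the paper's configuration.

```python
# Standard Fisher vector encoding of temporal segments (improved FV with
# power and L2 normalization), using a diagonal-covariance GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(frames, gmm):
    """frames: (n_frames, dim). Returns a 2*K*dim vector (mean + var grads)."""
    q = gmm.predict_proba(frames)                       # (n, K) posteriors
    n = frames.shape[0]
    mu, sigma = gmm.means_, np.sqrt(gmm.covariances_)   # (K, dim) each
    pi = gmm.weights_
    d = (frames[:, None, :] - mu) / sigma               # standardized residuals
    g_mu = (q[:, :, None] * d).sum(0) / (n * np.sqrt(pi)[:, None])
    g_sig = (q[:, :, None] * (d**2 - 1)).sum(0) / (n * np.sqrt(2 * pi)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sig.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)            # L2 normalization

# Toy usage: fit the GMM on pooled training frames, encode each temporal piece.
rng = np.random.default_rng(0)
train_frames = rng.normal(size=(1000, 16))              # hypothetical descriptors
gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(train_frames)
segments = np.array_split(rng.normal(size=(300, 16)), 5)   # 5 temporal pieces
video_repr = np.stack([fisher_vector(s, gmm) for s in segments])
print(video_repr.shape)                                 # (5, 2 * 8 * 16)
```
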
Turing's menagerie: Talking lions, virtual bats, electric sheep and analogical peacocks: Common ground and common interest are necessary components of engagement
G. McKeown
This theoretical paper attempts to define some of the key components and challenges involved in creating embodied conversational agents that can be genuinely interesting conversational partners. Wittgenstein's argument concerning talking lions emphasizes the importance of a shared common ground as a basis for conversational interactions. Virtual bats suggests that, for some people at least, it is important that there be a feeling of authenticity concerning a subjectively experiencing entity that can convey what it is like to be that entity. Electric sheep reminds us of the importance of empathy in human conversational interaction, and that we should provide a full communicative repertoire of both verbal and non-verbal components if we are to create genuinely engaging interactions; we may also make the task harder rather than easier if we leave out non-verbal aspects of communication. Finally, analogical peacocks highlights the importance of between-minds alignment and establishes the longer-term goal of being interesting, creative, and humorous if an embodied conversational agent is to be a truly engaging conversational partner. Some potential directions and solutions for addressing these issues are suggested.
{"title":"Turing's menagerie: Talking lions, virtual bats, electric sheep and analogical peacocks: Common ground and common interest are necessary components of engagement","authors":"G. McKeown","doi":"10.1109/ACII.2015.7344689","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344689","url":null,"abstract":"This theoretical paper attempts to define some of the key components and challenges required to create embodied conversational agents that can be genuinely interesting conversational partners. Wittgenstein's argument concerning talking lions emphasizes the importance of having a shared common ground as a basis for conversational interactions. Virtual bats suggests that-for some people at least-it is important that there be a feeling of authenticity concerning a subjectively experiencing entity that can convey what it is like to be that entity. Electric sheep reminds us of the importance of empathy in human conversational interaction and that we should provide a full communicative repertoire of both verbal and non-verbal components if we are to create genuinely engaging interactions. Also we may be making the task more difficult rather than easy if we leave out non-verbal aspects of communication. Finally, analogical peacocks highlights the importance of between minds alignment and establishes a longer term goal of being interesting, creative, and humorous if an embodied conversational agent is to be truly an engaging conversational partner. Some potential directions and solutions to addressing these issues are suggested.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"20 1","pages":"950-955"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88076923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Genre based emotion annotation for music in noisy environment
Yu-Hao Chin, Po-Chuan Lin, Tzu-Chiang Tai, Jia-Ching Wang
Music heard by humans is sometimes exposed to noise; for example, background noise usually exists when listening to music in broadcasts or live settings. Such noise worsens the performance of music emotion recognition systems. To solve this problem, this work constructs a robust system for music emotion classification in noisy environments. Furthermore, genre is considered when determining a song's emotional labels. The proposed system consists of three major parts: subspace-based noise suppression, genre index computation, and a support vector machine (SVM). First, the system uses noise suppression to remove noise from the signal. Acoustic features are then extracted from each music clip. Next, a dictionary is constructed from songs covering a wide range of genres and used to implement sparse coding. Via sparse coding, data are transformed into sparse coefficient vectors, from which genre indexes are computed for each music genre. The genre indexes serve as combination weights in the later phase. At the training stage, an SVM emotion model is trained for each genre. At the prediction stage, the predictions obtained by each genre's emotion model are combined across all genres, weighted by the genre indexes. Finally, the system annotates a song with multiple emotional labels based on the combined prediction. Experimental results show that the system performs well in both normal and noisy environments.
{"title":"Genre based emotion annotation for music in noisy environment","authors":"Yu-Hao Chin, Po-Chuan Lin, Tzu-Chiang Tai, Jia-Ching Wang","doi":"10.1109/ACII.2015.7344675","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344675","url":null,"abstract":"The music listened by human is sometimes exposed to noise. For example, background noise usually exists when listening to music in broadcasts or lives. The noise will worsen the performance in various music emotion recognition systems. To solve the problem, this work constructs a robust system for music emotion classification in a noisy environment. Furthermore, the genre is considered when determining the emotional label for the song. The proposed system consists of three major parts, i.e. subspace based noise suppression, genre index computation, and support vector machine (SVM). Firstly, the system uses noise suppression to remove the noise content in the signal. After that, acoustical features are extracted from each music clip. Next, a dictionary is constructed by using songs that cover a wide range of genres, and it is adopted to implement sparse coding. Via sparse coding, data can be transformed to sparse coefficient vectors, and this paper computes genre indexes for the music genres based on the sparse coefficient vector. The genre indexes are regarded as combination weights in the latter phase. At the training stage of the SVM, this paper train emotional models for each genre. At the prediction stage, the predictions that obtained by emotional models in each genre are weighted combined across all genres using the genre indexes. Finally, the proposed system annotates multiple emotional labels for a song based on the combined prediction. The experimental result shows that the system can achieve a good performance in both normal and noisy environments.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"29 1","pages":"863-866"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83086079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
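
A hedged sketch of the combination step described above: a clip's features are sparse-coded against a genre-labeled dictionary, per-genre coefficient energy yields the genre indexes, and those indexes weight per-genre SVM emotion probabilities. Dictionary construction, feature dimensions, and all data here are toy assumptions.

```python
# Sketch: sparse-coding genre indexes used to weight per-genre emotion SVMs.
import numpy as np
from sklearn.decomposition import SparseCoder
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_genres, atoms_per_genre, dim = 3, 10, 20
dictionary = rng.normal(size=(n_genres * atoms_per_genre, dim))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)  # unit atoms
atom_genre = np.repeat(np.arange(n_genres), atoms_per_genre)     # atom labels

def genre_indexes(x):
    """Fraction of sparse-coefficient energy carried by each genre's atoms."""
    coder = SparseCoder(dictionary=dictionary, transform_algorithm="omp",
                        transform_n_nonzero_coefs=5)
    code = coder.transform(x[None, :])[0]
    energy = np.array([np.abs(code[atom_genre == g]).sum()
                       for g in range(n_genres)])
    return energy / (energy.sum() + 1e-12)

# Per-genre emotion SVMs trained on that genre's clips (toy data here).
svms = []
for g in range(n_genres):
    Xg = rng.normal(size=(40, dim))
    yg = np.tile(np.arange(4), 10)            # 4 toy emotion classes
    svms.append(SVC(probability=True).fit(Xg, yg))

x = rng.normal(size=dim)                      # a test clip's features
w = genre_indexes(x)                          # genre indexes as weights
combined = sum(w[g] * svms[g].predict_proba(x[None, :])[0]
               for g in range(n_genres))
print("combined emotion scores:", combined)
```
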