
2015 International Conference on Affective Computing and Intelligent Interaction (ACII): Latest Publications

An experimental study of speech emotion recognition based on deep convolutional neural networks
W. Zheng, Jian Yu, Yuexian Zou
Speech emotion recognition (SER) is a challenging task because it is unclear which features can reflect the characteristics of human emotion in speech, and traditional feature extraction methods perform inconsistently across different emotion recognition tasks. Spectrograms, however, clearly carry information that reflects different emotions. This paper proposes a systematic approach to implementing an effective emotion recognition system based on deep convolutional neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed, and principal component analysis (PCA) is used to reduce dimensionality and suppress interference. The PCA-whitened spectrogram is then split into non-overlapping segments, and a DCNN is trained to learn emotion representations from these segments using labeled training speech data. Our preliminary experiments show that the proposed DCNN-based system (containing two convolution and two pooling layers) achieves about 40% classification accuracy and outperforms SVM-based classification using hand-crafted acoustic features.
{"title":"An experimental study of speech emotion recognition based on deep convolutional neural networks","authors":"W. Zheng, Jian Yu, Yuexian Zou","doi":"10.1109/ACII.2015.7344669","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344669","url":null,"abstract":"Speech emotion recognition (SER) is a challenging task since it is unclear what kind of features are able to reflect the characteristics of human emotion from speech. However, traditional feature extractions perform inconsistently for different emotion recognition tasks. Obviously, different spectrogram provides information reflecting difference emotion. This paper proposes a systematical approach to implement an effectively emotion recognition system based on deep convolution neural networks (DCNNs) using labeled training audio data. Specifically, the log-spectrogram is computed and the principle component analysis (PCA) technique is used to reduce the dimensionality and suppress the interferences. Then the PCA whitened spectrogram is split into non-overlapping segments. The DCNN is constructed to learn the representation of the emotion from the segments with labeled training speech data. Our preliminary experiments show the proposed emotion recognition system based on DCNNs (containing 2 convolution and 2 pooling layers) achieves about 40% classification accuracy. Moreover, it also outperforms the SVM based classification using the hand-crafted acoustic features.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"19 1","pages":"827-831"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84317516","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 145
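
As a rough illustration of the pipeline just described (log-spectrogram, PCA whitening, non-overlapping segments, a CNN with two convolution and two pooling layers), the sketch below uses librosa, scikit-learn, and PyTorch. The sample rate, segment length, layer sizes, and all shapes are illustrative assumptions, not the paper's settings.

```python
# Sketch of the described SER pipeline; hyperparameters are assumptions.
import numpy as np
import librosa
import torch
import torch.nn as nn
from sklearn.decomposition import PCA

def whitened_segments(wav_path, n_fft=512, hop=256, seg_frames=64, n_comp=64):
    """Log-spectrogram -> PCA whitening -> non-overlapping segments."""
    y, _ = librosa.load(wav_path, sr=16000)
    log_spec = np.log1p(np.abs(librosa.stft(y, n_fft=n_fft, hop_length=hop)))
    # Whiten across time frames, reducing the frequency axis to n_comp dims.
    white = PCA(n_components=n_comp, whiten=True).fit_transform(log_spec.T).T
    n_seg = white.shape[1] // seg_frames
    return np.stack([white[:, i * seg_frames:(i + 1) * seg_frames]
                     for i in range(n_seg)])      # (n_seg, n_comp, seg_frames)

class SmallDCNN(nn.Module):
    """Two convolution + two pooling layers, then a linear emotion classifier."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2))
        self.classifier = nn.Linear(32 * 16 * 16, n_classes)  # for 64x64 input

    def forward(self, x):                          # x: (batch, 1, 64, 64)
        return self.classifier(self.features(x).flatten(1))
```
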
The Belfast storytelling database: A spontaneous social interaction database with laughter focused annotation
G. McKeown, W. Curran, J. Wagner, F. Lingenfelser, E. André
To support the endeavor of creating intelligent interfaces between computers and humans, the use of training materials based on realistic human-human interactions has been recognized as a crucial task. One effect of the creation of these databases is an increased realization of the importance of often overlooked social signals and behaviours in organizing and orchestrating our interactions. Laughter is one of these key social signals; its importance in maintaining the smooth flow of human interaction has only recently become apparent in the embodied conversational agent domain. In turn, these realizations require training data that focus on these key social signals. This paper presents a database that is well annotated and theoretically constructed with respect to understanding laughter as it is used within human social interaction. Its construction, motivation, annotation and availability are presented in detail.
{"title":"The Belfast storytelling database: A spontaneous social interaction database with laughter focused annotation","authors":"G. McKeown, W. Curran, J. Wagner, F. Lingenfelser, E. André","doi":"10.1109/ACII.2015.7344567","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344567","url":null,"abstract":"To support the endeavor of creating intelligent interfaces between computers and humans the use of training materials based on realistic human-human interactions has been recognized as a crucial task. One of the effects of the creation of these databases is an increased realization of the importance of often overlooked social signals and behaviours in organizing and orchestrating our interactions. Laughter is one of these key social signals; its importance in maintaining the smooth flow of human interaction has only recently become apparent in the embodied conversational agent domain. In turn, these realizations require training data that focus on these key social signals. This paper presents a database that is well annotated and theoretically constructed with respect to understanding laughter as it is used within human social interaction. Its construction, motivation, annotation and availability are presented in detail in this paper.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"30 1","pages":"166-172"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78146251","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 19
Estimate the intimacy of the characters based on their emotional states for application to non-task dialogue
Kazuyuki Matsumoto, Kyosuke Akita, Minoru Yoshida, K. Kita, F. Ren
Recently, portable digital devices equipped with voice guidance have become widely used, increasing the demand for usability-conscious dialogue systems. One problem with existing dialogue systems is their immature handling of non-task dialogue. Non-task-oriented dialogue requires schemes that enable smooth and flexible conversations with a user; for example, it would be possible to go beyond the closed relationship between the system and the user by considering the user's relationships with others in real life. In this paper, we focus on the dialogue between two characters in a drama scenario and try to express their relationship on a scale of “intimacy degree.” Various elements relate to the intimacy degree, such as the frequency of responses to utterances and a speaker's attitude during the dialogue. We focus on the emotional state of the speaker during each utterance and try to estimate intimacy with higher accuracy. In our evaluation, we achieved higher accuracy in intimacy estimation than the existing method based on speech roles.
{"title":"Estimate the intimacy of the characters based on their emotional states for application to non-task dialogue","authors":"Kazuyuki Matsumoto, Kyosuke Akita, Minoru Yoshida, K. Kita, F. Ren","doi":"10.1109/ACII.2015.7344591","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344591","url":null,"abstract":"Recently, a portable digital device equipped with voice guidance has been widely used with increasing the demand for the usability-conscious dialogue system. One of the problems with the existing dialogue system is its immature application to non-task dialogue. Non-task-oriented dialogue requires some schemes that enable smooth and flexible conversations with a user. For example, it would be possible to go beyond the closed relationship between the system and the user by considering the user's relationship with others in real life. In this paper, we focused on the dialogue made by the two characters in a drama scenario, and tried to express their relationship with a scale of “intimacy degree.” There will be such various elements related to the intimacy degree as the frequency of response to the utterance and the attitude of a speaker during the dialogue. We focused on the emotional state of the speaker during the utterance and tried to realize intimacy estimation with higher accuracy. As the evaluation result, we achieved higher accuracy in intimacy estimation than the existing method based on speech role.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"66 1","pages":"327-333"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85614794","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
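
A minimal sketch of how intimacy estimation from dialogue cues might look, assuming a hand-rolled feature vector (response frequency plus emotion statistics) and an SVM regressor; the feature set, data format, and model here are assumptions for illustration, not the authors' implementation.

```python
# Hypothetical feature extraction and regressor for "intimacy degree".
import numpy as np
from sklearn.svm import SVR

def pair_features(utterances):
    """utterances: list of dicts like
    {"speaker": "A", "replied_to": "B", "emotion": [valence, arousal]}."""
    n = len(utterances)
    replies = sum(1 for u in utterances if u["replied_to"] is not None)
    emotions = np.array([u["emotion"] for u in utterances])
    return np.concatenate(([replies / max(n, 1)],   # response frequency
                           emotions.mean(axis=0),   # average emotional state
                           emotions.std(axis=0)))   # emotional variability

# Toy usage with one annotated dialogue (a real setup would use many).
toy_dialogue = [
    {"speaker": "A", "replied_to": None, "emotion": [0.2, 0.5]},
    {"speaker": "B", "replied_to": "A",  "emotion": [0.4, 0.6]},
    {"speaker": "A", "replied_to": "B",  "emotion": [0.5, 0.4]},
]
X = np.array([pair_features(toy_dialogue)])
y = np.array([0.7])                                 # hypothetical intimacy label
model = SVR(kernel="rbf").fit(X, y)
print(model.predict(X))
```
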
Exploring dataset similarities using PCA-based feature selection
Ingo Siegert, Ronald Böck, A. Wendemuth, Bogdan Vlasenko
In emotion recognition from speech, several well-established corpora are used to date for the development of classification engines. The data are annotated differently, and the community uses a variety of feature extraction schemes. The aim of this paper is to investigate promising features for individual corpora and then compare the results, proposing optimal features across data sets and introducing a new ranking method. Further, this enables us to present a method for the automatic identification of groups of corpora with similar characteristics. This answers an urgent question in classifier development, namely whether data from different corpora are similar enough to be used jointly as training material, overcoming shortages of material in matching domains. We compare the results of this method with manual groupings of corpora. We consider the established emotional speech corpora AVIC, ABC, DES, EMO-DB, ENTERFACE, SAL, SMARTKOM, SUSAS and VAM; however, our approach is general.
{"title":"Exploring dataset similarities using PCA-based feature selection","authors":"Ingo Siegert, Ronald Böck, A. Wendemuth, Bogdan Vlasenko","doi":"10.1109/ACII.2015.7344600","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344600","url":null,"abstract":"In emotion recognition from speech, several well-established corpora are used to date for the development of classification engines. The data is annotated differently, and the community in the field uses a variety of feature extraction schemes. The aim of this paper is to investigate promising features for individual corpora and then compare the results for proposing optimal features across data sets, introducing a new ranking method. Further, this enables us to present a method for automatic identification of groups of corpora with similar characteristics. This answers an urgent question in classifier development, namely whether data from different corpora is similar enough to jointly be used as training material, overcoming shortage of material in matching domains. We compare the results of this method with manual groupings of corpora. We consider the established emotional speech corpora AVIC, ABC, DES, EMO-DB, ENTERFACE, SAL, SMARTKOM, SUSAS and VAM, however our approach is general.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"62 1","pages":"387-393"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90738063","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 6
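
One way such a PCA-based ranking and grouping could be sketched is shown below: features are scored per corpus by their variance-weighted loadings, rankings are compared with Spearman correlation, and corpora are grouped by hierarchical clustering. This is a generic reconstruction under stated assumptions, not the paper's exact ranking method.

```python
# Generic sketch: PCA-based feature ranking per corpus, then corpus grouping.
import numpy as np
from scipy.stats import spearmanr
from scipy.cluster.hierarchy import linkage, fcluster
from sklearn.decomposition import PCA

def pca_feature_ranking(X, n_components=5):
    """Score each feature by its variance-weighted loading magnitude."""
    pca = PCA(n_components=n_components).fit(X)
    scores = (np.abs(pca.components_) *
              pca.explained_variance_ratio_[:, None]).sum(axis=0)
    return np.argsort(-scores)            # feature indices, most important first

def group_corpora(corpora, n_groups=2):
    """corpora: dict name -> (n_samples, n_features) array, shared feature set."""
    names = list(corpora)
    ranks = [np.argsort(pca_feature_ranking(corpora[n])) for n in names]
    n = len(names)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            rho, _ = spearmanr(ranks[i], ranks[j])   # agreement of rankings
            dist[i, j] = dist[j, i] = 1 - rho
    condensed = dist[np.triu_indices(n, k=1)]        # condensed distance vector
    labels = fcluster(linkage(condensed, method="average"), n_groups, "maxclust")
    return dict(zip(names, labels))

# Toy usage with random "corpora" sharing one feature set.
rng = np.random.default_rng(0)
corpora = {name: rng.normal(size=(100, 20)) for name in ["ABC", "DES", "VAM", "SAL"]}
print(group_corpora(corpora))
```
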
The enduring basis of emotional episodes: Towards a capacious overview
R. Cowie
It matters for affective computing to have a framework that brings key points about human emotion to mind in an orderly way. A natural option builds on the ancient view that overt emotion arises from interactions between rational awareness and systems of a different type whose functions are ongoing, but not obvious. Key ideas from modern research can be incorporated by assuming that the latter do five broad kinds of work: evaluating states of affairs; preparing us to act accordingly; learning from significant conjunctions; interrupting conscious processes if need be; and aligning us with other people. Multiple structures act as interfaces between those systems and rational awareness. Emotional feelings inform conscious awareness of what they are doing, and emotion words split the space of their activity into discrete regions. The picture is not ideal, but it offers a substantial organising device.
{"title":"The enduring basis of emotional episodes: Towards a capacious overview","authors":"R. Cowie","doi":"10.1109/ACII.2015.7344557","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344557","url":null,"abstract":"It matters for affective computing to have a framework that brings key points about human emotion to mind in an orderly way. A natural option builds on the ancient view that overt emotion arises from interactions between rational awareness and systems of a different type whose functions are ongoing, but not obvious. Key ideas from modern research can be incorporated by assuming that the latter do five broad kinds of work: evaluating states of affairs; preparing us to act accordingly; learning from significant conjunctions; interrupting conscious processes if need be; and aligning us with other people. Multiple structures act as interfaces between those systems and rational awareness. Emotional feelings inform conscious awareness of what they are doing, and emotion words split the space of their activity into discrete regions. The picture is not ideal, but it offers a substantial organising device.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"30 1","pages":"98-104"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"85733448","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Personality test based on eye tracking techniques
Yun Zhang, Wei Xin, D. Miao
This paper presents original research on an eye-tracking-based personality test. To deal with the unavoidable human deception and inaccurate self-assessment during subjective psychological tests, eye-tracking techniques are used to reveal the participant's cognitive process during the test. A non-intrusive, real-time eye-tracking-based questionnaire system was developed for the Chinese military recruitment personality test, and a pilot study was carried out on 12 qualified samples. Preliminary experimental results indicate a strong correlation between participants' fixation features and test results. This relationship can be developed into an assistive indicator or a predictive parameter for traditional psychological test results, greatly improving their reliability and validity in future applications.
{"title":"Personality test based on eye tracking techniques","authors":"Yun Zhang, Wei Xin, D. Miao","doi":"10.1109/ACII.2015.7344670","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344670","url":null,"abstract":"This paper presents an original research on eye tracking based personality test. To deal with the unavoidable human deception and inaccurate self-assessment during subjective psychological test, eye tracking techniques are utilized to reveal the participant's cognitive procedure during test. A non-intrusive real-time eye tracking based questionnaire system is developed for Chinese military recruitment personality test. A pilot study is carried out on 12 qualified samples. The preliminary result of experiment indicates a strong correlation between the participant's fixation features and test results. And such kind of relationship can be developed as an assistive indicator or a predictive parameter to traditional psychological test result to highly improve its reliability and validity in future applications.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"91 1","pages":"832-837"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86028647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
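
As a toy illustration of relating fixation features to questionnaire responses (not the study's protocol or data), one could correlate per-item fixation statistics with the recorded answers:

```python
# Toy example: Pearson correlation between per-item fixation features
# from an eye tracker and the item's questionnaire response.
import numpy as np
from scipy.stats import pearsonr

# Hypothetical per-item measurements for one participant.
fixation_duration_ms = np.array([420, 310, 980, 510, 770, 290])  # dwell time
fixation_count = np.array([3, 2, 7, 4, 6, 2])                    # fixations
response_score = np.array([4, 5, 1, 3, 2, 5])                    # 1-5 Likert

for name, feat in [("duration", fixation_duration_ms),
                   ("count", fixation_count)]:
    r, p = pearsonr(feat, response_score)
    print(f"fixation {name} vs response: r={r:.2f}, p={p:.3f}")
```
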
Posed and spontaneous facial expression differentiation using deep Boltzmann machines
Quan Gan, Chongliang Wu, Shangfei Wang, Q. Ji
Current work on differentiating between posed and spontaneous facial expressions usually uses features that were handcrafted for expression category recognition; to date, no features have been designed specifically for differentiating between posed and spontaneous expressions. Recently, deep learning models have proven efficient for many challenging computer vision tasks, and therefore this paper proposes using a deep Boltzmann machine to learn representations of facial images and to differentiate between posed and spontaneous facial expressions. First, faces are located in the images. Then, a two-layer deep Boltzmann machine is trained to distinguish posed and spontaneous expressions. Experimental results on two benchmark datasets, the SPOS and USTC-NVIE datasets, demonstrate that the deep Boltzmann machine performs well on posed versus spontaneous expression differentiation tasks. Comparison results on both datasets show that our method has an advantage over the other methods.
{"title":"Posed and spontaneous facial expression differentiation using deep Boltzmann machines","authors":"Quan Gan, Chongliang Wu, Shangfei Wang, Q. Ji","doi":"10.1109/ACII.2015.7344637","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344637","url":null,"abstract":"Current works on differentiating between posed and spontaneous facial expressions usually use features that are handcrafted for expression category recognition. Till now, no features have been specifically designed for differentiating between posed and spontaneous facial expressions. Recently, deep learning models have been proven to be efficient for many challenging computer vision tasks, and therefore in this paper we propose using the deep Boltzmann machine to learn representations of facial images and to differentiate between posed and spontaneous facial expressions. First, faces are located from images. Then, a two-layer deep Boltzmann machine is trained to distinguish posed and spon-tanous expressions. Experimental results on two benchmark datasets, i.e. the SPOS and USTC-NVIE datasets, demonstrate that the deep Boltzmann machine performs well on posed and spontaneous expression differentiation tasks. Comparison results on both datasets show that our method has an advantage over the other methods.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"116 1","pages":"643-648"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"86065151","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 18
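
scikit-learn offers no true deep Boltzmann machine, so the sketch below greedily stacks two Bernoulli RBMs with a logistic classifier on top as a rough stand-in for the two-layer model described; the data, shapes, and hyperparameters are toy assumptions, not the paper's.

```python
# Rough stand-in for a two-layer DBM: two greedily stacked RBMs + classifier.
import numpy as np
from sklearn.neural_network import BernoulliRBM
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

rng = np.random.default_rng(0)
X = rng.random((200, 32 * 32))   # hypothetical face crops, scaled to [0, 1]
y = rng.integers(0, 2, 200)      # 0 = posed, 1 = spontaneous (toy labels)

model = Pipeline([
    ("rbm1", BernoulliRBM(n_components=256, learning_rate=0.05, n_iter=10)),
    ("rbm2", BernoulliRBM(n_components=64, learning_rate=0.05, n_iter=10)),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(X, y)
print("train accuracy:", model.score(X, y))
```
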
A temporally piece-wise Fisher vector approach for depression analysis
Abhinav Dhall, Roland Göcke
Depression and other mood disorders are common, disabling disorders with a profound impact on individuals and families. In spite of its high prevalence, depression is easily missed during its early stages. Automatic depression analysis has become a very active field of research in the affective computing community in the past few years. This paper presents a framework for depression analysis based on unimodal visual cues. Temporally piece-wise Fisher vectors (FV) are computed on temporal segments. As low-level features, block-wise Local Binary Pattern-Three Orthogonal Planes (LBP-TOP) descriptors are computed. Statistical aggregation techniques are analysed and compared for creating a discriminative representation of a video sample. The paper explores the strength of FV in representing temporal segments in spontaneous clinical data, which creates a meaningful representation of the facial dynamics in a temporal segment. The experiments are conducted on the Audio/Visual Emotion Challenge (AVEC) 2014 German-speaking depression database. The superior results of the proposed framework show the effectiveness of the technique compared to the current state of the art.
{"title":"A temporally piece-wise fisher vector approach for depression analysis","authors":"Abhinav Dhall, Roland Göcke","doi":"10.1109/ACII.2015.7344580","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344580","url":null,"abstract":"Depression and other mood disorders are common, disabling disorders with a profound impact on individuals and families. Inspite of its high prevalence, it is easily missed during the early stages. Automatic depression analysis has become a very active field of research in the affective computing community in the past few years. This paper presents a framework for depression analysis based on unimodal visual cues. Temporally piece-wise Fisher Vectors (FV) are computed on temporal segments. As a low-level feature, block-wise Local Binary Pattern-Three Orthogonal Planes descriptors are computed. Statistical aggregation techniques are analysed and compared for creating a discriminative representative for a video sample. The paper explores the strength of FV in representing temporal segments in a spontaneous clinical data. This creates a meaningful representation of the facial dynamics in a temporal segment. The experiments are conducted on the Audio Video Emotion Challenge (AVEC) 2014 German speaking depression database. The superior results of the proposed framework show the effectiveness of the technique as compared to the current state-of-art.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"23 1","pages":"255-259"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83298186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 48
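
The following sketch shows one standard way to compute a temporally piece-wise Fisher vector with a diagonal-covariance GMM, assuming frame-level descriptors (such as LBP-TOP features) have already been extracted; the descriptor dimensionality, GMM size, and segmentation here are illustrative, not the paper's configuration.

```python
# Standard Fisher vector encoding of temporal segments (improved FV with
# power and L2 normalization), using a diagonal-covariance GMM.
import numpy as np
from sklearn.mixture import GaussianMixture

def fisher_vector(frames, gmm):
    """frames: (n_frames, dim). Returns a 2*K*dim vector (mean + var grads)."""
    q = gmm.predict_proba(frames)                       # (n, K) posteriors
    n = frames.shape[0]
    mu, sigma = gmm.means_, np.sqrt(gmm.covariances_)   # (K, dim) each
    pi = gmm.weights_
    d = (frames[:, None, :] - mu) / sigma               # standardized residuals
    g_mu = (q[:, :, None] * d).sum(0) / (n * np.sqrt(pi)[:, None])
    g_sig = (q[:, :, None] * (d**2 - 1)).sum(0) / (n * np.sqrt(2 * pi)[:, None])
    fv = np.concatenate([g_mu.ravel(), g_sig.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))              # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)            # L2 normalization

# Toy usage: fit the GMM on pooled training frames, encode each temporal piece.
rng = np.random.default_rng(0)
train_frames = rng.normal(size=(1000, 16))              # hypothetical descriptors
gmm = GaussianMixture(n_components=8, covariance_type="diag").fit(train_frames)
segments = np.array_split(rng.normal(size=(300, 16)), 5)   # 5 temporal pieces
video_repr = np.stack([fisher_vector(s, gmm) for s in segments])
print(video_repr.shape)                                 # (5, 2 * 8 * 16)
```
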
Turing's menagerie: Talking lions, virtual bats, electric sheep and analogical peacocks: Common ground and common interest are necessary components of engagement
G. McKeown
This theoretical paper attempts to define some of the key components and challenges involved in creating embodied conversational agents that can be genuinely interesting conversational partners. Wittgenstein's argument concerning talking lions emphasizes the importance of a shared common ground as a basis for conversational interactions. Virtual bats suggests that, for some people at least, it is important that there be a feeling of authenticity concerning a subjectively experiencing entity that can convey what it is like to be that entity. Electric sheep reminds us of the importance of empathy in human conversational interaction, and that we should provide a full communicative repertoire of both verbal and non-verbal components if we are to create genuinely engaging interactions; we may also make the task harder rather than easier if we leave out non-verbal aspects of communication. Finally, analogical peacocks highlights the importance of between-minds alignment and establishes the longer-term goal of being interesting, creative, and humorous if an embodied conversational agent is to be a truly engaging conversational partner. Some potential directions and solutions for addressing these issues are suggested.
{"title":"Turing's menagerie: Talking lions, virtual bats, electric sheep and analogical peacocks: Common ground and common interest are necessary components of engagement","authors":"G. McKeown","doi":"10.1109/ACII.2015.7344689","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344689","url":null,"abstract":"This theoretical paper attempts to define some of the key components and challenges required to create embodied conversational agents that can be genuinely interesting conversational partners. Wittgenstein's argument concerning talking lions emphasizes the importance of having a shared common ground as a basis for conversational interactions. Virtual bats suggests that-for some people at least-it is important that there be a feeling of authenticity concerning a subjectively experiencing entity that can convey what it is like to be that entity. Electric sheep reminds us of the importance of empathy in human conversational interaction and that we should provide a full communicative repertoire of both verbal and non-verbal components if we are to create genuinely engaging interactions. Also we may be making the task more difficult rather than easy if we leave out non-verbal aspects of communication. Finally, analogical peacocks highlights the importance of between minds alignment and establishes a longer term goal of being interesting, creative, and humorous if an embodied conversational agent is to be truly an engaging conversational partner. Some potential directions and solutions to addressing these issues are suggested.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"20 1","pages":"950-955"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"88076923","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 5
Genre based emotion annotation for music in noisy environment
Yu-Hao Chin, Po-Chuan Lin, Tzu-Chiang Tai, Jia-Ching Wang
Music heard by humans is sometimes exposed to noise; for example, background noise usually exists when listening to music in broadcasts or live settings. Such noise worsens the performance of music emotion recognition systems. To solve this problem, this work constructs a robust system for music emotion classification in noisy environments. Furthermore, genre is considered when determining a song's emotional labels. The proposed system consists of three major parts: subspace-based noise suppression, genre index computation, and a support vector machine (SVM). First, the system uses noise suppression to remove noise from the signal. Acoustic features are then extracted from each music clip. Next, a dictionary is constructed from songs covering a wide range of genres and used to implement sparse coding. Via sparse coding, data are transformed into sparse coefficient vectors, from which genre indexes are computed for each music genre. The genre indexes serve as combination weights in the later phase. At the training stage, an SVM emotion model is trained for each genre. At the prediction stage, the predictions obtained by each genre's emotion model are combined across all genres, weighted by the genre indexes. Finally, the system annotates a song with multiple emotional labels based on the combined prediction. Experimental results show that the system performs well in both normal and noisy environments.
{"title":"Genre based emotion annotation for music in noisy environment","authors":"Yu-Hao Chin, Po-Chuan Lin, Tzu-Chiang Tai, Jia-Ching Wang","doi":"10.1109/ACII.2015.7344675","DOIUrl":"https://doi.org/10.1109/ACII.2015.7344675","url":null,"abstract":"The music listened by human is sometimes exposed to noise. For example, background noise usually exists when listening to music in broadcasts or lives. The noise will worsen the performance in various music emotion recognition systems. To solve the problem, this work constructs a robust system for music emotion classification in a noisy environment. Furthermore, the genre is considered when determining the emotional label for the song. The proposed system consists of three major parts, i.e. subspace based noise suppression, genre index computation, and support vector machine (SVM). Firstly, the system uses noise suppression to remove the noise content in the signal. After that, acoustical features are extracted from each music clip. Next, a dictionary is constructed by using songs that cover a wide range of genres, and it is adopted to implement sparse coding. Via sparse coding, data can be transformed to sparse coefficient vectors, and this paper computes genre indexes for the music genres based on the sparse coefficient vector. The genre indexes are regarded as combination weights in the latter phase. At the training stage of the SVM, this paper train emotional models for each genre. At the prediction stage, the predictions that obtained by emotional models in each genre are weighted combined across all genres using the genre indexes. Finally, the proposed system annotates multiple emotional labels for a song based on the combined prediction. The experimental result shows that the system can achieve a good performance in both normal and noisy environments.","PeriodicalId":6863,"journal":{"name":"2015 International Conference on Affective Computing and Intelligent Interaction (ACII)","volume":"29 1","pages":"863-866"},"PeriodicalIF":0.0,"publicationDate":"2015-09-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83086079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 4
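
A hedged sketch of the combination step described above: a clip's features are sparse-coded against a genre-labeled dictionary, per-genre coefficient energy yields the genre indexes, and those indexes weight per-genre SVM emotion probabilities. Dictionary construction, feature dimensions, and all data here are toy assumptions.

```python
# Sketch: sparse-coding genre indexes used to weight per-genre emotion SVMs.
import numpy as np
from sklearn.decomposition import SparseCoder
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_genres, atoms_per_genre, dim = 3, 10, 20
dictionary = rng.normal(size=(n_genres * atoms_per_genre, dim))
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)  # unit atoms
atom_genre = np.repeat(np.arange(n_genres), atoms_per_genre)     # atom labels

def genre_indexes(x):
    """Fraction of sparse-coefficient energy carried by each genre's atoms."""
    coder = SparseCoder(dictionary=dictionary, transform_algorithm="omp",
                        transform_n_nonzero_coefs=5)
    code = coder.transform(x[None, :])[0]
    energy = np.array([np.abs(code[atom_genre == g]).sum()
                       for g in range(n_genres)])
    return energy / (energy.sum() + 1e-12)

# Per-genre emotion SVMs trained on that genre's clips (toy data here).
svms = []
for g in range(n_genres):
    Xg = rng.normal(size=(40, dim))
    yg = np.tile(np.arange(4), 10)            # 4 toy emotion classes
    svms.append(SVC(probability=True).fit(Xg, yg))

x = rng.normal(size=dim)                      # a test clip's features
w = genre_indexes(x)                          # genre indexes as weights
combined = sum(w[g] * svms[g].predict_proba(x[None, :])[0]
               for g in range(n_genres))
print("combined emotion scores:", combined)
```
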