Audio Visual Emotion Recognition Using Cross Correlation and Wavelet Packet Domain Features

Shamman Noor, Ehsan Ahmed Dhrubo, A. T. Minhaz, C. Shahnaz, S. Fattah
2017 IEEE International WIE Conference on Electrical and Computer Engineering (WIECON-ECE), published 2017-12-01
DOI: 10.1109/WIECON-ECE.2017.8468871 (https://doi.org/10.1109/WIECON-ECE.2017.8468871)
Citations: 2

Abstract

The better a machine understands non-verbal forms of communication, such as emotion, the better the level of human-machine interaction that can be achieved. This paper describes a method for recognizing emotions from human speech and visual data. For feature extraction, videos covering six classes of emotion (Happy, Sad, Fear, Disgust, Angry, and Surprise) from 44 different subjects in the eNTERFACE05 database are used. As video features, Horizontal and Vertical Cross Correlation (HCCR and VCCR) signals, extracted from the eye and mouth regions, are used. As speech features, Perceptual Linear Predictive Coefficients (PLPC) and Mel-frequency Cepstral Coefficients (MFCC) extracted from Wavelet Packet Coefficients are used in conjunction with PLPC and MFCC extracted from the original signal. For each type of feature, the K-Nearest Neighbour (KNN) multiclass classification method is applied separately to identify emotions expressed in speech and through facial movement. The emotion expressed in a video file is then identified by concatenating the speech and video features and applying the KNN classification method.
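The final fusion step described in the abstract (concatenate the speech and video features, then classify with KNN) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not a detail from the paper: the helper names `fuse` and `knn_predict`, the toy feature values and dimensions, the choice of k = 3, and the Euclidean distance metric. The abstract does not state the value of k, the distance metric, or any feature normalization.

```python
import math
from collections import Counter


def fuse(speech_vec, video_vec):
    """Feature-level fusion: concatenate one clip's speech and video vectors."""
    return list(speech_vec) + list(video_vec)


def knn_predict(train_X, train_y, query, k=3):
    """Return the majority label among the k training vectors nearest to query.

    Euclidean distance is an assumption; the abstract does not name a metric.
    """
    dists = sorted(
        (math.dist(x, query), label) for x, label in zip(train_X, train_y)
    )
    votes = Counter(label for _, label in dists[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for per-clip features (hypothetical values and sizes).
# In the paper these would be PLPC/MFCC speech features and HCCR/VCCR
# video features; here they are just small made-up vectors.
speech_train = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]
video_train = [[1.0, 0.0, 0.1], [0.9, 0.1, 0.0], [0.0, 1.0, 0.9], [0.1, 0.9, 1.0]]
labels = ["Happy", "Happy", "Sad", "Sad"]

# Build the fused training set, then classify one fused query clip.
train_X = [fuse(s, v) for s, v in zip(speech_train, video_train)]
query = fuse([0.85, 0.15], [0.95, 0.05, 0.05])
predicted = knn_predict(train_X, labels, query, k=3)
print(predicted)
```

Because plain concatenation lets the modality with the larger numeric range dominate the distance, a real pipeline would likely scale each feature block before fusing; whether the authors did so is not stated in the abstract.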