Emotion Recognition on Facial Expression and Voice: Analysis and Discussion

Kok-Why Ng, Yixen Lim, Su-Cheng Haw, Yih-Jian Yoong
{"title":"Emotion Recognition on Facial Expression and Voice: Analysis and Discussion","authors":"Kok-Why Ng, Yixen Lim, Su-Cheng Haw, Yih-Jian Yoong","doi":"10.18517/ijaseit.13.5.19023","DOIUrl":null,"url":null,"abstract":"Emotion plays an important role in our daily lives. Emotional individuals can affect the performance of a company, the harmony of a family, the wellness or growth (physical, mental, and spiritual) of a child etc. It renders a wide range of impacts. The existing works on emotion detection from facial expressions differ from the voice. It is deduced that the facial expression is captured on the face externally, whereas the voice is captured from the air passes through the vocal folds internally. Both captured output models may very much deviate from each other. This paper studies and analyses a person's emotion through dual models -- facial expression and voice separately. The proposed algorithm uses a Convolutional Neural Network (CNN) with 2-dimensions convolutional layers for facial expression and 1-Dimension convolutional layers for voice. Feature extraction is done via face detection, and Mel-Spectrogram extraction is done via voice. The network layers are fine-tuned to achieve the higher performance of the CNN model. The trained CNN models can recognize emotions from the input videos, which may cover single or multiple emotions from the facial expression and voice perspective. The experimented videos are clean from the background music and environment noise and contain only a person's voice. The proposed algorithm achieved an accuracy of 62.9% through facial expression and 82.3% through voice.","PeriodicalId":14471,"journal":{"name":"International Journal on Advanced Science, Engineering and Information Technology","volume":"60 14","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2023-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"International Journal on Advanced Science, Engineering and Information Technology","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.18517/ijaseit.13.5.19023","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q3","JCRName":"Agricultural and Biological Sciences","Score":null,"Total":0}
引用次数: 0

Abstract

Emotion plays an important role in daily life. An individual's emotional state can affect the performance of a company, the harmony of a family, and the physical, mental, and spiritual wellness or growth of a child, among many other wide-ranging impacts. Existing work on detecting emotion from facial expressions differs from work on detecting it from the voice: facial expression is captured externally on the face, whereas the voice originates internally, from air passing through the vocal folds. The outputs captured from the two modalities may therefore deviate considerably from each other. This paper studies and analyses a person's emotion through two separate models, one for facial expression and one for voice. The proposed algorithm uses a Convolutional Neural Network (CNN) with 2-dimensional convolutional layers for facial expression and 1-dimensional convolutional layers for voice. Features are extracted via face detection for the visual model and via Mel-spectrogram extraction for the voice model. The network layers are fine-tuned to raise the performance of the CNN models. The trained CNN models recognize emotions from input videos, which may contain single or multiple emotions from the facial-expression and voice perspectives. The test videos are free of background music and environmental noise and contain only one person's voice. The proposed algorithm achieved an accuracy of 62.9% from facial expression and 82.3% from voice.
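As a rough illustration of the pipeline described in the abstract, the sketch below builds a 2-D CNN for grayscale face crops and a 1-D CNN for Mel-spectrogram features. It is a minimal sketch assuming a Keras/librosa implementation; the layer counts, filter sizes, input shapes, and seven-class emotion set are illustrative placeholders, not the architecture or preprocessing reported in the paper.

```python
# Illustrative sketch of the dual-model idea: a 2-D CNN over face crops and a
# 1-D CNN over Mel-spectrogram features. All hyperparameters are assumptions.
import numpy as np
from tensorflow.keras import layers, models

NUM_EMOTIONS = 7  # assumed label set (e.g. FER-style emotion classes)


def build_face_cnn(input_shape=(48, 48, 1)):
    """2-D CNN over grayscale face crops produced by a face detector."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv2D(32, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Flatten(),
        layers.Dense(128, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_EMOTIONS, activation="softmax"),
    ])


def build_voice_cnn(input_shape=(128, 1)):
    """1-D CNN over a per-clip Mel-spectrogram feature vector."""
    return models.Sequential([
        layers.Input(shape=input_shape),
        layers.Conv1D(64, 5, activation="relu"),
        layers.MaxPooling1D(),
        layers.Conv1D(128, 5, activation="relu"),
        layers.GlobalAveragePooling1D(),
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(NUM_EMOTIONS, activation="softmax"),
    ])


def mel_features(wav_path, sr=22050, n_mels=128):
    """Mean Mel-spectrogram (in dB) over time: one n_mels-dim vector per clip."""
    import librosa  # assumed audio dependency
    y, sr = librosa.load(wav_path, sr=sr)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels)
    mel_db = librosa.power_to_db(mel, ref=np.max)
    return mel_db.mean(axis=1)[..., np.newaxis]  # shape (n_mels, 1)


if __name__ == "__main__":
    face_model = build_face_cnn()
    voice_model = build_voice_cnn()
    for model in (face_model, voice_model):
        model.compile(optimizer="adam",
                      loss="categorical_crossentropy",
                      metrics=["accuracy"])
        model.summary()
```

In a full system, each video frame would first pass through a face detector to obtain the crops fed to the 2-D branch, while the audio track's Mel features would feed the 1-D branch; consistent with the paper, the two models are trained and evaluated separately rather than fused.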
Source Journal

International Journal on Advanced Science, Engineering and Information Technology (IJASEIT)
Field: Agricultural and Biological Sciences (all)
CiteScore: 1.40
Self-citation rate: 0.00%
Articles published: 272

About the journal: International Journal on Advanced Science, Engineering and Information Technology (IJASEIT) is an international peer-reviewed journal dedicated to the interchange of high-quality research results in all aspects of science, engineering, and information technology. The journal publishes state-of-the-art papers on fundamental theory, experiments, and simulation, as well as applications, with a systematically proposed method, a sufficient review of previous works, expanded discussion, and a concise conclusion. As part of its commitment to the advancement of science and technology, IJASEIT follows an open-access policy that makes published articles freely available online without any subscription. The journal's scope includes (but is not limited to) the following:
-Science: Bioscience & Biotechnology, Chemistry & Food Technology, Environmental, Health Science, Mathematics & Statistics, Applied Physics
-Engineering: Architecture, Chemical & Process, Civil & Structural, Electrical, Electronic & Systems, Geological & Mining Engineering, Mechanical & Materials
-Information Science & Technology: Artificial Intelligence, Computer Science, E-Learning & Multimedia, Information System, Internet & Mobile Computing