Multiple Models Using Temporal Feature Learning for Emotion Recognition
Hoang Manh Hung, Soohyung Kim, Hyung-Jeong Yang, Gueesang Lee
The 9th International Conference on Smart Media and Applications, 2020-09-17. DOI: 10.1145/3426020.3426122
Abstract
Emotion recognition has a broad variety of applications in the area of affective computing, such as education, robotics, and human-computer interaction. Because of this, emotion recognition has become a significant concern in computer vision in recent years and has prompted a great deal of effort on the part of researchers to address the complexities involved in this task. Many techniques have been studied for different problems in this area, including traditional machine learning techniques and deep learning approaches. The purpose of this paper is to combine models so as to benefit from different approaches to emotion recognition based on facial expression in images and videos. In the first stage, we use MTCNN to detect the faces of the subjects in the video, which are then encoded as feature representations by ResNet50. In the next stage, the features are learned through multiple models, namely LSTM, WaveNet, and SVM, and late fusion produces the final decision. Our method is evaluated on the MuSe-CaR dataset, and the experimental results are competitive with the baseline.
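The final step described above, late fusion of the LSTM, WaveNet, and SVM outputs, can be sketched as follows. The paper does not specify its exact fusion rule, so this minimal sketch assumes weighted averaging of per-class scores, a common form of late fusion; the model names and score values are purely illustrative.

```python
import numpy as np

def late_fusion(model_scores, weights=None):
    """Combine per-class scores from several models by (weighted)
    averaging, then pick the class with the highest fused score.
    This is one common late-fusion rule, assumed here for illustration."""
    scores = np.stack(model_scores)  # shape: (n_models, n_classes)
    if weights is None:
        # Default to an unweighted average over the models.
        weights = np.ones(len(model_scores)) / len(model_scores)
    fused = np.average(scores, axis=0, weights=weights)
    return int(np.argmax(fused)), fused

# Hypothetical per-class emotion scores from the three models.
lstm_scores    = np.array([0.2, 0.5, 0.3])
wavenet_scores = np.array([0.1, 0.6, 0.3])
svm_scores     = np.array([0.3, 0.4, 0.3])

label, fused = late_fusion([lstm_scores, wavenet_scores, svm_scores])
```

Because fusion happens on the models' output scores rather than their internal features, each model can be trained independently and swapped out without retraining the others.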