Chuanhe Liu, Wenqian Jiang, Minghao Wang, Tianhao Tang. "Group Level Audio-Video Emotion Recognition Using Hybrid Networks." Proceedings of the 2020 International Conference on Multimodal Interaction, 2020-10-21. DOI: 10.1145/3382507.3417968.
Group Level Audio-Video Emotion Recognition Using Hybrid Networks
This paper presents a hybrid network for audio-video group emotion recognition. The proposed architecture includes an audio stream, a facial emotion stream, an environmental object statistics stream (EOS), and a video stream. We applied this method in the 8th Emotion Recognition in the Wild Challenge (EmotiW 2020). According to the feedback on our submissions, our best result achieved 76.85% on the Video level Group AFfect (VGAF) Test Database, 26.89% higher than the baseline. Such improvements show that our method is state-of-the-art.
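The abstract describes a multi-stream architecture whose per-stream outputs are combined into a single group-level prediction. As a rough illustration only, the sketch below shows one common way such streams can be fused: weighted late fusion of per-stream class probabilities. The stream names, class layout, probabilities, and fusion weights here are all assumptions for demonstration, not details taken from the paper.

```python
import numpy as np

# Hypothetical per-stream class probabilities for one clip over three
# group-affect classes (negative, neutral, positive). These values and the
# class layout are illustrative assumptions, not results from the paper.
stream_probs = {
    "audio": np.array([0.2, 0.5, 0.3]),
    "face":  np.array([0.1, 0.3, 0.6]),
    "eos":   np.array([0.3, 0.4, 0.3]),  # environmental object statistics
    "video": np.array([0.1, 0.2, 0.7]),
}

# Illustrative fusion weights (assumed); in practice these would be tuned
# on a validation split.
weights = {"audio": 0.2, "face": 0.3, "eos": 0.1, "video": 0.4}

# Weighted sum of the stream distributions, renormalized to a distribution.
fused = sum(w * stream_probs[name] for name, w in weights.items())
fused /= fused.sum()
prediction = int(np.argmax(fused))
print(prediction)  # index of the fused class prediction
```

Late fusion of this kind keeps each stream's model independent, so individual streams can be retrained or reweighted without touching the others; whether the paper uses this exact scheme or a learned fusion layer is not stated in the abstract.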