Multi-rate Attention Based GRU Model for Engagement Prediction
Bin Zhu, Xinjie Lan, Xin Guo, K. Barner, C. Boncelet
Proceedings of the 2020 International Conference on Multimodal Interaction, published 2020-10-21
DOI: 10.1145/3382507.3417965 (https://doi.org/10.1145/3382507.3417965)
Citations: 20
Abstract
Engagement detection is essential in many areas, such as driver attention tracking, employee engagement monitoring, and student engagement evaluation. In this paper, we propose a novel approach using attention-based hybrid deep models for the engagement prediction in the wild category of the 8th Emotion Recognition in the Wild (EmotiW 2020) Grand Challenge. The task is to predict the engagement intensity of subjects in videos; the subjects are students watching educational videos from Massive Open Online Courses (MOOCs). To complete the task, we propose a hybrid deep model based on multi-rate and multi-instance attention. The novelty of the proposed model can be summarized in three aspects: (a) an attention-based Gated Recurrent Unit (GRU) deep network, (b) heuristic multi-rate processing of video-based data, and (c) a rigorous and accurate ensemble model. Experimental results on the validation and test sets show that our method makes promising improvements, achieving a competitively low MSE of 0.0541 on the test set and improving on the baseline results by 64%. The proposed model won first place in the engagement prediction in the wild challenge.
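The abstract names three ingredients (attention over recurrent states, multi-rate processing, and an ensemble) without giving details. The following is only a minimal sketch of how such pieces could fit together, not the authors' implementation. It makes several assumptions that are not in the abstract: "multi-rate" is read as keeping every r-th frame, attention is a dot-product softmax against a hypothetical `query` vector, the ensemble is a plain average over rates, and the GRU itself is omitted, so raw per-frame features stand in for the hidden states a GRU would produce. All function names and the rates `(1, 2, 4)` are illustrative.

```python
import math


def subsample(frames, rate):
    """Keep every `rate`-th frame (an assumed reading of 'multi-rate')."""
    return frames[::rate]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(states, query):
    """Softmax-weighted average of per-step state vectors.

    In an attention-based GRU, `states` would be the GRU hidden states;
    here they are raw feature vectors because the GRU is omitted.
    """
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in states]
    weights = softmax(scores)
    dim = len(states[0])
    return [sum(w * h[d] for w, h in zip(weights, states)) for d in range(dim)]


def predict(frames, query, rates=(1, 2, 4)):
    """Average a toy per-rate prediction over several sampling rates."""
    per_rate = []
    for r in rates:
        pooled = attention_pool(subsample(frames, r), query)
        # Placeholder regressor: mean of the pooled vector (hypothetical).
        per_rate.append(sum(pooled) / len(pooled))
    return sum(per_rate) / len(per_rate)


def mse(preds, targets):
    """Mean squared error, the metric reported in the abstract."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)
```

A call such as `predict(frames, query)` takes a list of equal-length feature vectors, one per frame, and returns a single scalar; a set of such predictions can then be compared with ground-truth engagement labels using `mse`.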