{"title":"Learning Visual Engagement for Trauma Recovery","authors":"Svati Dhamija, T. Boult","doi":"10.1109/WACVW.2018.00016","DOIUrl":null,"url":null,"abstract":"Applications ranging from human emotion understanding to e-health are exploring methods to effectively understand user behavior from self-reported questionnaires. However, little is understood about non-invasive techniques that involve face-based deep-learning models to predict engagement. Current research in visual engagement poses two key questions: 1) how much time do we need to analyze facial behavior for accurate engagement prediction? and 2) which deep learning approach provides the most accurate predictions? In this paper we compare RNN, GRU and LSTM using different length segments of AUs. Our experiments show no significant difference in prediction accuracy when using anywhere between 15 and 90 seconds of data. Moreover, the results reveal that simpler models of recurrent networks are statistically significantly better suited for capturing engagement from AUs.","PeriodicalId":301220,"journal":{"name":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","volume":"3 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-03-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE Winter Applications of Computer Vision Workshops (WACVW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/WACVW.2018.00016","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 0
Abstract
Applications ranging from human emotion understanding to e-health are exploring methods to understand user behavior effectively from self-reported questionnaires. However, little is known about non-invasive alternatives that use face-based deep-learning models to predict engagement. Current research in visual engagement poses two key questions: 1) how much time is needed to analyze facial behavior for accurate engagement prediction? and 2) which deep-learning approach provides the most accurate predictions? In this paper we compare RNN, GRU, and LSTM networks on segments of facial Action Units (AUs) of different lengths. Our experiments show no significant difference in prediction accuracy when using anywhere between 15 and 90 seconds of data. Moreover, the results reveal that simpler recurrent models are, with statistical significance, better suited for capturing engagement from AUs.
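To make the comparison concrete, below is a minimal sketch, not the authors' implementation, of the experimental setup the abstract describes: three recurrent architectures (vanilla RNN, GRU, LSTM) each reading a fixed-length segment of per-frame AU features and emitting an engagement prediction. PyTorch, the 17-dimensional AU vector, the 30 fps frame rate, and the 4-way engagement output are all illustrative assumptions, not details taken from the paper.

```python
# Hypothetical sketch of the RNN/GRU/LSTM comparison on AU segments.
# Feature size, hidden size, and class count are assumptions for illustration.
import torch
import torch.nn as nn

class EngagementPredictor(nn.Module):
    """Recurrent engagement predictor over a sequence of AU vectors."""
    def __init__(self, cell="gru", n_aus=17, hidden=64, n_classes=4):
        super().__init__()
        rnn_cls = {"rnn": nn.RNN, "gru": nn.GRU, "lstm": nn.LSTM}[cell]
        self.rnn = rnn_cls(input_size=n_aus, hidden_size=hidden,
                           batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, seq_len, n_aus) -- one AU intensity vector per frame.
        out, _ = self.rnn(x)
        # Predict from the hidden state at the final time step.
        return self.head(out[:, -1, :])

# Sweep segment length (15-90 s of video at an assumed 30 fps) and cell type,
# mirroring the two questions the paper asks.
fps = 30
for seconds in (15, 30, 60, 90):
    x = torch.randn(8, seconds * fps, 17)  # dummy batch of AU sequences
    for cell in ("rnn", "gru", "lstm"):
        model = EngagementPredictor(cell=cell)
        logits = model(x)
        print(cell, seconds, tuple(logits.shape))  # -> (8, 4) per config
```

In a real evaluation, each (cell, segment length) configuration would be trained on labeled AU sequences and the resulting accuracies compared with a statistical test; the loop above only shows that all three cell types share the same interface, which is what makes this kind of controlled comparison straightforward.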