Prediction of User Ratings of Oral Presentations using Label Relations
T. Yamasaki, Yusuke Fukushima, Ryosuke Furuta, Litian Sun, K. Aizawa, Danushka Bollegala
Proceedings of the 1st International Workshop on Affect & Sentiment in Multimedia, 30 October 2015. DOI: 10.1145/2813524.2813533
Citations: 18
Abstract
Predicting users' impressions of a video talk is an important step in recommendation tasks. We propose a method that accurately predicts multiple impression-related user ratings for a given video talk. Our proposal considers (a) multimodal features, including linguistic as well as acoustic features, (b) correlations between different user ratings (labels), and (c) correlations between different feature types. In particular, the proposed method models both label and feature correlations within a single Markov random field (MRF) and jointly optimizes the label assignment problem to obtain a consistent set of multiple labels for a given video. We train and evaluate the proposed method on a collection of 1,646 TED talk videos annotated with 14 different tags. Experimental results on this dataset show that the proposed method achieves a macro-average accuracy of 93.3%, a statistically significant improvement over several competitive baseline methods.
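To illustrate the kind of joint label assignment the abstract describes, the sketch below performs MAP inference over binary impression labels in a toy MRF: each label has a unary score (e.g. a per-label classifier's confidence) and pairwise potentials reward assigning correlated labels together. The label names, scores, and potentials are illustrative assumptions, not values from the paper, and exhaustive search stands in for whatever inference procedure the authors actually use.

```python
# Toy sketch of joint binary-label MAP inference in an MRF over labels.
# All label names, unary scores, and pairwise weights are hypothetical.
from itertools import product

labels = ["funny", "inspiring", "informative"]

# Unary scores: per-label confidence that the label applies (illustrative).
unary = {"funny": 0.8, "inspiring": 0.3, "informative": 0.6}

# Pairwise potentials: a positive weight rewards turning on both labels
# of a correlated pair; a negative weight penalizes the combination.
pairwise = {("inspiring", "informative"): 0.5,
            ("funny", "inspiring"): -0.2}

def score(assignment):
    """Joint score of a 0/1 label assignment (higher is better)."""
    total = sum(unary[l] if on else 1.0 - unary[l]
                for l, on in assignment.items())
    for (a, b), w in pairwise.items():
        if assignment[a] and assignment[b]:
            total += w
    return total

# Exhaustive MAP over the 2^k joint assignments (feasible for small k).
best = max((dict(zip(labels, bits))
            for bits in product([0, 1], repeat=len(labels))),
           key=score)
print(best)  # jointly most consistent label set under these potentials
```

Because the assignment is optimized jointly rather than label by label, a strong pairwise potential can flip a weak unary decision, which is the benefit of modeling label correlations in a single MRF.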