Rating Movies and Rating the Raters Who Rate Them.

IF 1.8 | Zone 4 (Mathematics) | JCR Q1, Statistics & Probability | American Statistician | Pub Date: 2009-11-01 | DOI: 10.1198/tast.2009.08278
Hua Zhou, Kenneth Lange
American Statistician, vol. 63, no. 4, pp. 297-307. Publication type: Journal Article.
Citations: 20

Abstract

The movie distribution company Netflix has generated considerable buzz in the statistics community by offering a million dollar prize for improvements to its movie rating system. Among the statisticians and computer scientists who have disclosed their techniques, the emphasis has been on machine learning approaches. This article has the modest goal of discussing a simple model for movie rating and other forms of democratic rating. Because the model involves a large number of parameters, it is nontrivial to carry out maximum likelihood estimation. Here we derive a straightforward EM algorithm from the perspective of the more general MM algorithm. The algorithm is capable of finding the global maximum on a likelihood landscape littered with inferior modes. We apply two variants of the model to a dataset from the MovieLens archive and compare their results. Our model identifies quirky raters, redefines the raw rankings, and permits imputation of missing ratings. The model is intended to stimulate discussion and development of better theory rather than to win the prize. It has the added benefit of introducing readers to some of the issues connected with analyzing high-dimensional data.
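
The abstract does not spell out the model, so the following is only a minimal sketch under stated assumptions, not the authors' specification. It assumes ratings on a 0..d scale, a binomial "consensus" distribution per movie (parameter `a[j]`), a binomial "quirky" distribution per rater (parameter `b[i]`), and a per-rater probability `pi[i]` of rating by consensus. All names (`em_fit`, `toy_ratings`, and so on) are hypothetical. The sketch shows the kind of EM iteration the abstract alludes to and records the log-likelihood, which EM never decreases.

```python
import math

def binom_pmf(x, d, p):
    """Probability of x successes in d trials with success probability p."""
    return math.comb(d, x) * p**x * (1.0 - p)**(d - x)

def clamp(p, eps=1e-6):
    """Keep a probability strictly inside (0, 1) so logs stay finite."""
    return min(max(p, eps), 1.0 - eps)

def log_likelihood(ratings, d, pi, a, b):
    """Observed-data log-likelihood of the assumed two-component mixture."""
    return sum(
        math.log(pi[i] * binom_pmf(x, d, a[j])
                 + (1.0 - pi[i]) * binom_pmf(x, d, b[i]))
        for (i, j), x in ratings.items()
    )

def em_fit(ratings, n_raters, n_movies, d, n_iter=50):
    """Run EM on {(rater, movie): rating}; missing pairs are simply absent."""
    pi = [0.5] * n_raters
    a = [0.5] * n_movies
    b = [0.5] * n_raters
    # Start the consensus parameters at the raw movie means, scaled to (0, 1).
    for j in range(n_movies):
        xs = [x for (_, jj), x in ratings.items() if jj == j]
        if xs:
            a[j] = clamp(sum(xs) / (d * len(xs)))
    history = [log_likelihood(ratings, d, pi, a, b)]
    for _ in range(n_iter):
        # E-step: posterior probability that each rating came from consensus.
        w = {}
        for (i, j), x in ratings.items():
            f = pi[i] * binom_pmf(x, d, a[j])
            g = (1.0 - pi[i]) * binom_pmf(x, d, b[i])
            w[(i, j)] = f / (f + g)
        # M-step for raters: mixing weight and quirky parameter.
        for i in range(n_raters):
            mine = [(w[k], x) for k, x in ratings.items() if k[0] == i]
            if not mine:
                continue
            pi[i] = sum(wt for wt, _ in mine) / len(mine)
            den = sum(1.0 - wt for wt, _ in mine)
            if den > 1e-12:
                b[i] = clamp(sum((1.0 - wt) * x for wt, x in mine) / (d * den))
        # M-step for movies: weighted mean of ratings attributed to consensus.
        for j in range(n_movies):
            theirs = [(w[k], x) for k, x in ratings.items() if k[1] == j]
            den = sum(wt for wt, _ in theirs)
            if den > 1e-12:
                a[j] = clamp(sum(wt * x for wt, x in theirs) / (d * den))
        history.append(log_likelihood(ratings, d, pi, a, b))
    return pi, a, b, history

# Made-up toy data: raters 0-2 roughly agree, rater 3 gives every movie the top score.
toy_ratings = {
    (0, 0): 4, (0, 1): 3, (0, 2): 1, (0, 3): 0, (0, 4): 2,
    (1, 0): 4, (1, 1): 3, (1, 2): 0, (1, 3): 1, (1, 4): 2,
    (2, 0): 3, (2, 1): 4, (2, 2): 1, (2, 3): 0, (2, 4): 2,
    (3, 0): 4, (3, 1): 4, (3, 2): 4, (3, 3): 4, (3, 4): 4,
}
pi, a, b, history = em_fit(toy_ratings, n_raters=4, n_movies=5, d=4, n_iter=30)
```

In a sketch like this, a low `pi[i]` flags a rater who rarely follows consensus, `a[j]` gives a consensus-adjusted movie score, and a missing rating could be imputed from the fitted mixture. Because the abstract warns of inferior modes, a single run from one starting point should not be read as the global maximum.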


Source journal: American Statistician (Mathematics: Statistics & Probability)
CiteScore: 3.50
Self-citation rate: 5.60%
Articles published: 64
Review time: >12 weeks
About the journal: Are you looking for general-interest articles about current national and international statistical problems and programs; interesting and fun articles of a general nature about statistics and its applications; or the teaching of statistics? Then you are looking for The American Statistician (TAS), published quarterly by the American Statistical Association. TAS contains timely articles organized into the following sections: Statistical Practice, General, Teacher's Corner, History Corner, Interdisciplinary, Statistical Computing and Graphics, Reviews of Books and Teaching Materials, and Letters to the Editor.