Analyzing Crowd Rankings

Proceedings of the 18th International Workshop on Web and Databases. Pub Date: 2015-05-31. DOI: 10.1145/2767109.2767110

Julia Stoyanovich, Marie Jacob, Xuemei Gong


Citations: 14

Abstract


Analyzing Crowd Rankings

Ranked data is ubiquitous in real-world applications. It arises naturally when users express preferences about products and services, when voters cast ballots in elections, and when funding proposals are evaluated on their merits or university departments on their reputation. This paper focuses on crowdsourcing and novel analysis of ranked data. We describe the design of a data collection task in which Amazon Mechanical Turk (MT) workers were asked to rank movies. We present results of data analysis, correlating our ranked dataset with IMDb, where movies are rated on a discrete scale rather than ranked. We develop an intuitive measure of worker quality appropriate for this task, where no gold standard answer exists. We propose a model of local structure in ranked datasets, reflecting that subsets of the workers agree in their ranking over subsets of the items; we develop a data mining algorithm that identifies such structure and evaluate it on our dataset. Our dataset is publicly available at https://github.com/stoyanovich/CrowdRank.
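As an illustration of what correlating a ranking with discrete ratings can involve, the sketch below computes Kendall's tau-b, a standard rank correlation that tolerates ties. The abstract does not say which correlation measure the paper uses, so the choice of tau-b is an assumption, and the movie names, ranks, and ratings are hypothetical values invented for the example, not data from CrowdRank or IMDb.

```python
from itertools import combinations
from math import sqrt


def kendall_tau_b(xs, ys):
    """Kendall's tau-b between two equal-length score lists (ties allowed)."""
    if len(xs) != len(ys):
        raise ValueError("score lists must have the same length")
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both lists: contributes to neither factor
        if dx == 0:
            tied_x_only += 1
        elif dy == 0:
            tied_y_only += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    untied = concordant + discordant
    denom = sqrt((untied + tied_x_only) * (untied + tied_y_only))
    return (concordant - discordant) / denom if denom else 0.0


# Hypothetical worker ranking (1 = most preferred) and hypothetical ratings.
worker_rank = {"movie_a": 1, "movie_b": 2, "movie_c": 3, "movie_d": 4}
rating = {"movie_a": 8, "movie_b": 8, "movie_c": 7, "movie_d": 9}

movies = sorted(worker_rank)
# Negate rank positions so that "higher is better" holds in both lists.
preference = [-worker_rank[m] for m in movies]
scores = [rating[m] for m in movies]
tau = kendall_tau_b(preference, scores)
print(f"tau-b = {tau:.4f}")
```

Tau-b is used here because a discrete rating scale produces many tied pairs, which a tie-free correlation would mishandle; a value near 1 means the worker's order agrees with the rating order, and a value near -1 means it is reversed.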

