Author: Christopher G. Harris
Venue: GamifIR '14 (published 2014-04-13)
DOI: https://doi.org/10.1145/2594776.2594780
The beauty contest revisited: measuring consensus rankings of relevance using a game
In this paper, we examine the Keynesian Beauty Contest, a well-known examination of rational agents used to explain the role of consensus predictions in decision making, such as price fluctuations in equity markets. Using a game, we study the crowd's ability to judge relevance for both images and textual documents. Besides asking participants to determine whether a document is relevant, we also ask them to rank all choices. One group of participants (N = 137) was asked to make judgments based on their own assessment, while another group (N = 137) was asked to make judgments based on their estimate of a consensus decision. In addition to measuring recall and precision, our game uses rank-biased overlap (RBO) to compare each participant's ranked list with the overall consensus decision. Results show that the group asked to rank based on their estimate of consensus had significantly higher recall when judging relevance for text documents, and significantly higher recall and precision when judging relevance for a set of images. We believe this has implications for the determination of consensus across multiple contexts.
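To make the RBO comparison mentioned in the abstract concrete, the following is a minimal sketch of a truncated rank-biased overlap between one participant's ranked list and a consensus list. The abstract does not say which RBO variant or persistence parameter the game used, so the function name `rbo_truncated`, the default `p=0.9`, and the example lists are assumptions for illustration only. The sketch evaluates the finite sum (1 - p) * sum over d of p^(d-1) * A_d, where A_d is the fraction of items shared by the two top-d prefixes, and it assumes neither list contains duplicates.

```python
def rbo_truncated(ranking_a, ranking_b, p=0.9):
    """Truncated rank-biased overlap of two ranked lists (no duplicates).

    Hypothetical helper: the variant and the value of p are assumptions,
    not taken from the paper. Only the shared depth of the two lists is
    evaluated, so identical lists of depth k score 1 - p**k, not 1.
    """
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")

    depth = min(len(ranking_a), len(ranking_b))
    seen_a, seen_b = set(), set()
    overlap = 0   # size of the intersection of the two top-d prefixes
    score = 0.0

    for d in range(1, depth + 1):
        a, b = ranking_a[d - 1], ranking_b[d - 1]
        if a == b:
            overlap += 1
        else:
            # Each new item counts once it has appeared in both prefixes.
            if a in seen_b:
                overlap += 1
            if b in seen_a:
                overlap += 1
        seen_a.add(a)
        seen_b.add(b)
        # Agreement at depth d, weighted geometrically by p.
        score += (p ** (d - 1)) * (overlap / d)

    return (1 - p) * score


# Illustrative data only: a participant's ranking versus a consensus ranking.
participant = ["doc3", "doc1", "doc2", "doc4"]
consensus = ["doc1", "doc3", "doc2", "doc4"]
similarity = rbo_truncated(participant, consensus, p=0.9)
```

A smaller `p` concentrates the weight on the top ranks, so disagreements near the head of the list are penalized more heavily than disagreements further down.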