A System-Level Analysis of Conference Peer Review

Yichi Zhang, Fang-Yi Yu, G. Schoenebeck, D. Kempe
{"title":"会议同行评议的系统级分析","authors":"Yichi Zhang, Fang-Yi Yu, G. Schoenebeck, D. Kempe","doi":"10.1145/3490486.3538235","DOIUrl":null,"url":null,"abstract":"We undertake a system-level analysis of the conference peer review process. The process involves three constituencies with different objectives: authors want their papers accepted at prestigious venues (and quickly), conferences want to present a program with many high-quality and few low-quality papers, and reviewers want to avoid being overburdened by reviews. These objectives are far from aligned; the key obstacle is that the evaluation of the merits of a submission (both by the authors and the reviewers) is inherently noisy. Over the years, conferences have experimented with numerous policies and innovations to navigate the tradeoffs. These experiments include setting various bars for acceptance, varying the number of reviews per submission, requiring prior reviews to be included with resubmissions, and others. The purpose of the present work is to investigate, both analytically and using agent-based simulations, how well various policies work, and more importantly, why they do or do not work. We model the conference-author interactions as a Stackelberg game in which a prestigious conference commits to a threshold acceptance policy which will be applied to the (noisy) reviews of each submitted paper; the authors best-respond by submitting or not submitting to the conference, the alternative being a \"sure accept\" (such as arXiv or a lightly refereed venue). Our findings include: observing that the conference should typically set a higher acceptance threshold than the actual desired quality, which we call the resubmission gap and quantify in terms of various parameters; observing that the reviewing load is heavily driven by resubmissions of borderline papers --- therefore, a judicious choice of acceptance threshold may lead to fewer reviews while incurring an acceptable loss in quality; observing that depending on the paper quality distribution, stricter reviewing may lead to higher or lower acceptance rates --- the former is the result of self selection by the authors. As a rule of thumb, a relatively small number of reviews per paper, coupled with a strict acceptance policy, tends to do well in trading off these two objectives; finding that a relatively small increase in review quality or in self assessment by the authors is much more effective for conference quality control (without a large increase in review burden) than increases in the quantity of reviews per paper.; showing that keeping track of past reviews of papers can help reduce the review burden without a decrease in conference quality. For robustness, we consider different models of paper quality and learn some of the parameters from real data.","PeriodicalId":209859,"journal":{"name":"Proceedings of the 23rd ACM Conference on Economics and Computation","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2022-07-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"A System-Level Analysis of Conference Peer Review\",\"authors\":\"Yichi Zhang, Fang-Yi Yu, G. Schoenebeck, D. Kempe\",\"doi\":\"10.1145/3490486.3538235\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We undertake a system-level analysis of the conference peer review process. 
The process involves three constituencies with different objectives: authors want their papers accepted at prestigious venues (and quickly), conferences want to present a program with many high-quality and few low-quality papers, and reviewers want to avoid being overburdened by reviews. These objectives are far from aligned; the key obstacle is that the evaluation of the merits of a submission (both by the authors and the reviewers) is inherently noisy. Over the years, conferences have experimented with numerous policies and innovations to navigate the tradeoffs. These experiments include setting various bars for acceptance, varying the number of reviews per submission, requiring prior reviews to be included with resubmissions, and others. The purpose of the present work is to investigate, both analytically and using agent-based simulations, how well various policies work, and more importantly, why they do or do not work. We model the conference-author interactions as a Stackelberg game in which a prestigious conference commits to a threshold acceptance policy which will be applied to the (noisy) reviews of each submitted paper; the authors best-respond by submitting or not submitting to the conference, the alternative being a \\\"sure accept\\\" (such as arXiv or a lightly refereed venue). Our findings include: observing that the conference should typically set a higher acceptance threshold than the actual desired quality, which we call the resubmission gap and quantify in terms of various parameters; observing that the reviewing load is heavily driven by resubmissions of borderline papers --- therefore, a judicious choice of acceptance threshold may lead to fewer reviews while incurring an acceptable loss in quality; observing that depending on the paper quality distribution, stricter reviewing may lead to higher or lower acceptance rates --- the former is the result of self selection by the authors. As a rule of thumb, a relatively small number of reviews per paper, coupled with a strict acceptance policy, tends to do well in trading off these two objectives; finding that a relatively small increase in review quality or in self assessment by the authors is much more effective for conference quality control (without a large increase in review burden) than increases in the quantity of reviews per paper.; showing that keeping track of past reviews of papers can help reduce the review burden without a decrease in conference quality. 
For robustness, we consider different models of paper quality and learn some of the parameters from real data.\",\"PeriodicalId\":209859,\"journal\":{\"name\":\"Proceedings of the 23rd ACM Conference on Economics and Computation\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2022-07-12\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Proceedings of the 23rd ACM Conference on Economics and Computation\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1145/3490486.3538235\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 23rd ACM Conference on Economics and Computation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3490486.3538235","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1

Abstract

We undertake a system-level analysis of the conference peer review process. The process involves three constituencies with different objectives: authors want their papers accepted at prestigious venues (and quickly), conferences want to present a program with many high-quality and few low-quality papers, and reviewers want to avoid being overburdened by reviews. These objectives are far from aligned; the key obstacle is that the evaluation of the merits of a submission (both by the authors and the reviewers) is inherently noisy. Over the years, conferences have experimented with numerous policies and innovations to navigate the tradeoffs. These experiments include setting various bars for acceptance, varying the number of reviews per submission, requiring prior reviews to be included with resubmissions, and others. The purpose of the present work is to investigate, both analytically and using agent-based simulations, how well various policies work, and more importantly, why they do or do not work. We model the conference-author interactions as a Stackelberg game in which a prestigious conference commits to a threshold acceptance policy that is applied to the (noisy) reviews of each submitted paper; the authors best-respond by submitting or not submitting to the conference, the alternative being a "sure accept" (such as arXiv or a lightly refereed venue). Our findings include:

- The conference should typically set a higher acceptance threshold than the actually desired quality; we call this difference the resubmission gap and quantify it in terms of various parameters.
- The reviewing load is heavily driven by resubmissions of borderline papers; a judicious choice of acceptance threshold may therefore lead to fewer reviews while incurring an acceptable loss in quality.
- Depending on the paper-quality distribution, stricter reviewing may lead to higher or lower acceptance rates; the former is the result of self-selection by the authors. As a rule of thumb, a relatively small number of reviews per paper, coupled with a strict acceptance policy, tends to trade off these two objectives well.
- A relatively small increase in review quality, or in the authors' self-assessment, is much more effective for conference quality control (without a large increase in review burden) than an increase in the number of reviews per paper.
- Keeping track of past reviews of papers can help reduce the review burden without a decrease in conference quality.

For robustness, we consider different models of paper quality and learn some of the parameters from real data.
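To make the model concrete, below is a minimal agent-based sketch in Python of the dynamics the abstract describes. It is an illustration under stated assumptions, not the paper's implementation: review scores are modeled as true quality plus Gaussian noise, so with k reviews and threshold τ a paper of quality q is accepted in a given round with probability 1 − Φ((τ − q)·√k/σ). The quality distribution, the utility values (V_CONF, V_SURE, COST), and all parameter numbers are hypothetical placeholders.

```python
# A minimal agent-based sketch of the threshold-acceptance model from the
# abstract. All concrete numbers and distributions below (Gaussian quality
# and review noise, the utility values) are illustrative assumptions, not
# the paper's calibrated model.
import random
from math import erf, sqrt

random.seed(0)

TAU = 1.0          # conference acceptance threshold on the mean review score
K = 3              # number of reviews per submission
SIGMA = 0.5        # standard deviation of per-review noise
Q_DESIRED = 0.8    # quality level the conference actually wants to admit
V_CONF = 1.0       # author's value for acceptance at the conference
V_SURE = 0.3       # value of the "sure accept" outside option (e.g. arXiv)
COST = 0.1         # author's cost per (re)submission round
MAX_ROUNDS = 5     # resubmission attempts before taking the outside option

def accept_prob(q: float) -> float:
    """P(mean of K noisy reviews >= TAU) when each review is q + N(0, SIGMA^2):
    equals 1 - Phi((TAU - q) * sqrt(K) / SIGMA)."""
    z = (TAU - q) * sqrt(K) / SIGMA
    return 0.5 * (1.0 - erf(z / sqrt(2.0)))

def author_submits(q: float) -> bool:
    """Author's best response: submit iff the expected value of trying the
    conference (with resubmissions) beats taking the sure accept right away."""
    p = accept_prob(q)
    ev, stay = 0.0, 1.0   # stay = probability of still being unaccepted
    for _ in range(MAX_ROUNDS):
        ev += stay * (-COST + p * V_CONF)
        stay *= 1.0 - p
    ev += stay * V_SURE   # give up and take the outside option
    return ev > V_SURE

def simulate(n_papers: int = 10000) -> None:
    reviews_used, accepted = 0, []
    for _ in range(n_papers):
        q = random.gauss(0.5, 0.5)        # assumed paper-quality distribution
        if not author_submits(q):
            continue                      # author goes straight to the sure accept
        for _ in range(MAX_ROUNDS):       # resubmission loop drives review load
            reviews_used += K
            scores = [q + random.gauss(0.0, SIGMA) for _ in range(K)]
            if sum(scores) / K >= TAU:
                accepted.append(q)
                break
    print(f"acceptance rate: {len(accepted) / n_papers:.1%}")
    print(f"reviews per paper written: {reviews_used / n_papers:.2f}")
    if accepted:
        low = sum(q < Q_DESIRED for q in accepted) / len(accepted)
        print(f"accepted papers below desired quality: {low:.1%}")

if __name__ == "__main__":
    simulate()
```

Varying TAU and K in this sketch reproduces the abstract's qualitative tradeoffs: setting TAU above Q_DESIRED illustrates the resubmission gap, and the resubmission loop makes visible how borderline papers (those with accept_prob near 1/2) inflate the total review load.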