Enhancing recall in automated record screening: A resampling algorithm

IF 6.1 · Zone 2 (Biology) · JCR Q1, Mathematical & Computational Biology · Research Synthesis Methods · Pub Date: 2024-01-07 · DOI: 10.1002/jrsm.1690

Zhipeng Hou, Elizabeth Tipton

Research Synthesis Methods, volume 15, issue 3, pages 372-383. Article page: https://onlinelibrary.wiley.com/doi/10.1002/jrsm.1690

Citations: 0



Enhancing recall in automated record screening: A resampling algorithm

Literature screening is the process of identifying all relevant records from a pool of candidate paper records in systematic review, meta-analysis, and other research synthesis tasks. This process is time-consuming, expensive, and prone to human error. Screening prioritization methods attempt to help reviewers identify most of the relevant records while screening only the high-priority portion of the candidate records. In previous studies, screening prioritization is often referred to as automatic literature screening or automatic literature identification. Numerous screening prioritization methods have been proposed in recent years. However, there is a lack of screening prioritization methods with reliable performance. Our objective is to develop a screening prioritization algorithm with reliable performance for practical use, for example, an algorithm that guarantees an 80% chance of identifying at least 80% of the relevant records. Building on a target-based method proposed by Cormack and Grossman, we propose a screening prioritization algorithm that uses sampling with replacement. The algorithm is a wrapper that can work with any current screening prioritization algorithm to guarantee its performance. We prove, using mathematics and probability theory, that the algorithm guarantees this performance. We also run numerical experiments to test the performance of our algorithm when applied in practice. The results of these experiments show that the algorithm achieves reliable performance under different circumstances. The proposed screening prioritization algorithm can be reliably used in real-world research synthesis tasks.
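The abstract does not spell out the algorithm, so the following is only an illustrative sketch of the general idea behind a target-based stopping rule with sampling with replacement, not the authors' method. It assumes that `k` relevant "target" records are found by drawing records uniformly at random with replacement from the pool, and that screening then follows an arbitrary ranking until every target has been seen. Under that assumption the probability of ending with recall below `r` is at most `r**k`, which is where the `targets_needed` formula comes from. All function names, the simulated pool, and the use of known labels in place of human screening decisions are hypothetical.

```python
import math
import random


def targets_needed(recall, confidence):
    """Smallest k with recall**k <= 1 - confidence (assumed stopping rule)."""
    return math.ceil(math.log(1.0 - confidence) / math.log(recall))


def draw_targets(is_relevant, k, rng):
    """Sample records with replacement until k relevant draws are collected.

    In practice each draw would be screened by a human; here the known
    labels stand in for those decisions.
    """
    if not any(is_relevant):
        raise ValueError("pool contains no relevant records")
    targets = []
    while len(targets) < k:
        i = rng.randrange(len(is_relevant))
        if is_relevant[i]:
            targets.append(i)  # repeats are allowed: sampling with replacement
    return targets


def screen_with_targets(ranking, is_relevant, recall, confidence, rng):
    """Wrap any ranking: screen in ranked order until all targets are found."""
    k = targets_needed(recall, confidence)
    remaining = set(draw_targets(is_relevant, k, rng))
    found = screened = 0
    for record in ranking:
        screened += 1
        found += is_relevant[record]
        remaining.discard(record)
        if not remaining:
            break
    return found / sum(is_relevant), screened


def simulate(n_records=1000, n_relevant=50, trials=1000, seed=0):
    """Fraction of trials reaching 80% recall with an uninformative ranking."""
    rng = random.Random(seed)
    is_relevant = [i < n_relevant for i in range(n_records)]
    hits = 0
    for _ in range(trials):
        ranking = list(range(n_records))
        rng.shuffle(ranking)  # stand-in for the output of any prioritization model
        achieved, _ = screen_with_targets(ranking, is_relevant, 0.8, 0.8, rng)
        hits += achieved >= 0.8
    return hits / trials


if __name__ == "__main__":
    print("targets for 80% recall at 80% confidence:", targets_needed(0.8, 0.8))
    print("empirical success rate:", simulate())
```

Because the bound `r**k` does not depend on how good the ranking is, the sketch uses a random ranking as a deliberately uninformative case; a better ranking would be expected to reduce the number of records screened rather than change the guarantee.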


Source journal

Research Synthesis Methods (Mathematical & Computational Biology; Multidisciplinary Sciences)

CiteScore: 16.90

Self-citation rate: 3.10%

Articles published: (not listed)

About the journal: Research Synthesis Methods is a reputable, peer-reviewed journal that focuses on the development and dissemination of methods for conducting systematic research synthesis. Our aim is to advance the knowledge and application of research synthesis methods across various disciplines. Our journal provides a platform for the exchange of ideas and knowledge related to designing, conducting, analyzing, interpreting, reporting, and applying research synthesis. While research synthesis is commonly practiced in the health and social sciences, our journal also welcomes contributions from other fields to enrich the methodologies employed in research synthesis across scientific disciplines. By bridging different disciplines, we aim to foster collaboration and cross-fertilization of ideas, ultimately enhancing the quality and effectiveness of research synthesis methods. Whether you are a researcher, practitioner, or stakeholder involved in research synthesis, our journal strives to offer valuable insights and practical guidance for your work.