
Research Synthesis Methods: Latest Articles

Machine learning for identifying randomised controlled trials when conducting systematic reviews: Development and evaluation of its impact on practice.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-03-01 | Epub Date: 2025-03-21 | DOI: 10.1017/rsm.2025.3
Xuan Qin, Minghong Yao, Xiaochao Luo, Jiali Liu, Yu Ma, Yanmei Liu, Hao Li, Ke Deng, Kang Zou, Ling Li, Xin Sun

Machine learning (ML) models have been developed to identify randomised controlled trials (RCTs) to accelerate systematic reviews (SRs). However, their use has been limited due to concerns about their performance and practical benefits. We developed a high-recall ensemble learning model using Cochrane RCT data to enhance the identification of RCTs for rapid title and abstract screening in SRs and evaluated the model externally with our annotated RCT datasets. Additionally, we assessed the practical impact in terms of labour time savings and recall improvement under two scenarios: ML-assisted double screening (where ML and one reviewer screened all citations in parallel) and ML-assisted stepwise screening (where ML flagged all potential RCTs, and at least two reviewers subsequently filtered the flagged citations). Our model achieved twice the precision compared to the existing SVM model while maintaining a recall of 0.99 in both internal and external tests. In a practical evaluation with ML-assisted double screening, our model led to significant labour time savings (average 45.4%) and improved recall (average 0.998 compared to 0.919 for a single reviewer). In ML-assisted stepwise screening, the model performed similarly to standard manual screening but with average labour time savings of 74.4%. In conclusion, compared with existing methods, the proposed model can reduce workload while maintaining comparable recall when identifying RCTs during the title and abstract screening stages, thereby accelerating SRs. We propose practical recommendations to effectively apply ML-assisted manual screening when conducting SRs, depending on reviewer availability (ML-assisted double screening) or time constraints (ML-assisted stepwise screening).
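
The recall gain reported for ML-assisted double screening can be illustrated with a minimal sketch. It assumes that a citation is retained when either the model or the single reviewer flags it; the abstract does not state the exact combination rule, so the union rule, the function names, and the toy citation IDs below are all hypothetical.

```python
def recall(flagged, relevant):
    """Fraction of truly relevant citations that were flagged."""
    relevant = set(relevant)
    if not relevant:
        raise ValueError("relevant set is empty")
    return len(relevant & set(flagged)) / len(relevant)


def double_screen(ml_flags, reviewer_flags):
    """Assumed rule: keep a citation if the ML model OR the reviewer flags it."""
    return set(ml_flags) | set(reviewer_flags)


# Hypothetical toy data: citations 1..10, of which 1..5 are true RCTs.
true_rcts = {1, 2, 3, 4, 5}
reviewer = {1, 2, 3, 4}          # the reviewer misses citation 5
ml_model = {2, 3, 4, 5, 8, 9}    # the model misses citation 1, adds 2 false positives

combined = double_screen(ml_model, reviewer)
print(recall(reviewer, true_rcts))  # 0.8
print(recall(combined, true_rcts))  # 1.0
```

Under this assumed rule, combined recall can never fall below that of the reviewer alone; the cost is the extra false positives (here citations 8 and 9) passed on to the next stage.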

Research Synthesis Methods, vol. 16, no. 2, pp. 350-363. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527483/pdf/
Citations: 0
Effect modification and non-collapsibility together may lead to conflicting treatment decisions: A review of marginal and conditional estimands and recommendations for decision-making.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-03-01 | Epub Date: 2025-03-10 | DOI: 10.1017/rsm.2025.2
David M Phillippo, Antonio Remiro-Azócar, Anna Heath, Gianluca Baio, Sofia Dias, A E Ades, Nicky J Welton

Effect modification occurs when a covariate alters the relative effectiveness of treatment compared to control. It is widely understood that, when effect modification is present, treatment recommendations may vary by population and by subgroups within the population. Population-adjustment methods are increasingly used to adjust for differences in effect modifiers between study populations and to produce population-adjusted estimates in a relevant target population for decision-making. It is also widely understood that marginal and conditional estimands for non-collapsible effect measures, such as odds ratios or hazard ratios, do not in general coincide even without effect modification. However, the consequences of both non-collapsibility and effect modification together are little discussed in the literature. In this article, we set out the definitions of conditional and marginal estimands, illustrate their properties when effect modification is present, and discuss the implications for decision-making. In particular, we show that effect modification can result in conflicting treatment rankings between conditional and marginal estimates. This is because conditional and marginal estimands correspond to different decision questions that are no longer aligned when effect modification is present. For time-to-event outcomes, the presence of covariates implies that marginal hazard ratios are time-varying, and effect modification can cause marginal hazard curves to cross. We conclude with practical recommendations for decision-making in the presence of effect modification, based on pragmatic comparisons of both conditional and marginal estimates in the decision target population. Currently, multilevel network meta-regression is the only population-adjustment method capable of producing both conditional and marginal estimates, in any decision target population.
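
Non-collapsibility of the odds ratio can be seen in a small numerical sketch. The two strata, their control risks, the equal weights, and the conditional odds ratio of 4 are hypothetical values chosen for illustration, not figures from the article. The conditional odds ratio is identical in both strata (no effect modification), yet the marginal odds ratio differs.

```python
def odds(p):
    """Convert a probability to odds."""
    return p / (1.0 - p)


def risk_from_odds(o):
    """Convert odds back to a probability."""
    return o / (1.0 + o)


# Hypothetical setup: two equally sized strata sharing one conditional odds ratio.
conditional_or = 4.0
control_risk = {"low": 0.2, "high": 0.8}
weights = {"low": 0.5, "high": 0.5}

# Treated risk in each stratum implied by the common conditional odds ratio.
treated_risk = {
    s: risk_from_odds(conditional_or * odds(p)) for s, p in control_risk.items()
}

# Marginal risks average over the covariate distribution.
marg_control = sum(weights[s] * control_risk[s] for s in weights)
marg_treated = sum(weights[s] * treated_risk[s] for s in weights)

marginal_or = odds(marg_treated) / odds(marg_control)
print(round(marginal_or, 2))  # 2.58, although the conditional OR is 4 in both strata
```

The marginal value depends on the covariate distribution through `weights`, which is one reason the target population must be stated when a marginal estimate is reported.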

Research Synthesis Methods, vol. 16, no. 2, pp. 323-349. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527544/pdf/
Citations: 0
CausalMetaR: An R package for performing causally interpretable meta-analyses - ERRATUM.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-03-01 | DOI: 10.1017/rsm.2025.22
Guanbo Wang, Sean McGrath, Yi Lian
Research Synthesis Methods, vol. 16, no. 2, p. 441. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527490/pdf/
Citations: 0
Meta-analytic rain cloud plots: Improving evidence communication through data visualization design principles.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-03-01 | Epub Date: 2025-03-10 | DOI: 10.1017/rsm.2025.4
Kaitlyn G Fitzgerald, David Khella, Avery Charles, Elizabeth Tipton

Results of meta-analyses are of interest not only to researchers but often to policy-makers and other decision-makers (e.g., in education and medicine), and visualizations play an important role in communicating data and statistical evidence to the broader public. Therefore, the potential audience of meta-analytic visualizations is broad. However, the most common meta-analytic visualization - the forest plot - uses non-optimal design principles that do not align with data visualization best practices and relies on statistical knowledge and conventions not likely to be familiar to a broad audience. Previously, the Meta-Analytic Rain Cloud (MARC) plot has been shown to be an effective alternative to a forest plot when communicating the results of a small meta-analysis to education practitioners. However, the original MARC plot design was not well-suited for meta-analyses with large numbers of effect sizes as is common across the social sciences. This paper presents an extension of the MARC plot, intended for effective communication of moderate to large meta-analyses (k = 10, 20, 50, 100 studies). We discuss the design principles of the MARC plot, grounded in the data visualization and cognitive science literature. We then present the methods and results of a randomized survey experiment to evaluate the revised MARC plot in comparison to the original MARC plot, the forest plot, and a bar plot. We find that the revised MARC plot is more effective for communicating moderate to large meta-analyses to non-research audiences, offering a 0.30, 0.34, and 1.07 standard deviation improvement in chart users' scores compared to the original MARC plot, forest plot, and bar plot, respectively.

Research Synthesis Methods, vol. 16, no. 2, pp. 364-382. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527513/pdf/
Citations: 0
Methods for information-sharing in network meta-analysis: Implications for inference and policy.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-03-01 | Epub Date: 2025-03-10 | DOI: 10.1017/rsm.2024.17
Georgios F Nikolaidis, Beth Woods, Stephen Palmer, Sylwia Bujkiewicz, Marta O Soares

Limited evidence on relative effectiveness is common in Health Technology Assessment (HTA), often due to sparse evidence on the population of interest or study-design constraints. When evidence directly relating to the policy decision is limited, the evidence base could be extended to incorporate indirectly related evidence. For instance, a sparse evidence base in children could borrow strength from evidence in adults to improve estimation and reduce uncertainty. In HTA, indirect evidence has typically been either disregarded ('splitting'; no information-sharing) or included without considering any differences ('lumping'; full information-sharing). However, sophisticated methods that impose moderate degrees of information-sharing have been proposed. We describe and implement multiple information-sharing methods in a case-study evaluating the effectiveness, cost-effectiveness and value of further research of intravenous immunoglobulin for severe sepsis and septic shock. We also provide metrics to determine the degree of information-sharing. Results indicate that method choice can have a significant impact. Across information-sharing models, odds ratio estimates ranged between 0.55 and 0.90 and incremental cost-effectiveness ratios between £16,000 and £52,000 per quality-adjusted life year gained. The need for a future trial also differed by information-sharing model. Heterogeneity in the indirect evidence should also be carefully considered, as it may significantly impact estimates. We conclude that when indirect evidence is relevant to an assessment of effectiveness, the full range of information-sharing methods should be considered. The final selection should be based on a deliberative process that considers not only the plausibility of the methods' assumptions but also the imposed degree of information-sharing.
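
The two extremes named in the abstract, 'splitting' and 'lumping', can be contrasted with a minimal sketch. It assumes a simple inverse-variance fixed-effect pooling of log odds ratios, which is a stand-in chosen for illustration and not the models used in the article; the study estimates and the helper name `pool_fixed_effect` are hypothetical.

```python
import math


def pool_fixed_effect(estimates):
    """Inverse-variance weighted mean of (log_odds_ratio, standard_error) pairs."""
    weights = [1.0 / se ** 2 for _, se in estimates]
    mean = sum(w * y for w, (y, _) in zip(weights, estimates)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return mean, se


# Hypothetical inputs: one sparse direct study, two indirectly related studies.
direct = [(-0.60, 0.40)]
indirect = [(-0.10, 0.20), (-0.20, 0.25)]

# Splitting: ignore indirect evidence entirely (no information-sharing).
split_mean, split_se = pool_fixed_effect(direct)

# Lumping: treat all evidence as exchangeable (full information-sharing).
lump_mean, lump_se = pool_fixed_effect(direct + indirect)

for name, m, s in [("split", split_mean, split_se), ("lump", lump_mean, lump_se)]:
    print(name, "OR =", round(math.exp(m), 2), "SE(log OR) =", round(s, 3))
```

In this toy setting lumping shrinks the standard error but pulls the estimate towards the indirect studies, which is the trade-off that intermediate information-sharing methods are designed to moderate.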

Research Synthesis Methods, vol. 16, no. 2, pp. 291-307. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527489/pdf/
Citations: 0
Key concepts and reporting recommendations for mapping reviews: A scoping review of 68 guidance and methodological studies.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-01-01 | Epub Date: 2025-04-01 | DOI: 10.1017/rsm.2024.9
Yanfei Li, Elizabeth Ghogomu, Xu Hui, E Fenfen, Fiona Campbell, Hanan Khalil, Xiuxia Li, Marie Gaarder, Promise M Nduku, Howard White, Liangying Hou, Nan Chen, Shenggang Xu, Ning Ma, Xiaoye Hu, Xian Liu, Vivian Welch, Kehu Yang

Mapping reviews (MRs) are crucial for identifying research gaps and enhancing evidence utilization. Despite their increasing use in health and social sciences, inconsistencies persist in both their conceptualization and reporting. This study aims to clarify the conceptual framework and gather reporting items from existing guidance and methodological studies. A comprehensive search was conducted across nine databases and 11 institutional websites, including documents up to January 2024. A total of 68 documents were included, addressing 24 MR terms and 55 definitions, with 39 documents discussing distinctions and overlaps among these terms. From the documents included, 28 reporting items were identified, covering all the steps of the process. Seven documents mentioned reporting on the title, four on the abstract, and 14 on the background. Ten methods-related items appeared in 56 documents, with the median number of documents supporting each item being 34 (interquartile range [IQR]: 27, 39). Four results-related items were mentioned in 18 documents (median: 14.5, IQR: 11.5, 16), and four discussion-related items appeared in 25 documents (median: 5.5, IQR: 3, 13). There was very little guidance about reporting conclusions, acknowledgments, author contributions, declarations of interest, and funding sources. This study proposes a draft 28-item reporting checklist for MRs and has identified terminologies and concepts used to describe MRs. These findings will first be used to inform a Delphi consensus process to develop reporting guidelines for MRs. Additionally, the checklist and definitions could be used to guide researchers in reporting high-quality MRs.

Research Synthesis Methods, vol. 16, no. 1, pp. 157-174. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12631146/pdf/
Citations: 0
Meta-analysis with Jeffreys priors: Empirical frequentist properties.
IF 6.1 | Zone 2 (Biology) | Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY | Pub Date: 2025-01-01 | Epub Date: 2025-03-12 | DOI: 10.1017/rsm.2024.2
Maya B Mathur

In small meta-analyses (e.g., up to 20 studies), the best-performing frequentist methods can yield very wide confidence intervals for the meta-analytic mean, as well as biased and imprecise estimates of the heterogeneity. We investigate the frequentist performance of alternative Bayesian methods that use the invariant Jeffreys prior. This prior has the usual Bayesian motivation, but also has a purely frequentist motivation: the resulting posterior modes correspond to the established Firth bias correction of the maximum likelihood estimator. We consider two forms of the Jeffreys prior for random-effects meta-analysis: the previously established "Jeffreys1" prior treats the heterogeneity as a nuisance parameter, whereas the "Jeffreys2" prior treats both the mean and the heterogeneity as estimands of interest. In a large simulation study, we assess the performance of both Jeffreys priors, considering different types of Bayesian estimates and intervals. We assess point and interval estimation for both the mean and the heterogeneity parameters, comparing to the best-performing frequentist methods. For small meta-analyses of binary outcomes, the Jeffreys2 prior may offer advantages over standard frequentist methods for point and interval estimation of the mean parameter. In these cases, Jeffreys2 can substantially improve efficiency while more often showing nominal frequentist coverage. However, for small meta-analyses of continuous outcomes, standard frequentist methods seem to remain the best choices. The best-performing method for estimating the heterogeneity varied according to the heterogeneity itself. Röver & Friede's R package bayesmeta implements both Jeffreys priors. We also generalize the Jeffreys2 prior to the case of meta-regression.

在小型荟萃分析(例如,多达20项研究)中,表现最好的频率方法可以为荟萃分析平均值产生非常宽的置信区间,以及对异质性的偏差和不精确估计。我们研究了使用不变杰弗里斯先验的替代贝叶斯方法的频率性能。该先验具有通常的贝叶斯动机,但也具有纯粹的频率动机:所得的后验模式对应于最大似然估计量的已建立的Firth偏差校正。我们考虑了两种形式的Jeffreys先验随机效应元分析:先前建立的“Jeffreys1”先验将异质性视为一个讨厌的参数,而“Jeffreys2”先验将均值和异质性都视为感兴趣的估计。在一个大型的模拟研究中,我们评估了杰弗里斯先验的性能,考虑了不同类型的贝叶斯估计和区间。我们评估了均值和异质性参数的点和区间估计,并与性能最好的频率方法进行了比较。对于二元结果的小型荟萃分析,Jeffreys2先验可能比平均参数的点和区间估计的标准频率方法提供优势。在这些情况下,Jeffreys2可以大大提高效率,同时更经常显示名义频率覆盖。然而,对于连续结果的小型荟萃分析,标准频率方法似乎仍然是最好的选择。估计异质性的最佳方法根据异质性本身而变化。Röver & Friede的R包bayesmeta实现了Jeffreys的两个先验。我们还在元回归之前推广了Jeffreys2。
Research Synthesis Methods, vol. 16, no. 1, pp. 87-122. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621536/pdf/
Citations: 0
Can machine learning help accelerate article screening for systematic reviews? Yes, when article separability in embedding space is high.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-10 DOI: 10.1017/rsm.2024.16
Farhan Ali, Amanda Swee-Ching Tan, Serena Jun-Wei Wang

Systematic reviews play important roles but manual efforts can be time-consuming given a growing literature. There is a need to use and evaluate automated strategies to accelerate systematic reviews. Here, we comprehensively tested machine learning (ML) models from classical and deep learning model families. We also assessed the performance of prompt engineering via few-shot learning of GPT-3.5 and GPT-4 large language models (LLMs). We further attempted to understand when ML models can help automate screening. These ML models were applied to actual datasets of systematic reviews in education. Results showed that the performance of classical and deep ML models varied widely across datasets, ranging from 1.2 to 75.6% of work saved at 95% recall. LLM prompt engineering produced similarly wide performance variation. We searched for various indicators of whether and how ML screening can help. We discovered that the separability of clusters of relevant versus irrelevant articles in high-dimensional embedding space can strongly predict whether ML screening can help (overall R = 0.81). This simple and generalizable heuristic applied well across datasets and different ML model families. In conclusion, ML screening performance varies tremendously, but researchers and software developers can consider using our cluster separability heuristic in various ways in an ML-assisted screening pipeline.
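
The "work saved at 95% recall" figure in this abstract can be illustrated with a minimal sketch. It assumes the commonly used "work saved over sampling" definition (the fraction of articles left unscreened when the target recall is reached, minus the fraction that random screening would leave); the abstract does not state the exact formula the authors used, and the function name `wss_at_recall` and the sample data are hypothetical.

```python
import math


def wss_at_recall(scores, labels, target_recall=0.95):
    """Work saved over sampling (WSS) at a target recall level.

    scores: model relevance scores, higher means more likely relevant.
    labels: ground truth, 1 for relevant and 0 for irrelevant.
    Assumed definition: (N - k) / N - (1 - target_recall), where k is the
    number of top-ranked articles screened when target recall is reached.
    """
    if not 0 < target_recall <= 1:
        raise ValueError("target_recall must be in (0, 1]")
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n = len(labels)
    n_relevant = sum(labels)
    if n_relevant == 0:
        raise ValueError("at least one relevant article is required")

    # Number of relevant articles that must be found to hit the target.
    needed = math.ceil(target_recall * n_relevant)

    # Screen articles in descending order of model score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    for k, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            break
    return (n - k) / n - (1 - target_recall)


# Hypothetical toy data: 10 articles, 2 relevant, ranked 1st and 3rd.
example_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
example_labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
example_wss = wss_at_recall(example_scores, example_labels)
```

In this toy case both relevant articles are found after screening three of ten, so the value is roughly 0.7 minus 0.05. A model that ranks relevant articles no better than chance would score near zero.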

Research Synthesis Methods, vol. 16, no. 1, pp. 194-210. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621506/pdf/
Citations: 0
A practical guide to evaluating sensitivity of literature search strings for systematic reviews using relative recall.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-07 DOI: 10.1017/rsm.2024.6
Malgorzata Lagisz, Yefeng Yang, Sarah Young, Shinichi Nakagawa

Systematic searches of published literature are a vital component of systematic reviews. When search strings are not "sensitive," they may miss many relevant studies, limiting, or even biasing, the range of evidence available for synthesis. Concerningly, conducting and reporting evaluations (validations) of the sensitivity of the used search strings is rare, according to our survey of published systematic reviews and protocols. Potential reasons may involve a lack of familiarity with, or the inaccessibility of, complex sensitivity evaluation approaches. We first clarify the main concepts and principles of search string evaluation. We then present a simple procedure for estimating the relative recall of a search string. It is based on a pre-defined set of "benchmark" publications. The relative recall, that is, the sensitivity of the search string, is the retrieval overlap between the evaluated search string and a search string that captures only the benchmark publications. If there is little overlap (i.e., low recall or sensitivity), the evaluated search string should be improved to ensure that most of the relevant literature can be captured. The presented benchmarking approach can be applied to one or more online databases or search platforms. It is illustrated by five accessible, hands-on tutorials for commonly used online literature sources. Overall, our work provides an assessment of the current state of search string evaluations in published systematic reviews and protocols. It also paves the way to improve evaluation and reporting practices to make evidence synthesis more transparent and robust.
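
The relative-recall calculation described in this abstract can be sketched as a set overlap between the records a search string retrieves and the benchmark set. This is only an illustration under stated assumptions: the function name `relative_recall` and the identifiers below are hypothetical and are not taken from the paper's tutorials, and matching records by lower-cased identifier is a simplification.

```python
def relative_recall(retrieved_ids, benchmark_ids):
    """Share of benchmark publications captured by the evaluated search string."""
    # Normalise identifiers so that case or stray whitespace does not hide a match.
    benchmark = {i.strip().lower() for i in benchmark_ids}
    if not benchmark:
        raise ValueError("benchmark set must not be empty")
    retrieved = {i.strip().lower() for i in retrieved_ids}
    return len(benchmark & retrieved) / len(benchmark)


# Hypothetical identifiers, for illustration only.
benchmark_set = ["10.1000/a1", "10.1000/a2", "10.1000/a3", "10.1000/a4"]
search_results = ["10.1000/A1", "10.1000/a3", "10.1000/zz", "10.1000/a4"]
example_recall = relative_recall(search_results, benchmark_set)
```

Here three of the four benchmark records are retrieved. A low value would suggest revising the search string before running the full search.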

Research Synthesis Methods, vol. 16, no. 1, pp. 1-14. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621535/pdf/
Citations: 0
Automated citation searching in systematic review production: A simulation study.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-07 DOI: 10.1017/rsm.2024.15
Darren Rajit, Lan Du, Helena Teede, Joanne Enticott

Bibliographic aggregators like OpenAlex and Semantic Scholar offer scope for automated citation searching within systematic review production, promising increased efficiency. This study aimed to evaluate the performance of automated citation searching compared to standard search strategies and examine factors that influence performance. Automated citation searching was simulated on 27 systematic reviews across the OpenAlex and Semantic Scholar databases, across three study areas (health, environmental management and social policy). Performance, measured by recall (proportion of relevant articles identified), precision (proportion of relevant articles identified from all articles identified), and F1-F3 scores (weighted average of recall and precision), was compared to the performance of search strategies originally employed by each systematic review. The associations between systematic review study area, number of included articles, number of seed articles, seed article type, study type inclusion criteria, API choice, and performance were analyzed. Automated citation searching outperformed the reference standard in terms of precision (p < 0.05) and F1 score (p < 0.05) but failed to outperform in terms of recall (p < 0.05) and F3 score (p < 0.05). Study area influenced the performance of automated citation searching, with performance being higher within the field of environmental management compared to social policy. Automated citation searching is best used as a supplementary search strategy in systematic review production where recall is more important than precision, due to inferior recall and F3 score. However, observed outperformance in terms of F1 score and precision suggests that automated citation searching could be helpful in contexts where precision is as important as recall.
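
To make the recall, precision, and F1-F3 measures in this abstract concrete, the sketch below assumes the standard F-beta definition, F_beta = (1 + beta^2) * P * R / (beta^2 * P + R), in which a larger beta weights recall more heavily. The abstract does not spell out the formula, so this is an assumption; the function name and the toy article sets are hypothetical.

```python
def precision_recall_fbeta(retrieved, relevant, beta=1.0):
    """Return (precision, recall, F-beta) for a retrieved set against a relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    true_positives = len(retrieved & relevant)
    precision = true_positives / len(retrieved) if retrieved else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    denominator = beta ** 2 * precision + recall
    # Define F-beta as 0 when both precision and recall are 0.
    fbeta = (1 + beta ** 2) * precision * recall / denominator if denominator else 0.0
    return precision, recall, fbeta


# Hypothetical toy sets: 4 articles retrieved, 6 relevant, 2 in common.
retrieved_articles = {1, 2, 3, 4}
relevant_articles = {3, 4, 5, 6, 7, 8}
p, r, f1 = precision_recall_fbeta(retrieved_articles, relevant_articles, beta=1)
_, _, f3 = precision_recall_fbeta(retrieved_articles, relevant_articles, beta=3)
```

In this toy case precision (0.5) exceeds recall (1/3), so the recall-weighted F3 is lower than F1. That is the same pattern the abstract reports: a method can do well on F1 while trailing on F3.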

Research Synthesis Methods, vol. 16, no. 1, pp. 211-227. Full text: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621532/pdf/
Citations: 0