
Research Synthesis Methods: Latest Articles

Methods for information-sharing in network meta-analysis: Implications for inference and policy.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-03-01 Epub Date: 2025-03-10 DOI: 10.1017/rsm.2024.17
Georgios F Nikolaidis, Beth Woods, Stephen Palmer, Sylwia Bujkiewicz, Marta O Soares

Limited evidence on relative effectiveness is common in Health Technology Assessment (HTA), often due to sparse evidence on the population of interest or study-design constraints. When evidence directly relating to the policy decision is limited, the evidence base could be extended to incorporate indirectly related evidence. For instance, a sparse evidence base in children could borrow strength from evidence in adults to improve estimation and reduce uncertainty. In HTA, indirect evidence has typically been either disregarded ('splitting'; no information-sharing) or included without considering any differences ('lumping'; full information-sharing). However, sophisticated methods that impose moderate degrees of information-sharing have been proposed. We describe and implement multiple information-sharing methods in a case-study evaluating the effectiveness, cost-effectiveness and value of further research of intravenous immunoglobulin for severe sepsis and septic shock. We also provide metrics to determine the degree of information-sharing. Results indicate that method choice can have a significant impact. Across information-sharing models, odds ratio estimates ranged between 0.55 and 0.90 and incremental cost-effectiveness ratios between £16,000 and £52,000 per quality-adjusted life year gained. The need for a future trial also differed by information-sharing model. Heterogeneity in the indirect evidence should also be carefully considered, as it may significantly impact estimates. We conclude that when indirect evidence is relevant to an assessment of effectiveness, the full range of information-sharing methods should be considered. The final selection should be based on a deliberative process that considers not only the plausibility of the methods' assumptions but also the imposed degree of information-sharing.
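
The spectrum from splitting to lumping can be illustrated with a deliberately simple sketch. It is not one of the models implemented in the paper: it assumes fixed-effect inverse-variance pooling of log odds ratios, with the precision of the indirect evidence multiplied by a discount `alpha` (a power-prior-style weight). The function name, `alpha`, and every number below are hypothetical and chosen only to show how the pooled estimate moves as the degree of sharing changes.

```python
import math

def pooled_log_or(direct, indirect, alpha):
    """Inverse-variance pooled log odds ratio with discounted indirect evidence.

    direct, indirect: (log_or, standard_error) tuples (hypothetical inputs).
    alpha: weight in [0, 1] on the indirect evidence;
           0 reproduces 'splitting', 1 reproduces 'lumping'.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    est_d, se_d = direct
    est_i, se_i = indirect
    w_d = 1.0 / se_d**2      # precision of the direct evidence
    w_i = alpha / se_i**2    # discounted precision of the indirect evidence
    estimate = (w_d * est_d + w_i * est_i) / (w_d + w_i)
    std_err = math.sqrt(1.0 / (w_d + w_i))
    return estimate, std_err

# Hypothetical numbers, for illustration only (not taken from the case study).
direct = (math.log(0.60), 0.40)    # sparse evidence in the target population
indirect = (math.log(0.90), 0.15)  # richer evidence in a related population

for alpha in (0.0, 0.25, 0.5, 1.0):
    est, se = pooled_log_or(direct, indirect, alpha)
    print(f"alpha={alpha:.2f}  OR={math.exp(est):.2f}  SE(log OR)={se:.3f}")
```

Intermediate values of `alpha` give estimates between the two extremes, which is the behaviour the abstract attributes to moderate information-sharing; the hierarchical and other models in the paper achieve this by different means.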

Research Synthesis Methods, vol. 16, no. 2, pp. 291-307. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12527489/pdf/
Citations: 0
Key concepts and reporting recommendations for mapping reviews: A scoping review of 68 guidance and methodological studies.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-04-01 DOI: 10.1017/rsm.2024.9
Yanfei Li, Elizabeth Ghogomu, Xu Hui, E Fenfen, Fiona Campbell, Hanan Khalil, Xiuxia Li, Marie Gaarder, Promise M Nduku, Howard White, Liangying Hou, Nan Chen, Shenggang Xu, Ning Ma, Xiaoye Hu, Xian Liu, Vivian Welch, Kehu Yang

Mapping reviews (MRs) are crucial for identifying research gaps and enhancing evidence utilization. Despite their increasing use in health and social sciences, inconsistencies persist in both their conceptualization and reporting. This study aims to clarify the conceptual framework and gather reporting items from existing guidance and methodological studies. A comprehensive search was conducted across nine databases and 11 institutional websites, including documents up to January 2024. A total of 68 documents were included, addressing 24 MR terms and 55 definitions, with 39 documents discussing distinctions and overlaps among these terms. From the documents included, 28 reporting items were identified, covering all the steps of the process. Seven documents mentioned reporting on the title, four on the abstract, and 14 on the background. Ten methods-related items appeared in 56 documents, with the median number of documents supporting each item being 34 (interquartile range [IQR]: 27, 39). Four results-related items were mentioned in 18 documents (median: 14.5, IQR: 11.5, 16), and four discussion-related items appeared in 25 documents (median: 5.5, IQR: 3, 13). There was very little guidance about reporting conclusions, acknowledgments, author contributions, declarations of interest, and funding sources. This study proposes a draft 28-item reporting checklist for MRs and has identified terminologies and concepts used to describe MRs. These findings will first be used to inform a Delphi consensus process to develop reporting guidelines for MRs. Additionally, the checklist and definitions could be used to guide researchers in reporting high-quality MRs.

Research Synthesis Methods, vol. 16, no. 1, pp. 157-174. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12631146/pdf/
Citations: 0
Meta-analysis with Jeffreys priors: Empirical frequentist properties.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-12 DOI: 10.1017/rsm.2024.2
Maya B Mathur

In small meta-analyses (e.g., up to 20 studies), the best-performing frequentist methods can yield very wide confidence intervals for the meta-analytic mean, as well as biased and imprecise estimates of the heterogeneity. We investigate the frequentist performance of alternative Bayesian methods that use the invariant Jeffreys prior. This prior has the usual Bayesian motivation, but also has a purely frequentist motivation: the resulting posterior modes correspond to the established Firth bias correction of the maximum likelihood estimator. We consider two forms of the Jeffreys prior for random-effects meta-analysis: the previously established "Jeffreys1" prior treats the heterogeneity as a nuisance parameter, whereas the "Jeffreys2" prior treats both the mean and the heterogeneity as estimands of interest. In a large simulation study, we assess the performance of both Jeffreys priors, considering different types of Bayesian estimates and intervals. We assess point and interval estimation for both the mean and the heterogeneity parameters, comparing to the best-performing frequentist methods. For small meta-analyses of binary outcomes, the Jeffreys2 prior may offer advantages over standard frequentist methods for point and interval estimation of the mean parameter. In these cases, Jeffreys2 can substantially improve efficiency while more often showing nominal frequentist coverage. However, for small meta-analyses of continuous outcomes, standard frequentist methods seem to remain the best choices. The best-performing method for estimating the heterogeneity varied according to the heterogeneity itself. Röver & Friede's R package bayesmeta implements both Jeffreys priors. We also generalize the Jeffreys2 prior to the case of meta-regression.
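
To make the two priors concrete, here is a minimal sketch assuming the standard normal-normal random-effects model, y_i ~ Normal(mu, s_i^2 + tau^2), with known within-study standard errors s_i. Under that assumption the Fisher information is diagonal in (mu, tau), and the unnormalized priors below follow from it. The abstract does not state these formulas; reading "Jeffreys1" as the square root of the tau-tau information (paired with a flat prior on mu) and "Jeffreys2" as the square root of the full information determinant is an assumption, the function names are hypothetical, and this is not the bayesmeta implementation.

```python
import math

def fisher_information(tau, std_errs):
    """Diagonal Fisher information entries for (mu, tau).

    Assumes y_i ~ Normal(mu, s_i**2 + tau**2) with known s_i.
    """
    total_var = [s**2 + tau**2 for s in std_errs]
    info_mu = sum(1.0 / v for v in total_var)
    info_tau = 2.0 * tau**2 * sum(1.0 / v**2 for v in total_var)
    return info_mu, info_tau

def jeffreys1_unnormalized(tau, std_errs):
    """Prior on tau alone (assumed form): sqrt of the tau-tau information."""
    _, info_tau = fisher_information(tau, std_errs)
    return math.sqrt(info_tau)

def jeffreys2_unnormalized(tau, std_errs):
    """Joint prior on (mu, tau) (assumed form): sqrt of the information determinant."""
    info_mu, info_tau = fisher_information(tau, std_errs)
    return math.sqrt(info_mu * info_tau)

# Hypothetical standard errors for a small meta-analysis.
std_errs = [0.2, 0.35, 0.5]
for tau in (0.1, 0.3, 1.0):
    print(tau, jeffreys1_unnormalized(tau, std_errs), jeffreys2_unnormalized(tau, std_errs))
```

Because the mu-mu information does not depend on mu, the joint prior in this sketch is flat in mu for any fixed tau and differs from the tau-only prior by a tau-dependent factor.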

Research Synthesis Methods, vol. 16, no. 1, pp. 87-122. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621536/pdf/
Citations: 0
Can machine learning help accelerate article screening for systematic reviews? Yes, when article separability in embedding space is high.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-10 DOI: 10.1017/rsm.2024.16
Farhan Ali, Amanda Swee-Ching Tan, Serena Jun-Wei Wang

Systematic reviews play an important role, but the manual effort they require can be time-consuming given a growing literature. There is a need to use and evaluate automated strategies to accelerate systematic reviews. Here, we comprehensively tested machine learning (ML) models from classical and deep learning model families. We also assessed the performance of prompt engineering via few-shot learning of GPT-3.5 and GPT-4 large language models (LLMs). We further attempted to understand when ML models can help automate screening. These ML models were applied to actual datasets of systematic reviews in education. Results showed that the performance of classical and deep ML models varied widely across datasets, ranging from 1.2% to 75.6% of work saved at 95% recall. LLM prompt engineering produced similarly wide performance variation. We searched for various indicators of whether and how ML screening can help. We discovered that the separability of clusters of relevant versus irrelevant articles in high-dimensional embedding space can strongly predict whether ML screening can help (overall R = 0.81). This simple and generalizable heuristic applied well across datasets and different ML model families. In conclusion, ML screening performance varies tremendously, but researchers and software developers can consider using our cluster separability heuristic in various ways in an ML-assisted screening pipeline.
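
The "work saved at 95% recall" figure can be sketched as follows. This assumes the metric is the commonly used work saved over sampling (WSS) definition, which the abstract does not spell out: articles are screened in descending order of model score until the target share of relevant articles has been found, and the unscreened fraction is compared with what random-order screening would leave. The function name and the toy inputs are hypothetical.

```python
import math

def work_saved_at_recall(scores, labels, target_recall=0.95):
    """Work saved over sampling at a target recall level (assumed definition).

    scores: model relevance scores, higher means more likely relevant.
    labels: 1 for relevant, 0 for irrelevant (known only in evaluation).
    """
    n_total = len(labels)
    n_relevant = sum(labels)
    if n_relevant == 0:
        raise ValueError("need at least one relevant article")
    needed = math.ceil(target_recall * n_relevant)
    # Screen articles in descending score order until enough are found.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    found = 0
    for screened, (_, label) in enumerate(ranked, start=1):
        found += label
        if found >= needed:
            break
    # Fraction left unscreened, minus the fraction random screening would leave.
    return (n_total - screened) / n_total - (1.0 - target_recall)

# Toy example: 10 articles, 2 relevant, ranked by a hypothetical model.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
print(work_saved_at_recall(scores, labels))
```

A value near zero means the ranking is no better than screening in random order, which is the low end of the range the abstract reports.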

Research Synthesis Methods, vol. 16, no. 1, pp. 194-210. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621506/pdf/
Citations: 0
A practical guide to evaluating sensitivity of literature search strings for systematic reviews using relative recall.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-07 DOI: 10.1017/rsm.2024.6
Malgorzata Lagisz, Yefeng Yang, Sarah Young, Shinichi Nakagawa

Systematic searches of published literature are a vital component of systematic reviews. When search strings are not "sensitive," they may miss many relevant studies, limiting, or even biasing, the range of evidence available for synthesis. Concerningly, conducting and reporting evaluations (validations) of the sensitivity of the used search strings is rare, according to our survey of published systematic reviews and protocols. Potential reasons may involve a lack of familiarity with, or the inaccessibility of, complex sensitivity evaluation approaches. We first clarify the main concepts and principles of search string evaluation. We then present a simple procedure for estimating the relative recall of a search string. It is based on a pre-defined set of "benchmark" publications. The relative recall, that is, the sensitivity of the search string, is the retrieval overlap between the evaluated search string and a search string that captures only the benchmark publications. If there is little overlap (i.e., low recall or sensitivity), the evaluated search string should be improved to ensure that most of the relevant literature can be captured. The presented benchmarking approach can be applied to one or more online databases or search platforms. It is illustrated by five accessible, hands-on tutorials for commonly used online literature sources. Overall, our work provides an assessment of the current state of search string evaluations in published systematic reviews and protocols. It also paves the way to improve evaluation and reporting practices to make evidence synthesis more transparent and robust.
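
The relative recall calculation described above reduces to a set overlap. The sketch below assumes each publication is represented by a normalized identifier such as a lowercase DOI; the function name and the identifiers are hypothetical placeholders, and the paper's tutorials work inside the search platforms themselves.

```python
def relative_recall(retrieved_ids, benchmark_ids):
    """Share of benchmark publications that the evaluated search string retrieved.

    Returns the relative recall and the sorted list of missed benchmark items,
    which indicates where the search string needs improvement.
    """
    benchmark = set(benchmark_ids)
    if not benchmark:
        raise ValueError("benchmark set must not be empty")
    found = benchmark & set(retrieved_ids)
    missed = benchmark - found
    return len(found) / len(benchmark), sorted(missed)

# Placeholder identifiers, for illustration only.
benchmark_ids = ["doi-a", "doi-b", "doi-c", "doi-d"]
retrieved_ids = ["doi-a", "doi-c", "doi-x", "doi-y", "doi-z"]

recall, missed = relative_recall(retrieved_ids, benchmark_ids)
print(f"relative recall = {recall:.2f}; missed: {missed}")
```

Returning the missed items alongside the ratio is a design choice made here for convenience: inspecting their titles and keywords is usually the quickest route to new search terms.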

Research Synthesis Methods, vol. 16, no. 1, pp. 1-14. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621535/pdf/
Citations: 0
Automated citation searching in systematic review production: A simulation study.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-07 DOI: 10.1017/rsm.2024.15
Darren Rajit, Lan Du, Helena Teede, Joanne Enticott

Bibliographic aggregators like OpenAlex and Semantic Scholar offer scope for automated citation searching within systematic review production, promising increased efficiency. This study aimed to evaluate the performance of automated citation searching compared to standard search strategies and examine factors that influence performance. Automated citation searching was simulated on 27 systematic reviews across the OpenAlex and Semantic Scholar databases, across three study areas (health, environmental management and social policy). Performance, measured by recall (proportion of relevant articles identified), precision (proportion of relevant articles identified from all articles identified), and F1-F3 scores (weighted average of recall and precision), was compared to the performance of search strategies originally employed by each systematic review. The associations between systematic review study area, number of included articles, number of seed articles, seed article type, study type inclusion criteria, API choice, and performance were analyzed. Automated citation searching outperformed the reference standard in terms of precision (p < 0.05) and F1 score (p < 0.05) but failed to outperform in terms of recall (p < 0.05) and F3 score (p < 0.05). Study area influenced the performance of automated citation searching, with performance being higher within the field of environmental management compared to social policy. Automated citation searching is best used as a supplementary search strategy in systematic review production where recall is more important than precision, due to inferior recall and F3 score. However, observed outperformance in terms of F1 score and precision suggests that automated citation searching could be helpful in contexts where precision is as important as recall.
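
The contrast between the F1 and F3 results is easier to see with the F-beta formula written out. The sketch below uses the standard definition, in which beta controls how much more recall counts than precision; the function names and the counts in the example are hypothetical and are not taken from the study.

```python
def precision_recall(n_relevant_found, n_found, n_relevant_total):
    """Precision and recall from raw counts."""
    precision = n_relevant_found / n_found if n_found else 0.0
    recall = n_relevant_found / n_relevant_total if n_relevant_total else 0.0
    return precision, recall

def f_beta(precision, recall, beta=1.0):
    """Standard F-beta score; beta > 1 weights recall more heavily."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta**2
    return (1 + b2) * precision * recall / (b2 * precision + recall)

# Hypothetical counts: a precise search that misses many relevant articles.
precision, recall = precision_recall(n_relevant_found=40, n_found=50, n_relevant_total=100)
for beta in (1, 2, 3):
    print(f"F{beta} = {f_beta(precision, recall, beta):.3f}")
```

With high precision and modest recall, the score falls as beta rises, which matches the pattern in the abstract: a method can lead on F1 while trailing on F3.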

Research Synthesis Methods, vol. 16, no. 1, pp. 211-227. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621532/pdf/
Citations: 0
Capturing causal claims: A fine-tuned text mining model for extracting causal sentences from social science papers.
IF 6.1 Zone 2 (Biology) Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-10 DOI: 10.1017/rsm.2024.13
Rasoul Norouzi, Bennett Kleinberg, Jeroen K Vermunt, Caspar J van Lissa

Understanding causality is crucial for social scientific research to develop strong theories and inform practice. However, explicit discussion of causality is often lacking in social science literature due to ambiguous causal language. This paper introduces a text mining model fine-tuned to extract causal sentences from full-text social science papers. A dataset of 529 causal and 529 non-causal sentences manually annotated from the Cooperation Databank (CoDa) was curated to train and evaluate the model. Several pre-trained language models (BERT, SciBERT, RoBERTa, LLAMA, and Mistral) were fine-tuned on this dataset and general-purpose causality datasets. Model performance was evaluated on held-out social science and general-purpose test sets. Results showed that fine-tuning transformer models on the social science dataset significantly improved causal sentence extraction, even with limited data, compared to the models fine-tuned only on the general-purpose data. Results indicate the importance of domain-specific fine-tuning and data for accurately capturing causal language in academic writing. This automated causal sentence extraction method enables comprehensive, large-scale analysis of causal claims across the social sciences. By systematically cataloging existing causal statements, this work lays the foundation for further research to uncover the mechanisms underlying social phenomena, inform theory development, and strengthen the methodological rigor of the field.

Research Synthesis Methods, vol. 16, no. 1, pp. 139-156. Publication date: 2025-01-01. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621500/pdf/
Citations: 0
Six ways to handle dependent effect sizes in meta-analytic structural equation modeling: Is there a gold standard?
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-03-13 DOI: 10.1017/rsm.2024.10
Zeynep Şiir Bilici, Wim Van den Noortgate, Suzanne Jak

Current meta-analytic structural equation modeling (MASEM) techniques cannot properly handle cases where a single study provides multiple effect sizes for the same relationship. Existing applications either treat these effect sizes as independent, randomly select one effect size among many, or create an average effect size. None of these approaches deals with the inherent dependency among effect sizes, and each leads either to biased estimates or to a loss of information and power. An alternative technique is to use univariate three-level modeling in the two-stage approach to model these dependencies. These different strategies for dealing with dependent effect sizes in the context of MASEM have not previously been compared in a simulation study. This study compares the performance of these strategies across different conditions, varying the number of studies, the number of dependent effect sizes within studies, the correlation between the dependent effect sizes, the magnitude of the path coefficient, and the between-studies variance. We examine the relative bias in parameter estimates and standard errors, the coverage proportions of confidence intervals, and the mean standard error and power as measures of efficiency. The results suggest that no single method performs well across all these criteria, pointing to the need for better methods.
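
The three simple strategies named in the abstract (treating dependent effect sizes as independent, selecting one at random, and averaging within studies) can be illustrated with a small sketch. The data, the function names, and the unweighted pooling are all hypothetical simplifications for illustration; an actual MASEM analysis pools correlation matrices with appropriate weights and is not reproduced here.

```python
import random
from statistics import mean

# Hypothetical data: study id -> correlations reported for the same relationship.
studies = {"A": [0.30, 0.40, 0.50], "B": [0.20], "C": [0.10, 0.30]}


def treat_as_independent(studies):
    """Ignore the nesting: every effect size counts as its own observation."""
    return [r for rs in studies.values() for r in rs]


def select_one(studies, seed=0):
    """Keep one randomly chosen effect size per study, discarding the rest."""
    rng = random.Random(seed)
    return [rng.choice(rs) for rs in studies.values()]


def average_within_study(studies):
    """Collapse each study to the mean of its effect sizes."""
    return [mean(rs) for rs in studies.values()]


# Unweighted pooled means under two of the strategies (illustration only).
pooled_independent = mean(treat_as_independent(studies))
pooled_averaged = mean(average_within_study(studies))
```

With these made-up numbers the pooled mean is about 0.30 when effect sizes are treated as independent and about 0.27 after within-study averaging, because study A contributes three observations in the first case and only one in the second. Random selection keeps one value per study but throws away the others, which is the loss of information the abstract refers to.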

Research Synthesis Methods, vol. 16, no. 1, pp. 60-86. Publication date: 2025-01-01. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12621510/pdf/
Citations: 0
Prioritizing qualitative meta-synthesis findings in a mixed methods systematic review study: A description of the method.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-04-01 DOI: 10.1017/rsm.2024.8
Robin Coatsworth-Puspoky, Wendy Duggleby, Sherry Dahlke, Kathleen F Hunter

Aim(s): To describe a sequential mixed methods review method that prioritized synthesized qualitative evidence from primary studies to explain the complexities of unplanned readmission experiences among older persons with multiple chronic conditions.

Background: Segregated mixed methods review studies frequently prioritize quantitative evidence synthesis to examine the effectiveness of interventions, using qualitative evidence to explain quantitative data. There is a lack of guidance about how to prioritize qualitative evidence.

Results: Five procedural steps were developed to prioritize qualitative evidence synthesis. In Step 1, research questions were developed. In Step 2, databases were searched, studies were mapped to their method (qualitative or quantitative) and appraised. In Step 3, meta-synthesis and applied thematic analysis were used to synthesize extracted qualitative evidence about the psychosocial processes and factors that influenced unplanned readmission. In Step 4, quantitative evidence was synthesized using vote counting to determine the factors influencing unplanned readmission. In Step 5, a matrix was used to compare, determine the agreement between the qualitative and quantitative evidence, juxtapose findings, and uphold validity. Factors were mapped to the model of psychosocial processes and analytic themes.

Conclusion: Prioritizing qualitative evidence synthesis in a mixed methods review study foregrounds the experiences, perspectives, and voices of participants who experienced the event, so that complex clinical problems can be understood from their point of view. Synthesizing and integrating evidence facilitates the construction of holistic new understandings of the phenomenon and expands mixed methods systematic review methods.

Implications: Prioritizing patients' perspectives is useful for developing new client-centered interventions, establishing best practices for future reviews, generating theories, and expanding research methods.
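
Step 4 of the method synthesizes quantitative evidence by vote counting. A minimal sketch of that idea is shown below; the factor names, the direction labels, and the tie rule are hypothetical choices for the example and are not taken from the review.

```python
from collections import Counter

# Hypothetical findings: one (factor, direction) pair per quantitative result.
# direction is one of "increases", "decreases", "no_effect".
findings = [
    ("factor_a", "increases"),
    ("factor_a", "increases"),
    ("factor_a", "no_effect"),
    ("factor_b", "increases"),
    ("factor_b", "decreases"),
]


def vote_count(findings):
    """Tally, for each factor, how many results point in each direction."""
    votes = {}
    for factor, direction in findings:
        votes.setdefault(factor, Counter())[direction] += 1
    return votes


def majority_direction(counter):
    """Return the most frequent direction, or 'inconclusive' on a tie."""
    ranked = counter.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "inconclusive"
    return ranked[0][0]


summary = {f: majority_direction(c) for f, c in vote_count(findings).items()}
```

The resulting `summary` maps each factor to its majority direction, which could then be placed next to the qualitative themes in a comparison matrix of the kind described in Step 5. Vote counting ignores effect magnitude and study size, so it only indicates the direction in which the evidence leans.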

Research Synthesis Methods, vol. 16, no. 1, pp. 123-138. Publication date: 2025-01-01. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12631148/pdf/
Citations: 0
Reducing the biases of the conventional meta-analysis of correlations.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-01-01 Epub Date: 2025-04-01 DOI: 10.1017/rsm.2024.5
T D Stanley, Hristos Doucouliagos, Tomas Havranek

Conventional meta-analyses (both fixed and random effects) of correlations are biased due to the mechanical relationship between the estimated correlation and its standard error. Simulations that are closely calibrated to match actual research conditions widely seen across correlational studies in psychology corroborate these biases and suggest two solutions: UWLS+3 and HS. UWLS+3 is a simple inverse-variance weighted average (the unrestricted weighted least squares) that adjusts the degrees of freedom and thereby reduces small-sample bias to a scientifically negligible level. UWLS+3 as well as the Hunter and Schmidt approach (HS) are less biased than conventional random-effects estimates of correlations and Fisher's z, whether or not there is publication selection bias. However, publication selection bias remains a ubiquitous source of bias and false-positive findings. Despite the relationship between the estimated correlation and its standard error in the absence of selective reporting, the precision-effect test/precision-effect estimate with standard error (PET-PEESE) nearly eradicates publication selection bias. Surprisingly, PET-PEESE keeps the rate of false positives (i.e., type I errors) within its nominal level under the typical conditions widely seen across psychological research, whether or not there is publication selection bias.
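
To make the comparison concrete, the sketch below pools a few made-up correlations in three standard ways: the Hunter and Schmidt sample-size-weighted mean, fixed-effect pooling on Fisher's z scale, and a conventional inverse-variance weighted mean of r. The data and function names are hypothetical, and the variance formula (1 - r^2)^2 / (n - 1) is the usual large-sample approximation, assumed here for illustration. UWLS+3 and PET-PEESE are not reproduced because the abstract does not give their formulas.

```python
import math

# Hypothetical (correlation, sample size) pairs.
data = [(0.10, 30), (0.35, 50), (0.25, 200)]


def hunter_schmidt_mean(data):
    """Sample-size-weighted mean correlation; weights do not depend on r."""
    return sum(n * r for r, n in data) / sum(n for _, n in data)


def fisher_z_mean(data):
    """Pool on Fisher's z scale (variance 1 / (n - 3)), then back-transform."""
    weights = [n - 3 for _, n in data]
    z_bar = sum(w * math.atanh(r) for w, (r, _) in zip(weights, data))
    return math.tanh(z_bar / sum(weights))


def inverse_variance_mean(data):
    """Inverse-variance weighted mean of r, var(r) ~ (1 - r^2)^2 / (n - 1)."""
    weights = [(n - 1) / (1 - r * r) ** 2 for r, n in data]
    return sum(w * r for w, (r, _) in zip(weights, data)) / sum(weights)


hs = hunter_schmidt_mean(data)
fz = fisher_z_mean(data)
iv = inverse_variance_mean(data)
```

In `inverse_variance_mean` the weight of each study grows with the size of its own estimated correlation, so two studies with the same n but different r are weighted unequally. That dependence of the weight on the estimate is one way to see the mechanical relationship between a correlation and its standard error that the abstract identifies as a source of bias, whereas the Hunter and Schmidt weights depend on n alone.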

Research Synthesis Methods, vol. 16, no. 1, pp. 42-59. Publication date: 2025-01-01. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12631149/pdf/
Citations: 0