
Research Synthesis Methods: Latest Articles

Lord's Paradox and two network meta-analysis models.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-09-18; DOI: 10.1017/rsm.2025.10036
Yu-Kang Tu, James S Hodges

The contrast-based model (CBM) is the most popular network meta-analysis (NMA) method, although alternative approaches, e.g., the baseline model (BM), have been proposed but seldom used. This article aims to illuminate the difference between the CBM and BM and explores when they produce different results. These models differ in key assumptions: The CBM assumes treatment contrasts are exchangeable across trials and models the reference (baseline) treatment's outcome levels as fixed effects, while the BM further assumes that the baseline treatment's outcome levels are exchangeable across trials and treats them as random effects. We show algebraically and graphically that the difference between the CBM and BM is analogous to the difference between the two analyses in a statistical conundrum called Lord's Paradox, in which the t-test and analysis of covariance (ANCOVA) yield conflicting conclusions about the group difference in weight gain. We show that this conflict arises because the t-test compares the observed weight change, whereas ANCOVA compares an adjusted weight change. In NMA, analogously, the CBM compares observed treatment contrasts, while the BM compares adjusted treatment contrasts. We demonstrate how the difference in modeling baseline effects can cause the CBM and BM to give different results. The analogy of Lord's Paradox provides insights into the different assumptions of the CBM and BM regarding the relationship between baseline effects and treatment contrasts. When these two models produce substantially different results, it may indicate a violation of the transitivity assumption. Therefore, we should be cautious in interpreting the results from either model.
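The conflict between the two analyses in Lord's Paradox can be reproduced with a small simulation. The sketch below is only an illustration of the classic paradox, not of the CBM or BM themselves, and every number in it (group means of 60 and 70, a within-group slope of 0.5, the standard deviations, the sample size) is an assumption made up for the example rather than taken from the article. It compares the observed mean weight change between groups (the quantity a t-test on change scores compares) with the ANCOVA-style difference adjusted for baseline weight.

```python
import random


def simulate_group(rng, n, mean_weight, rho=0.5, sd_baseline=5.0, sd_noise=4.0):
    """Return (baseline, follow_up) pairs for a group whose mean weight does not change."""
    pairs = []
    for _ in range(n):
        baseline = rng.gauss(mean_weight, sd_baseline)
        # Follow-up regresses toward the group's own mean, so the group mean stays put.
        follow_up = mean_weight + rho * (baseline - mean_weight) + rng.gauss(0.0, sd_noise)
        pairs.append((baseline, follow_up))
    return pairs


def mean(values):
    return sum(values) / len(values)


def gain_difference(group0, group1):
    """Difference in observed mean change (follow-up minus baseline) between groups."""
    gain0 = mean([y - x for x, y in group0])
    gain1 = mean([y - x for x, y in group1])
    return gain1 - gain0


def ancova_difference(group0, group1):
    """Group difference in follow-up weight adjusted for baseline (common-slope ANCOVA)."""
    sxy = sxx = 0.0
    for group in (group0, group1):
        xbar = mean([x for x, _ in group])
        ybar = mean([y for _, y in group])
        sxy += sum((x - xbar) * (y - ybar) for x, y in group)
        sxx += sum((x - xbar) ** 2 for x, _ in group)
    slope = sxy / sxx  # pooled within-group slope of follow-up on baseline
    dx = mean([x for x, _ in group1]) - mean([x for x, _ in group0])
    dy = mean([y for _, y in group1]) - mean([y for _, y in group0])
    return dy - slope * dx


rng = random.Random(0)
group_a = simulate_group(rng, 2000, 60.0)  # hypothetical lighter group
group_b = simulate_group(rng, 2000, 70.0)  # hypothetical heavier group

observed = gain_difference(group_a, group_b)
adjusted = ancova_difference(group_a, group_b)
print(f"difference in observed change: {observed:.2f}")
print(f"baseline-adjusted difference:  {adjusted:.2f}")
```

Under these assumed settings neither group changes on average, so the difference in observed change is close to zero, while the adjusted difference is close to (1 - 0.5) x 10 = 5. The two analyses answer different questions, which is the point of the analogy drawn in the abstract.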

Research Synthesis Methods, vol. 17, no. 1, pp. 111-122. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823209/pdf/
Citations: 0
Examining covariate-specific treatment effects in individual participant data meta-analysis: Framing aggregation bias in terms of trial-level confounding and funnel plots.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-10-23; DOI: 10.1017/rsm.2025.10043
Lianne K Siegel, Joseph S Koopmeiners, Jamie Hartmann-Boyce, Peter J Godolphin, Abdel G Babiker, Giota Touloumi, Kirk U Knowlton, Richard D Riley

To understand a treatment's potential impact at the individual level, it is crucial to explore whether the effect differs across patient subgroups and covariate values. Meta-analysis provides an important tool for detecting treatment-covariate interactions, as it can improve power compared to a single study. However, aggregation bias can occur when estimating individual-level treatment-covariate interactions in meta-analysis, due to trial-level confounding. This refers to when the association between the covariate and treatment effect across trials (at the aggregate level) differs from that observed within trials (at the individual level). It is, thus, recommended that heterogeneity in the treatment effect at the individual level should be disentangled from that at the trial level, ideally using an individual participant data (IPD) meta-analysis. Here, we explain this issue and provide new intuition about how trial-level confounding is impacted by differences in within-trial distributions of covariates and how this corresponds to asymmetry in subgroup-specific funnel plots in the case of categorical covariates. We then propose a sensitivity analysis to assess the robustness of interaction estimates to potential trial-level confounding. We illustrate these concepts using simulated and real data from an IPD meta-analysis of trials conducted on the TICO/ACTIV-3 platform, which assessed passive immunotherapy treatments for inpatients with COVID-19.
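A toy example may help separate the across-trial association from the within-trial one. In the sketch below, all trial values are hypothetical and there is no sampling error: each trial has identical treatment effects in both covariate subgroups (so the individual-level interaction is zero), yet the trial-level effect rises with the proportion of participants who have the covariate. It illustrates the concept of aggregation bias only and is not the sensitivity analysis proposed in the article.

```python
def ols_slope(xs, ys):
    """Slope of a simple least-squares regression of ys on xs."""
    xbar = sum(xs) / len(xs)
    ybar = sum(ys) / len(ys)
    sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
    sxx = sum((x - xbar) ** 2 for x in xs)
    return sxy / sxx


# Hypothetical trials: prop_z1 is the share of participants with the covariate.
# Within each trial the two subgroup effects are equal, so there is no
# individual-level treatment-covariate interaction.
trials = []
for prop_z1 in (0.1, 0.3, 0.5, 0.7, 0.9):
    trial_effect = 1.0 + 2.0 * prop_z1  # trial-level confounding: effect tracks covariate mix
    trials.append({"prop_z1": prop_z1, "effect_z0": trial_effect, "effect_z1": trial_effect})

# Within-trial (individual-level) interaction, averaged over trials.
within_interaction = sum(t["effect_z1"] - t["effect_z0"] for t in trials) / len(trials)

# Across-trial (aggregate-level) association: overall effect regressed on covariate share.
overall_effects = [
    t["prop_z1"] * t["effect_z1"] + (1 - t["prop_z1"]) * t["effect_z0"] for t in trials
]
across_slope = ols_slope([t["prop_z1"] for t in trials], overall_effects)

print(f"within-trial interaction: {within_interaction:.2f}")
print(f"across-trial slope:       {across_slope:.2f}")
```

An analysis that mixes the two sources of information would report a nonzero interaction here even though none exists at the individual level, which is why the abstract recommends disentangling them.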

Research Synthesis Methods, vol. 17, no. 1, pp. 194-209. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823212/pdf/
Citations: 0
Data analysis and presentation methods in umbrella reviews/overviews of reviews in health care: A cross-sectional study.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-10-14; DOI: 10.1017/rsm.2025.10040
Cindy Stern, Jiaoli Li, Jennifer Stone, Hanan Khalil, Kim Sears, Romy Menghao Jia, Patraporn Bhatarasakoon, Edoardo Aromataris, Ritin Fernandez

Umbrella reviews (URs) synthesize findings from multiple systematic reviews on a specific topic. Methodological approaches for analyzing and presenting UR results vary, and reviewers often adapt methods to align with research objectives. This study examined the characteristics of analysis and presentation methods used in healthcare-related URs. A systematic PubMed search identified URs published between 2023 and 2024. Inclusion criteria focused on healthcare URs using systematic reviews as the unit of analysis. A random sample of 100 eligible URs was included. A customized, piloted data extraction form was used to collect bibliographic, conduct, and reporting data independently. Descriptive analysis and narrative synthesis summarized findings. The most common terminology for eligible studies was "umbrella reviews" (65%) or "overviews" (30%). Question frameworks included PICO (43%) and PICOS (14%), with quantitative systematic reviews included in most URs (98%), and 68% including randomized controlled trials. The most frequent methodological guidance source was Cochrane (32%). Data analysis commonly used narrative synthesis and meta-analysis, with Stata, RevMan, and GRADEPro GDT employed for presentation. Information about study overlap and certainty assessment was rarely reported. Variation exists in how data are analyzed and presented in URs, with key elements often omitted. These findings highlight the need for clearer methodological guidance to enhance consistency and reporting in future URs.

Research Synthesis Methods, vol. 17, no. 1, pp. 210-224. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823197/pdf/
Citations: 0
Making sense of conducting a critical interpretive synthesis: A scoping review.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-10-08; DOI: 10.1017/rsm.2025.10041
Saritte Perlman, Eliana Ben-Sheleg, Moriah E Ellen

Critical interpretive synthesis was introduced in 2006 to address various shortcomings of systematic reviews such as their limitations in synthesizing heterogeneous data, integrating diverse study types, and generating theoretical insights. This review sought to outline the methodological process of conducting critical interpretive syntheses by identifying the methods currently in use, mapping the processes that have been used to date, and highlighting directions for further research. To achieve this, a scoping review of critical interpretive syntheses published between 2006 and 2023 was conducted. Initial searches identified 1628 publications and, after removal of duplicates and exclusions, 212 reviews were included in the study. Most reviews focused on health-related subjects. Authors chose to utilize the method due to its iterative, inductive, and recursive nature. Both question-based and topic-based reviews were conducted. Literature searches relied on electronic databases and reference chaining. Mapping to the original six-phase model showed most variability in use of sampling and quality assessment phases, which were each done in 50.7% of reviews. Data extraction utilized a data extraction table. Synthesis involved constant comparison, critique, and consolidation of themes into constructs, and a synthesizing argument. Refining critical interpretive synthesis methodology and its best practices is important for optimizing its utility and impact and ensuring findings are relevant and actionable for informing policy, practice, and future research.

Research Synthesis Methods, vol. 17, no. 1, pp. 30-41. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823206/pdf/
Citations: 0
Estimands and their implications for evidence synthesis for oncology: A simulation study of treatment switching in meta-analysis.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-10-16; DOI: 10.1017/rsm.2025.10039
Rebecca Kathleen Metcalfe, Antonio Remiro-Azócar, Quang Vuong, Anders Gorst-Rasmussen, Oliver Keene, Shomoita Alam, Jay J H Park

The ICH E9(R1) addendum provides guidelines on accounting for intercurrent events in clinical trials using the estimands framework. However, there has been limited attention to the estimands framework for meta-analysis. Using treatment switching, a well-known intercurrent event that occurs frequently in oncology, we conducted a simulation study to explore the bias introduced by pooling together estimates targeting different estimands in a meta-analysis of randomized clinical trials (RCTs) that allowed treatment switching. We simulated overall survival data of a collection of RCTs that allowed patients in the control group to switch to the intervention treatment after disease progression under fixed effects and random effects models. For each RCT, we calculated effect estimates for a treatment policy estimand that ignored treatment switching, and a hypothetical estimand that accounted for treatment switching either by fitting rank-preserving structural failure time models or by censoring switchers. Then, we performed random effects and fixed effects meta-analyses to pool together RCT effect estimates while varying the proportions of trials providing treatment policy and hypothetical effect estimates. We compared the results of meta-analyses that pooled different types of effect estimates with those that pooled only treatment policy or hypothetical estimates. We found that pooling estimates targeting different estimands results in pooled estimators that do not target any estimand of interest, and that pooling estimates of varying estimands can generate misleading results, even under a random effects model. Adopting the estimands framework for meta-analysis may improve alignment between meta-analytic results and the clinical research question of interest.
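The core arithmetic behind the pooling problem can be sketched with a plain inverse-variance (fixed-effect) average. The hazard ratios of 0.8 for the treatment policy estimand and 0.6 for the hypothetical estimand, the common variance, and the trial counts below are all invented for illustration; the article's own simulation (survival data, rank-preserving structural failure time models, random-effects pooling) is far richer than this.

```python
import math


def inverse_variance_pool(estimates, variances):
    """Fixed-effect pooled estimate: weights are the reciprocals of the variances."""
    weights = [1.0 / v for v in variances]
    return sum(w * y for w, y in zip(weights, estimates)) / sum(weights)


# Hypothetical true log hazard ratios for the two estimands.
log_hr_treatment_policy = math.log(0.8)  # ignores treatment switching
log_hr_hypothetical = math.log(0.6)      # as if no switching had occurred
variance = 0.04                          # assumed equal for every trial


def pooled_hr(n_policy, n_hypothetical):
    """Pooled hazard ratio when trials report a mix of the two estimands."""
    estimates = ([log_hr_treatment_policy] * n_policy
                 + [log_hr_hypothetical] * n_hypothetical)
    variances = [variance] * len(estimates)
    return math.exp(inverse_variance_pool(estimates, variances))


hr_all_policy = pooled_hr(6, 0)
hr_all_hypothetical = pooled_hr(0, 6)
hr_mixed = pooled_hr(3, 3)

print(f"all treatment policy: {hr_all_policy:.3f}")
print(f"all hypothetical:     {hr_all_hypothetical:.3f}")
print(f"mixed (3 + 3):        {hr_mixed:.3f}")
```

With equal weights, the mixed pool lands at the geometric mean of the two hazard ratios, a value that corresponds to neither estimand and that shifts as the mix of trials changes.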

Research Synthesis Methods, vol. 17, no. 1, pp. 170-193. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12824772/pdf/
Citations: 0
Developing an approach for assigning GRADE levels in a systematic overview of reviews of diagnostic test accuracy using general principles identified from current GRADE guidelines: A case study.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-10-13; DOI: 10.1017/rsm.2025.10047
Andrew Dullea, Lydia O'Sullivan, Kirsty K O'Brien, Patricia Harrington, Marie Carrigan, Susan Ahern, Maeve McGarry, Karen Cardwell, Michelle O'Neill, Kieran A Walsh, Barbara Clyne, Susan M Smith, Mairin Ryan

Existing guidelines on overviews of reviews and umbrella reviews recommend an assessment of the certainty of evidence, but provide limited guidance on 'how to' apply the Grading of Recommendations Assessment, Development, and Evaluation (GRADE) to such a complex evidence synthesis. We share our experience of developing a 'general principles' approach to applying GRADE to a complex overview of reviews. The approach was developed in an iterative and exploratory manner during the planning and conduct of an overview of reviews of a novel molecular imaging technique for the staging of prostate cancer, involving a formal review by a group of 11 methodologists/health services researchers. This approach was developed during the evidence synthesis process, piloted, and then applied to our ongoing overview of reviews. A 'general principles' approach of applying the domains of GRADE to an overview of reviews and arriving at an overall summary judgement for each outcome is presented. Our approach details additional factors to consider, including addressing both the primary study risk of bias as assessed by the included reviews and the risk of bias of the systematic reviews themselves, as well as the statistical heterogeneity observed in meta-analyses conducted within the included reviews. Our approach distilled key principles from the relevant GRADE guidelines and allowed us to apply GRADE to a complex body of evidence in a consistent and transparent way. The approach taken and the methods used to develop our approach may inform researchers working on overviews of reviews, umbrella reviews, or future methodological guidelines.

Research Synthesis Methods, vol. 17, no. 1, pp. 225-236. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823199/pdf/
Citations: 0
Exploring the methodological quality and risk of bias in 200 systematic reviews: A comparative study of ROBIS and AMSTAR-2 tools.
IF 6.1; Zone 2 (Biology); Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY; Pub Date: 2026-01-01; Epub Date: 2025-10-27; DOI: 10.1017/rsm.2025.10032
Carole Lunny, Nityanand Jain, Tina Nazari, Melodi Kosaner-Kließ, Lucas Santos, Ian Goodman, Alaa A M Osman, Stefano Berrone, Mohammad Najm Dadam, Connor T A Brenna, Heba Hussein, Gioia Dahdal, Diana Cespedes A, Nicola Ferri, Salmaan Kanji, Yuan Chi, Dawid Pieper, Beverly Shea, Amanda Parker, Dipika Neupane, Paul A Khan, Daniella Rangira, Kat Kolaski, Ben Ridley, Amina Berour, Kevin Sun, Radin Hamidi Rad, Zihui Ouyang, Emma K Reid, Iván Pérez-Neri, Sanabel O Barakat, Silvia Bargeri, Silvia Gianola, Greta Castellini, Sera Whitelaw, Adrienne Stevens, Shailesh B Kolekar, Kristy Wong, Paityn Major, Ebrahim Bagheri, Andrea C Tricco

AMSTAR-2 (A Measurement Tool to Assess Systematic Reviews, version 2) and ROBIS are tools used to assess the methodological quality and the risk of bias in a systematic review (SR). We applied AMSTAR-2 and ROBIS to a sample of 200 published SRs. We investigated the overlap in their methodological constructs, responses by item and overall, percentage agreement, direction of effect, and timing of assessments. AMSTAR-2 contains 16 items and ROBIS 24 items. Three items in AMSTAR-2 and nine in ROBIS did not overlap in construct. Of the 200 SRs, 73% were low or critically low quality using AMSTAR-2, and 81% had a high risk of bias using ROBIS. The median time to complete AMSTAR-2 and ROBIS was 51 and 64 minutes, respectively. When assessment times were calibrated to the number of items in each tool, each item took an average of 3.2 minutes for AMSTAR-2 compared with 2.7 minutes for ROBIS. Nine percent of SRs had opposing ratings (i.e., AMSTAR-2 was high quality while ROBIS was high risk). In both tools, three-quarters of items showed more than 70% agreement between raters after extensive training and piloting. AMSTAR-2 and ROBIS provide complementary rather than interchangeable assessments of systematic reviews. AMSTAR-2 may be preferable when efficiency is prioritized and methodological rigour is the focus, whereas ROBIS offers a deeper examination of potential biases and external validity. Given the widespread reliance on systematic reviews for policy and practice, selecting the appropriate appraisal tool remains crucial. Future research should explore strategies to integrate the strengths of both instruments while minimizing the burden on assessors.
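The per-item times quoted in the abstract are consistent with dividing each tool's median completion time by its number of items. The check below assumes that is how the calibration was done; the abstract does not state the exact calculation, and the function name is made up for the example.

```python
def minutes_per_item(total_minutes, n_items):
    """Completion time divided by the number of items in the tool."""
    return total_minutes / n_items


# Figures taken from the abstract: median minutes and item counts.
amstar2_per_item = minutes_per_item(51, 16)
robis_per_item = minutes_per_item(64, 24)

print(f"AMSTAR-2: {amstar2_per_item:.1f} minutes per item")
print(f"ROBIS:    {robis_per_item:.1f} minutes per item")
```

So although ROBIS takes longer overall, it is slightly quicker per item under this reading of the numbers.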

Exploring the methodological quality and risk of bias in 200 systematic reviews: A comparative study of ROBIS and AMSTAR-2 tools. Research Synthesis Methods 17(1): 63-92. DOI: 10.1017/rsm.2025.10032. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823211/pdf/
Citations: 0
Automating the data extraction process for systematic reviews using GPT-4o and o3.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2026-01-01 Epub Date: 2025-09-17 DOI: 10.1017/rsm.2025.10030
Yuki Kataoka, Tomohiro Takayama, Keisuke Yoshimura, Ryuhei So, Yasushi Tsujimoto, Yosuke Yamagishi, Shiro Takagi, Yuki Furukawa, Masatsugu Sakata, Đorđe Bašić, Andrea Cipriani, Pim Cuijpers, Eirini Karyotaki, Mathias Harrer, Stefan Leucht, Ava Homiar, Edoardo G Ostinelli, Clara Miguel, Alessandro Rodolico, Toshi A Furukawa

Large language models have shown promise for automating data extraction (DE) in systematic reviews (SRs), but most existing approaches require manual interaction. We developed an open-source system using GPT-4o to automatically extract data with no human intervention during the extraction process. We developed the system on a dataset of 290 randomized controlled trials (RCTs) from a published SR about cognitive behavioral therapy for insomnia. We evaluated the system on two other datasets: 5 RCTs from an updated search for the same review and 10 RCTs used in a separate published study that had also evaluated automated DE. We developed the best approach across all variables in the development dataset using GPT-4o. The performance in the updated-search dataset using o3 was 74.9% sensitivity, 76.7% specificity, 75.7% precision, 93.5% variable detection comprehensiveness, and 75.3% accuracy. In both datasets, accuracy was higher for string variables (e.g., country, study design, drug names, and outcome definitions) than for numeric variables. In the third external validation dataset, GPT-4o performed below the level reported in the previous study, with a mean accuracy of 84.4%. However, by adjusting our DE method, while maintaining the same prompting technique, we achieved a mean accuracy of 96.3%, which was comparable to the previous manual extraction study. Our system shows potential for assisting the DE of string variables alongside a human reviewer. However, it cannot yet replace humans for numeric DE. Further evaluation across diverse review contexts is needed to establish broader applicability.
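
For readers less familiar with the metrics listed above, the following sketch shows the standard confusion-matrix definitions of sensitivity, specificity, precision, and accuracy. It is only an illustration: the counts are hypothetical, the class name `Confusion` is invented here, and the abstract does not say how the authors mapped extracted values to true/false positives and negatives, so their exact definitions may differ.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts of extraction outcomes (hypothetical labelling scheme)."""
    tp: int  # value present in the paper and extracted correctly
    fp: int  # value reported by the model but wrong or absent in the paper
    tn: int  # value absent in the paper and correctly left empty
    fn: int  # value present in the paper but missed by the model

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def precision(self) -> float:
        return self.tp / (self.tp + self.fp)

    def accuracy(self) -> float:
        total = self.tp + self.fp + self.tn + self.fn
        return (self.tp + self.tn) / total


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
c = Confusion(tp=80, fp=20, tn=60, fn=40)
print(f"sensitivity={c.sensitivity():.3f}")
print(f"specificity={c.specificity():.3f}")
print(f"precision={c.precision():.3f}")
print(f"accuracy={c.accuracy():.3f}")
```

"Variable detection comprehensiveness" is not a standard confusion-matrix metric and is left out of the sketch, since the abstract does not define it.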

Automating the data extraction process for systematic reviews using GPT-4o and o3. Research Synthesis Methods 17(1): 42-62. DOI: 10.1017/rsm.2025.10030. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823200/pdf/
Citations: 0
What can we learn from 1,000 meta-analyses across 10 different disciplines?
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2026-01-01 Epub Date: 2025-10-02 DOI: 10.1017/rsm.2025.10035
Weilun Wu, Jianhua Duan, W Robert Reed, Elizabeth Tipton

This study analyzes 1,000 meta-analyses drawn from 10 disciplines (including medicine, psychology, education, biology, and economics) to document and compare methodological practices across fields. We find large differences in the size of meta-analyses, the number of effect sizes per study, and the types of effect sizes used. Disciplines also vary in their use of unpublished studies, the frequency and type of tests for publication bias, and whether they attempt to correct for it. Notably, many meta-analyses include multiple effect sizes from the same study, yet fail to account for statistical dependence in their analyses. We document the limited use of advanced methods, such as multilevel models and cluster-adjusted standard errors, that can accommodate dependent data structures. Correlations are frequently used as effect sizes in some disciplines, yet researchers often fail to address the methodological issues this introduces, including biased weighting and misleading tests for publication bias. We also find that meta-regression is underutilized, even when sample sizes are large enough to support it. This work serves as a resource for researchers conducting their first meta-analyses, as a benchmark for researchers designing simulation experiments, and as a reference for applied meta-analysts aiming to improve their methodological practices.

What can we learn from 1,000 meta-analyses across 10 different disciplines? Research Synthesis Methods 17(1): 123-156. DOI: 10.1017/rsm.2025.10035. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823205/pdf/
Citations: 0
Incorporating the possibility of cure into network meta-analyses: A case study from resected Stage III/IV melanoma.
IF 6.1 Zone 2 Biology Q1 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2026-01-01 Epub Date: 2025-10-15 DOI: 10.1017/rsm.2025.10038
Keith Chan, Sarah Goring, Kabirraaj Toor, Murat Kurt, Andriy Moshyk, Jeroen Jansen

In many areas of oncology, cancer drugs are now associated with long-term survivorship, and mixture cure models (MCMs) are increasingly being used for survival analysis. The objective of this article was to propose a methodology for conducting network meta-analysis (NMA) of MCMs. This method was illustrated through a case study evaluating recurrence-free survival (RFS) with adjuvant therapy for stage III/IV resected melanoma. For the case study, the MCM NMA was conducted by: (1) fitting MCMs to each trial included within the network of evidence; and (2) incorporating the parameters of the MCMs into a multivariate NMA. Outputs included relative effect estimates for the MCM NMA as well as absolute estimates of survival (RFS), modeled within the Bayesian multivariate NMA, by incorporating absolute baseline effects of the reference treatment. The case study was intended to illustrate the MCM NMA methodology and is not meant for clinical interpretation. The case study demonstrated the feasibility of conducting an MCM NMA and highlighted key issues and considerations when conducting such analyses, including plausibility of cure, maturity of data, process for model selection, and the presentation and interpretation of results. MCM NMA provides a method for comparing survival that acknowledges the benefit newer treatments may confer on a subset of patients, namely long-term survival, and that reflects this survival in extrapolation. In the future, this method may provide an additional metric to compare treatments that is of value to patients.
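
As background for the abstract above, a mixture cure model is conventionally written as S(t) = π + (1 − π)·S_u(t), where π is the cured fraction and S_u(t) is the survival function of uncured patients. The sketch below illustrates that general form only. The Weibull choice for S_u, the parameter values, and the function names are assumptions made for illustration; the abstract does not state which parametric forms the authors fitted.

```python
import math


def uncured_survival(t: float, scale: float, shape: float) -> float:
    """Weibull survival for uncured patients (an assumed parametric form)."""
    return math.exp(-((t / scale) ** shape))


def mixture_cure_survival(t: float, cure_fraction: float,
                          scale: float, shape: float) -> float:
    """S(t) = pi + (1 - pi) * S_u(t): cured patients never have the event."""
    return cure_fraction + (1.0 - cure_fraction) * uncured_survival(t, scale, shape)


# Hypothetical parameters: 40% cured, Weibull scale 2 years, shape 1.2.
for years in (0, 1, 5, 20):
    s = mixture_cure_survival(years, cure_fraction=0.4, scale=2.0, shape=1.2)
    print(f"t={years:>2} years  S(t)={s:.3f}")
```

The curve starts at 1 and flattens toward the cure fraction instead of falling to zero, which is the plateau that makes the "plausibility of cure" and "maturity of data" considerations in the abstract matter: without long follow-up, the plateau level is poorly identified.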

Incorporating the possibility of cure into network meta-analyses: A case study from resected Stage III/IV melanoma. Research Synthesis Methods 17(1): 157-169. DOI: 10.1017/rsm.2025.10038. PMC PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12823198/pdf/
Citations: 0