Meta-analytic-predictive (MAP) priors have been proposed as a generic approach to deriving informative prior distributions, where external empirical data are processed to learn about certain parameter distributions. The use of MAP priors is also closely related to shrinkage estimation (sometimes referred to as dynamic borrowing). A potentially odd situation arises when the external data consist of only a single study. Conceptually, this is not a problem; it only implies that certain prior assumptions gain in importance and need to be specified with particular care. We outline this important and not uncommon special case and demonstrate its implementation and interpretation based on the normal-normal hierarchical model. The approach is illustrated using example applications in clinical medicine.
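A minimal Monte Carlo sketch of a single-study MAP prior under the normal-normal hierarchical model. All numbers are illustrative assumptions, not taken from the paper: observed estimate y = 0.2 with standard error s = 0.1, an improper flat prior on the overall mean mu, and a half-normal(0.5) prior on the heterogeneity tau (the prior whose specification the abstract says becomes especially important with one study).

```python
import numpy as np

rng = np.random.default_rng(0)

def map_prior_samples(y, s, tau_scale=0.5, n=100_000):
    """Monte Carlo draws from a MAP prior based on a single study.

    Normal-normal hierarchical model with a flat prior on the overall mean mu
    and a half-normal(tau_scale) prior on the heterogeneity tau.  Given tau,
    mu | y ~ N(y, s^2 + tau^2), and the predicted effect in a new study is
    theta_new | mu, tau ~ N(mu, tau^2).
    """
    tau = np.abs(rng.normal(0.0, tau_scale, size=n))        # half-normal draws
    mu = y + np.sqrt(s**2 + tau**2) * rng.normal(size=n)    # mu | y, tau
    theta_new = mu + tau * rng.normal(size=n)               # predictive draw
    return theta_new

draws = map_prior_samples(y=0.2, s=0.1)
```

The spread of `draws` reflects both the single study's standard error and the assumed heterogeneity prior, making explicit how heavily the tau prior drives the resulting MAP prior.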
Meta-analytic-predictive priors based on a single study. Christian Röver, Tim Friede. Research Synthesis Methods, pp. 1-19. DOI: 10.1017/rsm.2026.10081. Published 2026-03-24.
Raphaël Bentegeac, Bastien Le Guellec, Victor Leblanc, Rémi Lenain, Luc Dauchet, Victoria Gauthier, Erwin Gerard, Emmanuel Chazard, Philippe Amouyel, Estelle Aymes, Aghilès Hamroun
The exponential growth of scientific literature poses increasing challenges for evidence synthesis. Systematic reviews (SRs) usually rely on keyword-based database searches, which are limited by inconsistent terminology and indexing delays. Citation searching (identifying studies that cite or are cited by known relevant articles) offers a complementary route to uncover additional evidence but remains poorly automated and poorly integrated into screening workflows. We developed BibliZap, an open-source, fully automated citation-searching tool built on Lens.org data, performing multi-level forward and backward citation searches with relevance-based ranking. Its performance was evaluated across 66 published SRs, comparing five approaches: (1) PubMed-only searches; (2) PubMed followed by BibliZap restricted to the top 500 ranked results; (3) PubMed followed by full BibliZap screening; and (4-5) two exploratory early-stop strategies in which BibliZap was initiated after identifying the first or the first three relevant PubMed records. The primary outcome was sensitivity, with secondary assessments of screening workload and precision. When used after PubMed screening, BibliZap increased mean sensitivity from 75% to 97%, achieving complete recall in over half of the reviews. Screening only the top 500 outputs still allowed over 90% of reviews to reach or exceed 80% recall. BibliZap recovered a median of three additional included articles per review not retrieved by PubMed, while adding a median of 6,450 additional records to screen. Citation searching via BibliZap enhances the completeness of evidence retrieval in SRs based on restricted database searches and supports transparent, scalable workflows adaptable to rapid and exploratory review contexts.
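Multi-level forward and backward citation searching can be sketched as a breadth-first expansion over a citation graph. The graph, paper identifiers, and two-level depth below are hypothetical toy data; this is not BibliZap's actual implementation or Lens.org's API.

```python
# Hypothetical citation graph: paper -> set of papers it cites.
CITES = {
    "seed": {"a", "b"},
    "a": {"c"},
    "b": set(),
    "c": set(),
    "d": {"seed"},   # d cites the seed article
    "e": {"a"},
}

def snowball(seeds, levels=2):
    """Multi-level backward (references) and forward (citing papers)
    citation search over a toy graph, in the spirit of snowballing tools."""
    # Invert the graph once to answer "who cites p?" queries.
    cited_by = {}
    for paper, refs in CITES.items():
        for r in refs:
            cited_by.setdefault(r, set()).add(paper)
    found, frontier = set(seeds), set(seeds)
    for _ in range(levels):
        nxt = set()
        for p in frontier:
            nxt |= CITES.get(p, set())      # backward: references of p
            nxt |= cited_by.get(p, set())   # forward: papers citing p
        frontier = nxt - found
        found |= nxt
    return found - set(seeds)

results = snowball({"seed"})
```

A relevance-based ranking, as in BibliZap, could then order `results` by how many links connect each candidate back to the seed set.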
BibliZap: An exploratory evaluation of an automated multi-level citation searching tool for systematic and rapid reviews. Research Synthesis Methods, pp. 1-14. DOI: 10.1017/rsm.2026.10079. Published 2026-03-24.
Tianqi Yu, Silvia Metelli, Theodoros Papakonstantinou, Anna Chaimani
Network meta-analysis (NMA) is a vital methodology for synthesizing evidence across multiple treatments and informing medical decision-making. However, effective visualization and interpretation of results from large networks of interventions remain challenging, particularly for non-specialists. NMAstudio 2.0 is an innovative, interactive web application designed to address these difficulties by streamlining NMA workflows and enhancing result visualization. Developed using Python and R, NMAstudio 2.0 seamlessly integrates with established NMA frameworks. Our exemplar application of NMAstudio 2.0 using a Cochrane Review comparing several treatments for chronic plaque psoriasis demonstrates its capacity to facilitate all crucial steps of an NMA. The application features an intuitive interface for uploading data, automating analyses, and generating interactive visualizations such as network diagrams, forest plots, and ranking plots, as well as unique outputs such as boxplots for transitivity checks and bidimensional forest plots. Most outputs are dynamically linked with the network diagram, enabling users to interactively explore evidence networks, apply advanced filtering, and highlight specific features by selecting nodes or edges within the diagram. While NMAstudio 2.0 aims to simplify NMAs, it also incorporates steps during the data upload process to mitigate the risk of producing poorly reported NMAs. NMAstudio 2.0 represents a significant step forward in improving the usability and accessibility of NMA, offering researchers a robust, versatile platform for evidence synthesis. Its integration of advanced features with an emphasis on user experience positions it as a valuable resource for enhancing decision-making and promoting evidence-based practice across diverse contexts.
NMAstudio 2.0: An interactive tool for network meta-analysis to enhance understanding, interpretation, and communication of the findings. Research Synthesis Methods, pp. 1-14. DOI: 10.1017/rsm.2026.10074. Published 2026-03-06.
Dazheng Zhang, Bingyu Zhang, Lu Li, Haitao Chu, Yong Chen
Meta-analysis synthesizes evidence from multiple randomized clinical trials and informs evidence-based practices across various medical domains. Recently, causally interpretable meta-analysis has been proposed and applied to treatment evaluations for target populations, requiring individual participant data (IPD). Standard meta-analysis assumes transportability or exchangeability of a (conditional) relative effect (such as a relative risk or odds ratio), which may be violated when the relative effects are correlated with the baseline risks across clinical trials. In addition, the weighted average of some study-specific effect measures, such as (log) odds ratios or (log) hazard ratios, is non-collapsible and does not correspond to any target population. Furthermore, when the randomization ratios between treated and untreated arms vary across trials, confounding bias may occur. To address these challenges, we propose a causal meta-analysis (CMA) framework using only aggregated data, enabling causally interpretable and accurate estimation for different target populations. The CMA adjusts its weights to target treatment effects in various populations, including the average treatment effect (ATE), the ATE on the treated (ATT), the ATE on the controls (ATC), and the ATE in the overlap population (ATO). Mathematically, we establish the connection between traditional meta-analysis estimators and CMAs. For example, Mantel-Haenszel weighted meta-analysis is equivalent to the CMA targeting the ATO.
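The claimed Mantel-Haenszel/ATO equivalence can be checked numerically for risk differences: with within-trial randomization probability e = n1/N, the overlap (ATO) weight e(1-e)N equals the Mantel-Haenszel weight n1*n0/N. A sketch with hypothetical 2x2 trial data (not from the paper):

```python
# Per-trial 2x2 data: (events_treated, n_treated, events_control, n_control),
# with deliberately unequal randomization ratios across trials.
trials = [(12, 100, 8, 50), (30, 200, 40, 400), (5, 80, 6, 60)]

def mh_risk_difference(data):
    """Mantel-Haenszel pooled risk difference (weight n1*n0/N per trial)."""
    num = den = 0.0
    for a, n1, c, n0 in data:
        w = n1 * n0 / (n1 + n0)
        num += w * (a / n1 - c / n0)
        den += w
    return num / den

def ato_risk_difference(data):
    """Overlap-weighted (ATO) average of trial-level risk differences.
    With randomization ratio e = n1/N, the ATO weight is e*(1-e)*N = n1*n0/N.
    """
    num = den = 0.0
    for a, n1, c, n0 in data:
        N = n1 + n0
        e = n1 / N
        w = e * (1 - e) * N
        num += w * (a / n1 - c / n0)
        den += w
    return num / den

# The two weighting schemes coincide term by term.
assert abs(mh_risk_difference(trials) - ato_risk_difference(trials)) < 1e-12
```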
A causal meta-analysis framework for clinical trials with unequal randomization ratios. Research Synthesis Methods, pp. 1-12. DOI: 10.1017/rsm.2025.10069. Published 2026-03-05.
Pub Date: 2026-03-01. Epub Date: 2025-11-17. DOI: 10.1017/rsm.2025.10056
Javier Bracchiglione, Nicolás Meza, Dawid Pieper, Carole Lunny, Manuel Vargas-Peirano, Johanna Vicuña, Fernando Briceño, Roberto Garnham Parra, Ignacio Pérez Carrasco, Gerard Urrútia, Xavier Bonfill, Eva Madrid
Overlap of primary studies among multiple systematic reviews (SRs) is a major challenge when conducting overviews. The corrected covered area (CCA) is a metric computed from a matrix of evidence that quantifies overlap. Therefore, the assumptions used to generate the matrix may significantly affect the CCA. We aim to explore how these varying assumptions influence CCA calculations. We searched two databases for intervention-focused overviews published during 2023. Two reviewers conducted study selection and data extraction. We extracted overview characteristics and methods to handle overlap. For seven sampled overviews, we calculated overall and pairwise CCA across 16 scenarios, representing four matrix-construction assumptions. Of 193 included overviews, only 23 (11.9%) adhered to an overview-specific reporting guideline (e.g. PRIOR). Eighty-five (44.0%) did not address overlap; 14 (7.3%) only mentioned it in the discussion; and 94 (48.7%) incorporated it into methods or results (38 using CCA). Among the seven sampled overviews, CCA values varied depending on matrix-construction assumptions, ranging from 1.2% to 13.5% with the overall method and 0.0% to 15.7% with the pairwise method. CCA values may vary depending on the assumptions made during matrix construction, including scope, treatment of structural missingness, and handling of publication threads. This variability calls into question the uncritical use of current CCA thresholds and underscores the need for overview authors to report both overall and pairwise CCA calculations. Our preliminary guidance for transparently reporting matrix-construction assumptions may improve the accuracy and reproducibility of CCA assessments.
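The CCA itself follows the standard formula CCA = (N - r) / (rc - r), where N is the total number of inclusions in the evidence matrix, r the number of index (primary) studies, and c the number of reviews. A minimal sketch with a hypothetical 4-study-by-3-review inclusion matrix:

```python
def corrected_covered_area(matrix):
    """Corrected covered area from a study-by-review inclusion matrix:
    CCA = (N - r) / (r*c - r), with N = total inclusions, r = rows (primary
    studies), c = columns (reviews)."""
    r = len(matrix)
    c = len(matrix[0])
    N = sum(sum(row) for row in matrix)
    return (N - r) / (r * c - r)

# Hypothetical matrix: 1 = primary study (row) included in review (column).
m = [[1, 1, 0],
     [1, 0, 0],
     [0, 1, 1],
     [1, 0, 1]]
cca = corrected_covered_area(m)
```

The matrix-construction assumptions the paper examines (scope, structural missingness, publication threads) all change the entries of `m`, and hence the CCA, before this formula is ever applied.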
Impact of matrix-construction assumptions on quantitative overlap assessment in overviews: A meta-research study. Research Synthesis Methods 17(2), pp. 348-364. DOI: 10.1017/rsm.2025.10056. Published 2026-03-01. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873615/pdf/
Pub Date: 2026-03-01. Epub Date: 2025-11-13. DOI: 10.1017/rsm.2025.10044
Antonio Sciurti, Giuseppe Migliara, Leonardo Maria Siena, Claudia Isonne, Maria Roberta De Blasiis, Alessandra Sinopoli, Jessica Iera, Carolina Marzuillo, Corrado De Vito, Paolo Villari, Valentina Baccolini
Systematic reviews play a critical role in evidence-based research but are labor-intensive, especially during title and abstract screening. Compact large language models (LLMs) offer the potential to automate this process, balancing time and cost requirements against accuracy. The aim of this study is to assess the feasibility, accuracy, and workload reduction of three compact LLMs (GPT-4o mini, Llama 3.1 8B, and Gemma 2 9B) in screening titles and abstracts. Records were sourced from three previously published systematic reviews, and the LLMs were asked to rate each record from 0 to 100 for inclusion using a structured prompt. Predefined rating thresholds of 25, 50, and 75 were used to compute performance metrics (balanced accuracy, sensitivity, specificity, positive and negative predictive value, and workload saving). Processing time and costs were recorded. Across the systematic reviews, the LLMs achieved high sensitivity (up to 100%) but low precision (below 10%) for records included at full text. Specificity and workload savings improved at higher thresholds, with the 50- and 75-rating thresholds offering the best trade-offs. GPT-4o mini, accessed via an application programming interface, was the fastest model (up to ~40 minutes) and incurred usage costs of $0.14-$1.93 per review. Llama 3.1 8B and Gemma 2 9B were run locally with longer processing times (up to ~4 hours) and were free to use. The LLMs were highly sensitive tools for the title/abstract screening process. High specificity values were reached, allowing for significant workload savings at reasonable cost and processing time. Conversely, we found them to be imprecise. However, high sensitivity and workload reduction are the key factors for their use in the title/abstract screening phase of systematic reviews.
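The threshold-based evaluation can be sketched as follows. The ratings, labels, and the workload-saving definition used here (fraction of records the screener never has to read by hand) are illustrative assumptions, not the study's exact operationalization:

```python
def screening_metrics(ratings, labels, threshold):
    """Confusion-matrix metrics for LLM title/abstract screening:
    a record is predicted 'include' if its 0-100 rating >= threshold."""
    tp = fp = tn = fn = 0
    for r, y in zip(ratings, labels):
        pred = r >= threshold
        if pred and y:
            tp += 1
        elif pred and not y:
            fp += 1
        elif not pred and not y:
            tn += 1
        else:
            fn += 1
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    workload_saving = (tn + fn) / len(labels)  # records excluded automatically
    return sensitivity, specificity, precision, workload_saving

# Hypothetical LLM ratings and human full-text inclusion labels (1 = included).
ratings = [90, 80, 60, 40, 30, 20, 10, 5]
labels = [1, 1, 0, 1, 0, 0, 0, 0]
sens, spec, prec, saved = screening_metrics(ratings, labels, threshold=50)
```

Raising the threshold trades sensitivity for specificity and workload saving, which is the trade-off the abstract describes at the 50- and 75-rating cutoffs.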
Compact large language models for title and abstract screening in systematic reviews: An assessment of feasibility, accuracy, and workload reduction. Research Synthesis Methods 17(2), pp. 332-347. DOI: 10.1017/rsm.2025.10044. Published 2026-03-01. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873614/pdf/
Pub Date: 2026-03-01. Epub Date: 2025-11-17. DOI: 10.1017/rsm.2025.10054
Delphine S Courvoisier, Diana Buitrago-Garcia, Clément P Buclin, Nils Bürgisser, Michele Iudici, Denis Mongin
Meta-research and evidence synthesis require considerable resources. Large language models (LLMs) have emerged as promising tools to assist in these processes, yet their performance varies across models, limiting their reliability. Taking advantage of the wide availability of small (<10 billion parameter) open-source LLMs, we implemented an agreement-based framework in which a decision is taken only if at least a given number of LLMs produce the same response; otherwise, the decision is withheld. This approach was tested on 1,020 abstracts of randomized controlled trials in rheumatology, using two classic literature review tasks: (1) classifying each intervention as drug or nondrug based on text interpretation and (2) extracting the total number of randomized patients, a task that sometimes required calculations. Re-examining abstracts where at least four LLMs disagreed with the human gold standard (dual review with adjudication) allowed us to construct an improved gold standard. Compared to the human gold standard and single large LLMs (>70 billion parameters), our framework demonstrated robust performance: several model combinations (e.g., 3 of 5, 4 of 6, or 5 of 7 models) achieved accuracies above 95%, exceeding the human gold standard, on at least 85% of abstracts. Performance variability across individual models was not an issue, as low-performing models contributed fewer accepted decisions. This agreement-based framework offers a scalable solution that can replace human reviewers for most abstracts, reserving human expertise for more complex cases. Such frameworks could significantly reduce the manual burden of systematic reviews while maintaining high accuracy and reproducibility.
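The agreement rule itself can be sketched in a few lines; the model responses below are hypothetical examples, not the study's data:

```python
from collections import Counter

def agreement_decision(responses, min_agree):
    """Accept a decision only if at least `min_agree` models give the same
    response; otherwise withhold it (return None) for human review."""
    answer, n = Counter(responses).most_common(1)[0]
    return answer if n >= min_agree else None

# E.g. a 3-of-5 rule on one intervention-classification task:
assert agreement_decision(["drug", "drug", "drug", "nondrug", "drug"], 3) == "drug"
# Without sufficient agreement the abstract is escalated to a human:
assert agreement_decision(["drug", "drug", "nondrug", "nondrug", "drug"], 4) is None
```

Under such a rule, a weak model rarely forms part of a winning majority, which matches the abstract's observation that low-performing models contribute fewer accepted decisions.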
Beyond human gold standards: A multimodel framework for automated abstract classification and information extraction. Research Synthesis Methods 17(2), pp. 365-377. DOI: 10.1017/rsm.2025.10054. Published 2026-03-01. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873610/pdf/
Pub Date: 2026-03-01. Epub Date: 2025-11-13. DOI: 10.1017/rsm.2025.10050
Juyoung Jung, Ariel M Aloe
Bayesian hierarchical models offer a principled framework for adjusting for study-level bias in meta-analysis, but their complexity and sensitivity to prior specifications necessitate a systematic framework for robust application. This study demonstrates the application of a Bayesian workflow to this challenge, comparing a standard random-effects model to a bias-adjustment model across a real-world dataset and a targeted simulation study. The workflow revealed a high sensitivity of results to the prior on bias probability, showing that while the simpler random-effects model had superior predictive accuracy as measured by the widely applicable information criterion, the bias-adjustment model successfully propagated uncertainty by producing wider, more conservative credible intervals. The simulation confirmed the model's ability to recover true parameters when priors were well-specified. These results establish the Bayesian workflow as a principled framework for diagnosing model sensitivities and ensuring the transparent application of complex bias-adjustment models in evidence synthesis.
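The uncertainty-propagation mechanism can be illustrated with a much simpler stand-in than the paper's hierarchical model: inflating each study's variance by a bias term necessarily widens the pooled interval. This is a schematic analogue with made-up numbers, not the authors' bias-adjustment model:

```python
import math

def pooled_mean_sd(estimates, ses, extra_var=0.0):
    """Inverse-variance pooled mean and its standard deviation, optionally
    inflating each study's variance by `extra_var` to propagate possible
    study-level bias into the pooled uncertainty."""
    weights = [1.0 / (se**2 + extra_var) for se in ses]
    mean = sum(w * y for w, y in zip(weights, estimates)) / sum(weights)
    sd = math.sqrt(1.0 / sum(weights))
    return mean, sd

# Hypothetical study estimates and standard errors.
y = [0.3, 0.1, 0.25]
s = [0.1, 0.15, 0.12]
m0, sd0 = pooled_mean_sd(y, s)                   # unadjusted pooling
m1, sd1 = pooled_mean_sd(y, s, extra_var=0.05)   # bias-adjusted: wider interval
```

The bias-adjusted `sd1` exceeds `sd0`, mirroring the paper's finding that the bias-adjustment model yields wider, more conservative credible intervals than the plain random-effects model.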
Bayesian workflow for bias-adjustment model in meta-analysis. Research Synthesis Methods 17(2), pp. 293-313. DOI: 10.1017/rsm.2025.10050. Published 2026-03-01. Open access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873618/pdf/
Pub Date: 2026-03-01. Epub Date: 2025-11-13. DOI: 10.1017/rsm.2025.10049
Michael Pearce, Shouhao Zhou
Ranking multiple interventions is a crucial task in network meta-analysis (NMA) to guide clinical and policy decisions. However, conventional ranking methods often oversimplify treatment distinctions, potentially yielding misleading conclusions due to inherent uncertainty in relative intervention effects. To address these limitations, we propose a novel Bayesian rank-clustering estimation approach, termed rank-clustering estimation (RaCE), specifically developed for NMA. Rather than identifying a single "best" intervention, RaCE enables the probabilistic clustering of interventions with similar effectiveness, offering a more nuanced and parsimonious interpretation. By decoupling the clustering procedure from the NMA modeling process, RaCE is a flexible and broadly applicable approach that can accommodate different types of outcomes (binary, continuous, and survival), modeling approaches (arm-based and contrast-based), and estimation frameworks (frequentist or Bayesian). Simulation studies demonstrate that RaCE effectively captures rank-clusters even under conditions of substantial uncertainty and overlapping intervention effects, providing more reasonable result interpretation than traditional single-ranking methods. We illustrate the practical utility of RaCE through an NMA application to frontline immunochemotherapies for follicular lymphoma, revealing clinically relevant clusters among treatments previously assumed to have distinct ranks. Overall, RaCE provides a valuable tool for researchers to enhance rank estimation and interpretability, facilitating evidence-based decision-making in complex intervention landscapes.
"RaCE: A rank-clustering estimation method for network meta-analysis." Research Synthesis Methods 17(2): 314-331. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873617/pdf/
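The idea of clustering interventions with overlapping effects can be illustrated with a toy Monte Carlo computation. This is not the RaCE algorithm: the four treatments, their posterior means and standard deviations, and the 0.35-0.65 overlap threshold below are all hypothetical, and RaCE performs model-based probabilistic rank-clustering rather than this simple pairwise heuristic.

```python
import random

random.seed(1)

# Hypothetical posterior summaries for four treatments (higher = better).
means = {"A": 0.50, "B": 0.48, "C": 0.20, "D": 0.18}
sds = {t: 0.10 for t in means}
n = 4000
draws = {t: [random.gauss(means[t], sds[t]) for _ in range(n)] for t in means}

# Probability that each treatment ranks first across posterior draws.
p_best = {t: 0.0 for t in means}
for i in range(n):
    best = max(means, key=lambda t: draws[t][i])
    p_best[best] += 1 / n

# Crude pairwise heuristic: call two treatments rank-clustered when
# P(one beats the other) is close to 1/2 (here: between 0.35 and 0.65).
def p_better(t, u):
    return sum(draws[t][i] > draws[u][i] for i in range(n)) / n

pairs = [(t, u) for t in means for u in means if t < u]
clustered = [(t, u) for t, u in pairs if 0.35 < p_better(t, u) < 0.65]
print(p_best)
print(clustered)  # A/B and C/D overlap heavily; the two groups do not
```

Note how A and B split the "best" probability almost evenly even though one of them always tops a naive single ranking, which is exactly the kind of spurious distinction that rank-clustering avoids.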
Pub Date: 2026-03-01. Epub Date: 2025-10-23. DOI: 10.1017/rsm.2025.10042
Romy Menghao Jia, Cindy Stern
Critical appraisal is a core component of JBI qualitative evidence synthesis, offering insights into the quality of included studies and their potential influence on synthesized findings. However, limited guidance exists on whether, when, and how to exclude studies based on appraisal results. This study examined the methods used in JBI qualitative systematic reviews and the implications for synthesized findings. A systematic analysis of qualitative reviews published between 2018 and 2022 in JBI Evidence Synthesis was conducted. Data on decisions and their justifications were extracted from reviews and protocols. Descriptive and content analysis explored variations in the reported methods. Forty-five reviews were included. Reported approaches varied widely: 24% of reviews included all studies regardless of quality, while others applied exclusion criteria (36%), cutoff scores (11%), or multiple methods (9%). Limited justifications were provided for the approaches, and few reviews cited methodological references to support their decisions. Review authors reported their approach in various sections of the review, with inconsistencies identified in 18% of the sample. Unclear or ambiguous descriptions were also identified in 18% of the included reviews. No clear differences were observed in ConQual scores between reviews that excluded studies and those that did not. Overall, the variability raises concerns about the credibility, transparency, and reproducibility of JBI qualitative systematic reviews. Decisions regarding the inclusion or exclusion of studies based on critical appraisal need to be clearly justified and consistently reported. Further methodological research is needed to support rigorous decision-making and to improve the reliability of synthesized findings.
"The inclusion or exclusion of studies based on critical appraisal results in JBI qualitative systematic reviews: An analysis of practices." Research Synthesis Methods 17(2): 277-292. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12873616/pdf/