Statistical Methods in Medical Research: latest articles, page 6

Minimizing confounding in comparative observational studies with time-to-event outcomes: An extensive comparison of covariate balancing methods using Monte Carlo simulation.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-08-01 Epub Date: 2024-07-25 DOI: 10.1177/09622802241262527

Guy Cafri, Stephen Fortin, Peter C Austin

Observational studies are frequently used in clinical research to estimate the effects of treatments or exposures on outcomes. To reduce the effects of confounding when estimating treatment effects, covariate balancing methods are frequently implemented. This study evaluated, using extensive Monte Carlo simulation, several methods of covariate balancing, and two methods for propensity score estimation, for estimating the average treatment effect on the treated using a hazard ratio from a Cox proportional hazards model. With respect to minimizing bias and maximizing accuracy (as measured by the mean square error) of the treatment effect, average-treatment-effect-on-the-treated weighting, fine stratification, and optimal full matching with a conventional logistic regression model for the propensity score performed best across all simulated conditions. Other methods performed well in specific circumstances, such as pair matching when sample sizes were large (n = 5000) and the proportion treated was < 0.25. Statistical power was generally higher for weighting methods than for matching methods, and Type I error rates were at or below the nominal level for balancing methods with unbiased treatment effect estimates. Effective sample size also decreased as the number of strata increased; therefore, for stratification-based weighting methods, it may be important to consider fewer strata. Generally, we recommend methods that performed well in our simulations, although the identification of methods that performed well is necessarily limited by the specific features of our simulation. The methods are illustrated using a real-world example comparing beta blockers and angiotensin-converting enzyme inhibitors among hypertensive patients at risk for incident stroke.
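The weighting idea in this abstract can be illustrated with a minimal sketch. It assumes propensity scores have already been estimated (for example by logistic regression) and uses the standard weights for the average treatment effect on the treated: 1 for treated subjects and e/(1 - e) for controls. The function names and the toy data are hypothetical and are not taken from the paper; the effective sample size shown is the usual Kish approximation.

```python
def att_weights(treated, propensity):
    """ATT weights: 1 for treated units, e / (1 - e) for control units."""
    return [1.0 if t else e / (1.0 - e) for t, e in zip(treated, propensity)]


def effective_sample_size(weights):
    """Kish effective sample size: (sum of w)^2 / (sum of w^2)."""
    total = sum(weights)
    return total * total / sum(w * w for w in weights)


# Hypothetical toy data: treatment indicators and estimated propensity scores.
treated = [1, 1, 0, 0, 0, 0]
propensity = [0.8, 0.6, 0.5, 0.2, 0.2, 0.5]

weights = att_weights(treated, propensity)
control_weights = [w for w, t in zip(weights, treated) if not t]
print(weights)
print(effective_sample_size(control_weights))
```

Unequal control weights pull the effective sample size below the raw number of controls, which is the kind of loss the abstract reports as the number of strata grows.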

Pages: 1437-1460

Citations: 0

Group lasso priors for Bayesian accelerated failure time models with left-truncated and interval-censored data.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-08-01 Epub Date: 2024-07-25 DOI: 10.1177/09622802241262523

Harrison T Reeder, Sebastien Haneuse, Kyu Ha Lee

An important task in health research is to characterize time-to-event outcomes such as disease onset or mortality in terms of a potentially high-dimensional set of risk factors. For example, prospective cohort studies of Alzheimer's disease (AD) typically enroll older adults for observation over several decades to assess the long-term impact of genetic and other factors on cognitive decline and mortality. The accelerated failure time model is particularly well-suited to such studies, structuring covariate effects as "horizontal" changes to the survival quantiles that conceptually reflect shifts in the outcome distribution due to lifelong exposures. However, this modeling task is complicated by the enrollment of adults at differing ages, and intermittent follow-up visits leading to interval-censored outcome information. Moreover, genetic and clinical risk factors are not only high-dimensional, but characterized by underlying grouping structures, such as by function or gene location. Such grouped high-dimensional covariates require shrinkage methods that directly acknowledge this structure to facilitate variable selection and estimation. In this paper, we address these considerations directly by proposing a Bayesian accelerated failure time model with a group-structured lasso penalty, designed for left-truncated and interval-censored time-to-event data. We develop an R package with a Markov chain Monte Carlo sampler for estimation. We present a simulation study examining the performance of this method relative to an ordinary lasso penalty and apply the proposed method to identify groups of predictive genetic and clinical risk factors for AD in the Religious Orders Study and Memory and Aging Project prospective cohort studies of AD and dementia.
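To make the data structure concrete, the sketch below shows the likelihood contribution of one subject whose event is known only to lie in an interval (left, right] and who entered the study at age `entry` (left truncation). It assumes a log-normal accelerated failure time model, log T = linpred + sigma * eps with standard normal eps. That distributional choice and all names are assumptions made for illustration; the paper's group lasso prior and its sampler are not reproduced here.

```python
import math


def std_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def survival(t, linpred, sigma):
    """S(t | x) under a log-normal AFT model: log T = linpred + sigma * eps."""
    if t <= 0.0:
        return 1.0
    if math.isinf(t):
        return 0.0
    return 1.0 - std_normal_cdf((math.log(t) - linpred) / sigma)


def log_lik_contribution(entry, left, right, linpred, sigma):
    """Log of [S(left) - S(right)] / S(entry).

    Interval censoring supplies the numerator; left truncation (the subject
    is only observed because T > entry) supplies the denominator.
    Use right = math.inf for a right-censored subject.
    """
    numerator = survival(left, linpred, sigma) - survival(right, linpred, sigma)
    return math.log(numerator) - math.log(survival(entry, linpred, sigma))


# Hypothetical subject: enrolled at time 1, event between visits at times 2 and 3.
print(log_lik_contribution(entry=1.0, left=2.0, right=3.0, linpred=0.0, sigma=1.0))
```

Summing such terms over subjects gives the log-likelihood to which a shrinkage prior on grouped coefficients would be attached.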

Pages: 1412-1423

Citations: 0

A dose-effect network meta-analysis model with application in antidepressants using restricted cubic splines.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-08-01 Epub Date: 2022-02-24 DOI: 10.1177/09622802211070256

Tasnim Hamza, Toshi A Furukawa, Nicola Orsini, Andrea Cipriani, Cynthia P Iglesias, Georgia Salanti

Network meta-analysis has been used to answer a range of clinical questions about the preferred intervention for a given condition. Although the effectiveness and safety of pharmacological agents depend on the dose administered, network meta-analysis applications typically ignore the role that drug dosage plays in the results. This leads to more heterogeneity in the network. In this paper, we present a suite of network meta-analysis models that incorporate the dose-effect relationship using restricted cubic splines. We extend existing models into a dose-effect network meta-regression to account for study-level covariates and for groups of agents in a class-effect dose-effect network meta-analysis model. We apply our models to a network of aggregate data about the efficacy of 21 antidepressants and placebo for depression. We find that all antidepressants are more efficacious than placebo after a certain dose. Also, we identify the dose level at which each antidepressant's effect exceeds that of placebo and estimate the dose beyond which the effect of antidepressants no longer increases. When covariates are introduced into the model, we find that studies with small sample sizes tend to exaggerate antidepressant efficacy for several of the drugs. Our dose-effect network meta-analysis model with restricted cubic splines provides a flexible approach to modelling the dose-effect relationship in multiple interventions. Decision-makers can use our model to inform treatment choice.
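As a rough illustration of the spline component, the sketch below builds a restricted cubic spline basis for a single dose value, using the common parameterization in which the curve is constrained to be linear beyond the last knot. The knot locations and the function name are hypothetical; the abstract does not say which parameterization or knots the authors used.

```python
def rcs_basis(x, knots):
    """Restricted cubic spline basis at a scalar x.

    Returns k - 1 terms for k knots: the linear term x followed by k - 2
    nonlinear terms, each scaled by (last knot - first knot)^2.
    """
    t_last, t_prev = knots[-1], knots[-2]
    scale = (t_last - knots[0]) ** 2

    def cube_plus(u):
        # Truncated cubic: u^3 when u is positive, otherwise 0.
        return u ** 3 if u > 0.0 else 0.0

    basis = [x]
    for t_j in knots[:-2]:
        term = (
            cube_plus(x - t_j)
            - cube_plus(x - t_prev) * (t_last - t_j) / (t_last - t_prev)
            + cube_plus(x - t_last) * (t_prev - t_j) / (t_last - t_prev)
        )
        basis.append(term / scale)
    return basis


# Hypothetical knots on a dose scale (for example mg/day).
knots = [0.0, 10.0, 20.0, 40.0]
for dose in (5.0, 15.0, 30.0, 60.0):
    print(dose, rcs_basis(dose, knots))
```

A dose-effect curve for one agent would then be a linear combination of these terms, with coefficients estimated inside the meta-analysis model.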

Pages: 1461-1472

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11462779/pdf/

Citations: 0

Proportion of treatment effect explained: An overview of interpretations.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-07-25 DOI: 10.1177/09622802241259177

Florian Stijven, Ariel Alonso, Geert Molenberghs

The selection of the primary endpoint in a clinical trial plays a critical role in determining the trial's success. Ideally, the primary endpoint is the clinically most relevant outcome, also termed the true endpoint. However, practical considerations, like extended follow-up, may complicate this choice, prompting the proposal to replace the true endpoint with so-called surrogate endpoints. Evaluating the validity of these surrogate endpoints is crucial, and a popular evaluation framework is based on the proportion of treatment effect explained (PTE). While methodological advancements in this area have focused primarily on estimation methods, interpretation remains a challenge hindering the practical use of the PTE. We review various ways to interpret the PTE. These interpretations (two causal and one non-causal) reveal connections between the PTE, principal surrogacy, causal mediation analysis, and the prediction of trial-level treatment effects. A common limitation across these interpretations is the reliance on unverifiable assumptions. As such, we argue that the PTE is only meaningful when researchers are willing to make very strong assumptions. These challenges are also illustrated in an analysis of three hypothetical vaccine trials.

Pages: 1278-1296

Citations: 0

Quantifying proportion of treatment effect by surrogate endpoint under heterogeneity.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-05-08 DOI: 10.1177/09622802241247719

Xinzhou Guo, Florence T Bourgeois, Tianxi Cai

When the primary endpoints in randomized clinical trials require long-term follow-up or are costly to measure, it is often desirable to assess treatment effects on surrogate instead of clinical endpoints. Prior to adopting a surrogate endpoint for such purposes, the extent of its surrogacy on the primary endpoint must be assessed. There is a rich statistical literature on assessing surrogacy in the overall population, much of which is based on quantifying the proportion of treatment effect on the primary endpoint that is explained by the treatment effect on the surrogate endpoint. However, the surrogacy of an endpoint may vary across different patient subgroups according to baseline demographic characteristics, and limited methods are currently available to assess overall surrogacy in the presence of potential surrogacy heterogeneity. In this paper, we propose methods that incorporate covariates for baseline information, such as age, to improve overall surrogacy assessment. We use flexible semi-non-parametric modeling strategies to adjust for covariate effects and derive a robust estimate for the proportion of treatment effect of the covariate-adjusted surrogate endpoint. Simulation results suggest that the adjusted surrogate endpoint has a greater proportion of treatment effect compared to the unadjusted surrogate endpoint. We apply the proposed method to data from a clinical trial of infliximab and assess the adequacy of the surrogate endpoint in the presence of age heterogeneity.
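For orientation, one classical regression-based version of the proportion of treatment effect explained is 1 - (treatment coefficient adjusted for the surrogate) / (unadjusted treatment coefficient). The sketch below computes it with ordinary least squares on toy data. It is only an assumed baseline for illustration: the abstract describes a more flexible, covariate-adjusted estimator that is not reproduced here, and the function names and data are hypothetical.

```python
def centered_cross(a, b):
    """Sum of cross-products of a and b after centering each at its mean."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))


def proportion_explained(z, s, y):
    """Regression-based PTE: 1 - beta_adjusted / beta_unadjusted.

    beta_unadjusted: OLS slope of outcome y on treatment z.
    beta_adjusted:   OLS coefficient of z when surrogate s is also in the model.
    """
    szz = centered_cross(z, z)
    sss = centered_cross(s, s)
    szs = centered_cross(z, s)
    szy = centered_cross(z, y)
    ssy = centered_cross(s, y)
    beta_unadjusted = szy / szz
    beta_adjusted = (sss * szy - szs * ssy) / (szz * sss - szs ** 2)
    return 1.0 - beta_adjusted / beta_unadjusted


# Hypothetical toy trial: treatment z, surrogate s, primary outcome y.
z = [0, 0, 0, 1, 1, 1]
s = [1.0, 2.0, 3.0, 2.0, 3.0, 5.0]
y = [2.0 * value for value in s]  # outcome driven entirely by the surrogate
print(proportion_explained(z, s, y))
```

When the outcome depends on treatment only through the surrogate, as in the toy data, the adjusted coefficient vanishes and the proportion is 1; when the surrogate carries none of the effect, the proportion is 0.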

Pages: 1152-1162

Citations: 0

A structured iterative division approach for non-sparse regression models and applications in biological data analysis.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-05-23 DOI: 10.1177/09622802241254251

Shun Yu, Yuehan Yang

In this paper, we focus on the modeling problem of estimating data with non-sparse structures, specifically focusing on biological data that exhibit a high degree of relevant features. Various fields, such as biology and finance, face the challenge of non-sparse estimation. We address the problems using the proposed method, called structured iterative division. Structured iterative division effectively divides data into non-sparse and sparse structures and eliminates numerous irrelevant variables, significantly reducing the error while maintaining computational efficiency. Numerical and theoretical results demonstrate the competitive advantage of the proposed method on a wide range of problems, and the proposed method exhibits excellent statistical performance in numerical comparisons with several existing methods. We apply the proposed algorithm to two biological problems, using gene microarray datasets and chimeric protein datasets to study the prognostic risk of distant metastasis in breast cancer and Alzheimer's disease, respectively. Structured iterative division provides insights into gene identification and selection, and we also provide meaningful results in anticipating cancer risk and identifying key factors.

Pages: 1233-1248

Citations: 0

Group sequential methods based on supremum logrank statistics under proportional and nonproportional hazards.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-06-05 DOI: 10.1177/09622802241254211

Jean Marie Boher, Thomas Filleron, Patrick Sfumato, Pierre Bunouf, Richard J Cook

Despite the widespread use of Cox regression for modeling treatment effects in clinical trials, in immunotherapy oncology trials and other settings, therapeutic benefits are not immediately realized, thereby violating the proportional hazards assumption. Weighted logrank tests and the so-called Maxcombo test involving the combination of multiple logrank test statistics have been advocated to increase power for detecting effects in these and other settings where hazards are nonproportional. We describe a testing framework based on supremum logrank statistics created by successively analyzing and excluding early events, or obtained using a moving time window. We then describe how such tests can be conducted in a group sequential trial with interim analyses conducted for potential early stopping for benefit. The crossing boundaries for the interim test statistics are determined using an easy-to-implement Monte Carlo algorithm. Numerical studies illustrate the good frequency properties of the proposed group sequential methods.
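One plausible reading of "successively analyzing and excluding early events" is sketched below: compute the standardized two-sample logrank statistic using only events after a start time, repeat over a grid of start times, and take the largest absolute value. This is an assumed interpretation for illustration, not the authors' exact construction; the function names, the grid, and the toy data are hypothetical, and the Monte Carlo calibration of critical values described in the abstract is omitted.

```python
import math


def logrank_z(times, events, groups, start=0.0):
    """Standardized logrank statistic using only events occurring after `start`.

    times: follow-up times; events: 1 = event, 0 = censored; groups: 0 or 1.
    """
    event_times = sorted({t for t, e in zip(times, events) if e and t > start})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = n1 = d = d1 = 0
        for ti, e, g in zip(times, events, groups):
            if ti < t:
                continue  # no longer at risk at time t
            n += 1
            n1 += g
            if e and ti == t:
                d += 1
                d1 += g
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1.0 - n1 / n) * (n - d) / (n - 1)
    if variance <= 0.0:
        return 0.0
    return observed_minus_expected / math.sqrt(variance)


def supremum_logrank(times, events, groups, starts):
    """Largest absolute logrank statistic over a grid of start times."""
    return max(abs(logrank_z(times, events, groups, s)) for s in starts)


# Hypothetical toy data with a delayed separation between the two groups.
times = [1, 3, 5, 6, 7, 8, 2, 4, 9, 10, 11, 12]
events = [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
groups = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]
print(supremum_logrank(times, events, groups, starts=[0.0, 2.0, 4.0]))
```

Because the supremum is taken over several correlated statistics, its critical values cannot be read from a standard normal table, which is why a simulation-based boundary is needed.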

Citations: 0

Robust integration of secondary outcomes information into primary outcome analysis in the presence of missing data.

IF 1.6 | Zone 3, Medicine | Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-05-20 DOI: 10.1177/09622802241254195

Daxuan Deng, Vernon M Chinchilli, Hao Feng, Chixiang Chen, Ming Wang

In clinical and observational studies, secondary outcomes are frequently collected alongside the primary outcome for each subject, yet their potential to improve the analysis efficiency remains underutilized. Moreover, missing data, commonly encountered in practice, can introduce bias to estimates if not appropriately addressed. This article presents an innovative approach that enhances the empirical likelihood-based information borrowing method by integrating missing-data techniques, ensuring robust data integration. We introduce a plug-in inverse probability weighting estimator to handle missingness in the primary analysis, demonstrating its equivalence to the standard joint estimator under mild conditions. To address potential bias from missing secondary outcomes, we propose a uniform mapping strategy, imputing incomplete secondary outcomes into a unified space. Extensive simulations highlight the effectiveness of our method, showing consistent, efficient, and robust estimators under various scenarios involving missing data and/or misspecified secondary models. Finally, we apply our proposal to the Uniform Data Set from the National Alzheimer's Coordinating Center, exemplifying its practical application.
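The inverse probability weighting idea mentioned here can be shown in its simplest form: each observed outcome is weighted by the reciprocal of its probability of being observed. The sketch below is a normalized (Hajek-type) weighted mean and assumes the observation probabilities are known or already estimated. It is an illustration of the general technique only; the paper's plug-in estimator inside an empirical likelihood framework is not reproduced, and all names and data are hypothetical.

```python
def ipw_mean(values, observed, observation_prob):
    """Normalized inverse-probability-weighted mean of partially observed data.

    values: outcomes (entries for missing subjects are ignored);
    observed: 1 if the outcome was observed, 0 if missing;
    observation_prob: probability that each subject's outcome is observed.
    """
    numerator = 0.0
    denominator = 0.0
    for y, r, p in zip(values, observed, observation_prob):
        if r:
            numerator += y / p
            denominator += 1.0 / p
    return numerator / denominator


# Hypothetical toy data: the subject with value 3.0 comes from a stratum
# that is observed less often, so it receives a larger weight.
values = [1.0, None, 3.0, None]
observed = [1, 0, 1, 0]
observation_prob = [0.5, 0.5, 0.25, 0.25]
print(ipw_mean(values, observed, observation_prob))
```

With equal observation probabilities the estimate reduces to the complete-case mean; with unequal probabilities it shifts toward the under-observed subjects.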

Pages: 1249-1263

Citations: 0

A capture-recapture modeling framework emphasizing expert opinion in disease surveillance.

IF 1.6 Zone 3 (Medicine) Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-05-20 DOI: 10.1177/09622802241254217

Yuzi Zhang, Lin Ge, Lance A Waller, Sarita Shah, Robert H Lyles

In disease surveillance, capture-recapture methods are commonly used to estimate the number of diseased cases in a defined target population. Since the number of cases never identified by any surveillance system cannot be observed, estimation of the case count typically requires at least one crucial assumption about the dependency between surveillance systems. However, such assumptions are generally unverifiable based on the observed data alone. In this paper, we advocate a modeling framework hinging on the choice of a key population-level parameter that reflects dependencies among surveillance streams. With the key dependency parameter as the focus, the proposed method offers the benefits of (a) incorporating expert opinion in the spirit of prior information to guide estimation; (b) providing accessible bias corrections, and (c) leveraging an adapted credible interval approach to facilitate inference. We apply the proposed framework to two real human immunodeficiency virus surveillance datasets exhibiting three-stream and four-stream capture-recapture-based case count estimation. Our approach enables estimation of the number of human immunodeficiency virus positive cases for both examples, under realistic assumptions that are under the investigator's control and can be readily interpreted. The proposed framework also permits principled uncertainty analyses through which a user can acknowledge their level of confidence in assumptions made about the key non-identifiable dependency parameter.
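The role of a dependency assumption is easiest to see in the two-stream case. The sketch below is not the framework proposed in the paper, which handles three and four streams and builds in expert opinion and an adapted credible interval. It only shows, assuming that dependence between two streams is summarized by a user-supplied odds ratio, how the estimated total moves with that unverifiable choice. The function name `estimate_total_cases` and all counts are hypothetical.

```python
def estimate_total_cases(
    n_both: int,
    n_only_first: int,
    n_only_second: int,
    odds_ratio: float = 1.0,
) -> float:
    """Two-stream capture-recapture estimate of the total number of cases.

    The count of cases missed by both streams cannot be observed. In the
    2x2 table of capture indicators, the odds ratio equals
        (n_both * n_missed) / (n_only_first * n_only_second),
    so fixing the odds ratio determines n_missed. An odds ratio of 1 is the
    independence assumption and reproduces the Lincoln-Petersen estimate.
    """
    if n_both <= 0:
        raise ValueError("need at least one case captured by both streams")
    if n_only_first < 0 or n_only_second < 0:
        raise ValueError("counts must be non-negative")
    if odds_ratio <= 0:
        raise ValueError("odds_ratio must be positive")
    n_missed = odds_ratio * n_only_first * n_only_second / n_both
    return n_both + n_only_first + n_only_second + n_missed


if __name__ == "__main__":
    # Hypothetical counts: 60 cases in both streams, 140 only in the first,
    # 90 only in the second.
    for assumed_or in (0.5, 1.0, 2.0):
        total = estimate_total_cases(60, 140, 90, odds_ratio=assumed_or)
        print(f"assumed odds ratio {assumed_or}: estimated total {total:.0f}")
```

With these counts the estimate is 500 under independence and 710 if the streams are assumed positively dependent with an odds ratio of 2, which is why the choice of dependency parameter deserves explicit attention.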


Pages: 1197-1210 Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11347122/pdf/

Citations: 0

Testing for marginal covariate effect when the subgroup size induced by the covariate is informative.

IF 1.6 Zone 3 (Medicine) Q3 HEALTH CARE SCIENCES & SERVICES

Statistical Methods in Medical Research

Pub Date : 2024-07-01 Epub Date: 2024-05-20 DOI: 10.1177/09622802241254196

Samuel Anyaso-Samuel, Somnath Datta

In many cluster-correlated data analyses, informative cluster size poses a challenge that can potentially introduce bias in statistical analyses. Different methodologies have been introduced in statistical literature to address this bias. In this study, we consider a complex form of informativeness where the number of observations corresponding to latent levels of a unit-level continuous covariate within a cluster is associated with the response variable. This type of informativeness has not been explored in prior research. We present a novel test statistic designed to evaluate the effect of the continuous covariate while accounting for the presence of informativeness. The covariate induces a continuum of latent subgroups within the clusters, and our test statistic is formulated by aggregating values from an established statistic that accounts for informative subgroup sizes when comparing group-specific marginal distributions. Through carefully designed simulations, we compare our test with four traditional methods commonly employed in the analysis of cluster-correlated data. Only our test maintains the size across all data-generating scenarios with informativeness. We illustrate the proposed method to test for marginal associations in periodontal data with this distinctive form of informativeness.
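To make the basic problem concrete, the sketch below contrasts a pooled mean with a mean that weights each observation by the inverse of its cluster size, one familiar way of handling ordinary informative cluster size. It does not implement the paper's test statistic or the more complex subgroup-size informativeness the authors study. The function names and the toy clusters are hypothetical.

```python
from typing import Sequence


def pooled_mean(clusters: Sequence[Sequence[float]]) -> float:
    """Mean over all observations, so large clusters dominate."""
    values = [v for cluster in clusters for v in cluster]
    if not values:
        raise ValueError("no observations")
    return sum(values) / len(values)


def cluster_weighted_mean(clusters: Sequence[Sequence[float]]) -> float:
    """Mean in which every observation is weighted by 1 / (its cluster size).

    This equals the average of the per-cluster means, so each cluster
    contributes equally regardless of how many observations it has.
    """
    cluster_means = [sum(c) / len(c) for c in clusters if len(c) > 0]
    if not cluster_means:
        raise ValueError("no non-empty clusters")
    return sum(cluster_means) / len(cluster_means)


# Hypothetical toy data in which cluster size is related to the response:
# the large cluster has low values, the small cluster has a high value.
toy_clusters = [[1.0, 1.0, 1.0, 1.0], [5.0]]

if __name__ == "__main__":
    print("pooled mean:", pooled_mean(toy_clusters))
    print("cluster-weighted mean:", cluster_weighted_mean(toy_clusters))
```

Here the pooled mean is dominated by the four observations in the large cluster, while the cluster-weighted mean treats the two clusters equally. The gap between the two is the kind of discrepancy that informative cluster size produces.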


Pages: 1264-1277

Citations: 0