Independent binomially distributed data arise in many contexts, such as clinical trials, quality control monitoring, and stratified sampling. Moreover, the scope is much larger because multinomially distributed data from a 2 by k contingency table can be viewed as k conditionally independent binomial random variables. A standard approach is to use Fisher's conditional test to test equality of the k underlying success probabilities. However, researchers often want to know where the important pairwise differences are. Thus, the closed method of pairwise comparisons is here combined with unconditional exact tests for 2 by 2 tables and Fisher's conditional test for larger tables to get p-values exhibiting strong control of the Family-Wise Error Rate and excellent power properties. In clinical trials, studies are often multicenter, but the results here pertain only to single-site studies.
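The closed pairwise procedure above starts from single-table tests. As a minimal illustration (not the paper's implementation), Fisher's exact conditional p-value for one 2 by 2 table can be computed directly from the hypergeometric null; the function name `fisher_exact_p` and the two-sided tie rule (summing all tables whose null probability does not exceed the observed one) are assumptions made here for concreteness:

```python
from math import comb

def fisher_exact_p(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2 by 2 table [[a, b], [c, d]]:
    with all margins fixed, sum the hypergeometric probabilities of every
    table that is no more likely than the observed one."""
    r1, r2 = a + b, c + d          # row totals
    c1, n = a + c, a + b + c + d   # first-column total, grand total
    denom = comb(n, c1)

    def prob(x):  # P(top-left cell = x) under the hypergeometric null
        return comb(r1, x) * comb(r2, c1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)  # feasible values of the top-left cell
    # small relative slack so exact ties are counted despite float rounding
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-12))
```

For the table [[8, 2], [1, 5]] this gives p ≈ 0.0350, matching standard implementations of the two-sided conditional test.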
{"title":"<ArticleTitle xmlns:ns0=\"http://www.w3.org/1998/Math/MathML\">Pairwise Comparisons of <ns0:math> <ns0:semantics><ns0:mrow><ns0:mi>k</ns0:mi></ns0:mrow> <ns0:annotation>$$ k $$</ns0:annotation></ns0:semantics> </ns0:math> Binomial Responses.","authors":"Dennis D Boos, James Schmidt","doi":"10.1002/sim.70479","DOIUrl":"https://doi.org/10.1002/sim.70479","url":null,"abstract":"<p><p>Independent binomially distributed data arise in many contexts, such as clinical trials, quality control monitoring, and stratified sampling. Moreover, the scope is much larger because multinomially distributed data from a 2 by <math> <semantics><mrow><mi>k</mi></mrow> <annotation>$$ k $$</annotation></semantics> </math> contingency table can be viewed as <math> <semantics><mrow><mi>k</mi></mrow> <annotation>$$ k $$</annotation></semantics> </math> conditionally independent binomial random variables. A standard approach is to use Fisher's conditional test to test equality of the <math> <semantics><mrow><mi>k</mi></mrow> <annotation>$$ k $$</annotation></semantics> </math> underlying success probabilities. However, researchers often want to know where the important pairwise differences are. Thus, the closed method of pairwise comparisons is here combined with unconditional exact tests for 2 by 2 tables and Fisher's conditional test for larger tables to get <math> <semantics><mrow><mi>p</mi></mrow> <annotation>$$ p $$</annotation></semantics> </math> values exhibiting strong control of the Family-Wise Error Rate and excellent power properties. 
In clinical trials, studies are often multicenter, but the results here pertain only to single-site studies.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70479"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147487277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Missing covariates are a common challenge when applying an existing logistic regression model to new or external datasets, particularly in the context of model updating. While regression calibration and model updating methods have been developed to address such partial data availability, each has limitations in terms of bias, variance, and sensitivity to model misspecification. In this study, we propose a surrogate-calibrated updating (SCU) method that integrates calibration and updating approaches to improve the efficiency and reliability of coefficient estimation in the presence of missing covariates. The SCU method leverages surrogate covariates (variables that are routinely available across both old and new datasets and correlate with the missing covariates) and applies a weighted averaging scheme that combines information from both fully observed and partially observed data sources. This approach mitigates bias while reducing variance, offering a practical and robust alternative to existing methods in the population-updating setting. We provide a theoretical justification and derive the corresponding estimators and variances. Simulation studies demonstrate the method's favorable performance under various scenarios, including the case with model misspecification. The SCU method is further illustrated using data from the Framingham Heart Study, where diabetes history serves as a surrogate for partially observed glucose levels in assessing cardiovascular disease risk. JEL Classification: C13, C18, C35.
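The abstract does not state the weighted averaging scheme in closed form. A common instance of such a scheme, shown here purely as a sketch under that assumption, is inverse-variance weighting of two estimates of the same coefficient (the function name is hypothetical):

```python
def inverse_variance_combine(est1, var1, est2, var2):
    """Combine two estimates of the same quantity with weights proportional
    to their inverse variances; returns the pooled estimate and its variance."""
    w1, w2 = 1.0 / var1, 1.0 / var2
    est = (w1 * est1 + w2 * est2) / (w1 + w2)
    var = 1.0 / (w1 + w2)  # variance of the pooled estimate (independence assumed)
    return est, var
```

With equal variances the pooled estimate is the simple average and the variance is halved, which is the behavior any sensible weighting scheme should reduce to.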
{"title":"A Surrogate-Calibrated Updating Method for Logistic Regression With Missing Covariates.","authors":"Jooha Oh, Yei Eun Shin","doi":"10.1002/sim.70489","DOIUrl":"10.1002/sim.70489","url":null,"abstract":"<p><p>Missing covariates are a common challenge when applying an existing logistic regression model to new or external datasets, particularly in the context of model updating. While regression calibration and model updating methods have been developed to address such partial data availability, each has limitations in terms of bias, variance, and sensitivity to model misspecification. In this study, we propose a surrogate-calibrated updating (SCU) method that integrates calibration and updating approaches to improve the efficiency and reliability of coefficient estimation in the presence of missing covariates. The SCU method leverages surrogate covariates-variables that are routinely available across old and new datasets and correlate with the missing covariates-and applies a weighted averaging scheme that combines information from both fully observed and partially observed data sources. This approach mitigates bias while reducing variance, offering a practical and robust alternative to existing methods in population updating setting. We provide a theoretical justification and derive the corresponding estimators and variances. Simulation studies demonstrate the method's favorable performance under various scenarios, including the case with model misspecification. The SCU method is further illustrated using data from the Framingham Heart Study, where diabetes history serves as a surrogate for partially observed glucose levels in assessing cardiovascular disease risk. 
JEL Classification: C13, C18, C35.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70489"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12988319/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147460304","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yibo Long, Jiaqing Chen, Xueqiang Ye, Yangxin Huang
Modeling dynamic heterogeneity is essential for revealing the distinct longitudinal trajectories of individual change. Dynamic heterogeneity analysis of semi-continuous longitudinal data is commonly difficult due to the semi-continuity of longitudinal responses. The hidden semi-Markov model is a powerful tool that can reveal the longitudinal dependency structure and the dynamic heterogeneity of the observation process by introducing the sojourn time distribution. To address the challenge of modeling dynamic heterogeneity in semi-continuous longitudinal data, this study develops a two-part hidden semi-Markov mixed-effects model. The proposed model consists of two parts: a discrete binary indicator model to estimate the probability of a zero outcome for the semi-continuous longitudinal response, and a continuous hidden semi-Markov model to fit the positive values of semi-continuous longitudinal responses. To accurately recover each individual's state at the different observation times, an iterative state-estimation algorithm based on likelihood ratio tests is developed. Bayesian methods are used to estimate the regression coefficients and state parameters of the proposed model. The proposed methodology is applied to analyze the dataset of the Health and Retirement Study conducted by the University of Michigan. Simulation studies are conducted to assess the flexibility of the proposed model under various scenarios.
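As a toy illustration of the two-part structure (not the paper's hidden semi-Markov model), the likelihood of a single semi-continuous observation splits into a zero-probability part and a positive part; a lognormal positive part is assumed here for concreteness:

```python
from math import log, pi

def two_part_loglik(y, p_zero, mu, sigma):
    """Log-likelihood of one semi-continuous observation under a simple
    two-part model: Bernoulli zero indicator plus a lognormal density
    (assumed here for illustration) on the positive values."""
    if y == 0:
        return log(p_zero)
    z = (log(y) - mu) / sigma
    # log[(1 - p_zero) * lognormal density at y]
    return log(1.0 - p_zero) - log(y * sigma) - 0.5 * log(2.0 * pi) - 0.5 * z * z
```

Summing this over observations (with state-specific parameters) is where a hidden-state model such as the paper's would enter.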
{"title":"Two-Part Hidden Semi-Markov Mixed Effects Models for Semi-Continuous Longitudinal Data.","authors":"Yibo Long, Jiaqing Chen, Xueqiang Ye, Yangxin Huang","doi":"10.1002/sim.70476","DOIUrl":"https://doi.org/10.1002/sim.70476","url":null,"abstract":"<p><p>Modeling dynamic heterogeneity is essential for revealing the distinct longitudinal trajectories of individual change. Dynamic heterogeneity analysis of semi-continuous longitudinal data is commonly difficult due to the semi-continuity of longitudinal responses. The hidden semi-Markov model is a powerful tool that can reveal the longitudinal dependency structure and the dynamic heterogeneity of the observation process by introducing the sojourn time distribution. To address the challenge of modeling dynamic heterogeneity in semi-continuous longitudinal data, this study develops a two-part hidden semi-Markov mixed-effects model. The proposed model mainly consists of two parts: a discrete binary indicator model to estimate the probability of a zero outcome for the semi-continuous longitudinal response, and a continuous hidden semi-Markov model to fit the positive values of semi-continuous longitudinal responses. In order to accurately obtain the state of each individual at different observation times, a set of likelihood ratio test state iteration algorithms is developed. Bayesian methods are used to estimate the regression coefficients and state parameters of the proposed model. The proposed methodology is applied to analyze the dataset of the Health and Retirement Study conducted by the University of Michigan. 
Simulation studies are conducted to assess the flexibility of the proposed model under various scenarios.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70476"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147435721","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yizhou Fei, Elizabeth Juarez-Colunga, Areej El-Jawahri, Jean S Kutner, Kathryn Colborn
Improving patients' quality of life (QoL) is one of the primary goals of palliative care clinical trials. However, a significant challenge in this area is the "truncation by death problem," where QoL data cannot be observed after a patient dies, potentially introducing bias into statistical analyses. Understanding the impact of truncation by death when estimating the association between QoL and exposure or treatment is essential, especially when a relatively large proportion of subjects die during a study. To address this issue, we propose a Bayesian joint modeling framework that considers dependencies at both the individual and cluster levels while examining longitudinal QoL trajectories and survival outcomes simultaneously. This approach builds on existing joint modeling methods by incorporating cluster-level random effects. We model QoL on a retrospective scale relative to the time of death, while linking survival via both the subject and cluster-level random effects. The longitudinal sub-model also allows for flexible, non-linear QoL trajectories, which are modeled using penalized regression splines. For the survival sub-model, we use a proportional hazards frailty model with a Weibull baseline hazard. The model is estimated using a Bayesian framework, implemented via Markov Chain Monte Carlo (MCMC) sampling. To evaluate the performance of our method, we conducted a comprehensive simulation study including scenarios with different numbers of clusters. We also show results from applying this novel methodology to data from the Reducing End of Life Symptoms with Touch (REST) study.
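The survival sub-model's hazard has a standard closed form. A sketch of a Weibull proportional-hazards frailty hazard follows, with the shape/scale parameterization chosen here as an assumption (the paper's exact parameterization is not given in the abstract):

```python
from math import exp

def weibull_ph_hazard(t, shape, scale, lin_pred, frailty=1.0):
    """Hazard at time t under a Weibull proportional-hazards frailty model:
    h(t) = frailty * exp(lin_pred) * (shape/scale) * (t/scale)**(shape - 1),
    where lin_pred is the covariate linear predictor x'beta."""
    return frailty * exp(lin_pred) * (shape / scale) * (t / scale) ** (shape - 1)
```

With shape = 1 this reduces to a constant (exponential) baseline hazard scaled by the frailty and covariate effects.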
{"title":"Joint Modeling of Quality of Life and Survival Using a Bayesian Approach in a Retrospective Time Scale.","authors":"Yizhou Fei, Elizabeth Juarez-Colunga, Areej El-Jawahri, Jean S Kutner, Kathryn Colborn","doi":"10.1002/sim.70505","DOIUrl":"10.1002/sim.70505","url":null,"abstract":"<p><p>Improving patients' quality of life (QoL) is one of the primary goals of palliative care clinical trials. However, a significant challenge in this area is the \"truncation by death problem,\" where QoL data cannot be observed after a patient dies, potentially introducing bias into statistical analyses. Understanding the impact of truncation by death when estimating the association between QoL and exposure or treatment is essential, especially when a relatively large proportion of subjects die during a study. To address this issue, we propose a Bayesian joint modeling framework that considers dependencies at both the individual and cluster levels while examining longitudinal QoL trajectories and survival outcomes simultaneously. This approach builds on existing joint modeling methods by incorporating cluster-level random effects. We model QoL on a retrospective scale relative to the time of death, while linking survival via both the subject and cluster-level random effects. The longitudinal sub-model also allows for flexible, non-linear QoL trajectories, which are modeled using penalized regression splines. For the survival sub-model, we use a proportional hazards frailty model with a Weibull baseline hazard. The model is estimated using a Bayesian framework, implemented via Markov Chain Monte Carlo (MCMC) sampling. To evaluate the performance of our method, we conducted a comprehensive simulation study including scenarios with different numbers of clusters. 
We also show results from applying this novel methodology to data from the Reducing End of Life Symptoms with Touch (REST) study.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70505"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147481664","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Elizabeth L Turner, John S Preisser, Ying Zhang, Xueqi Wang, Mark Toles, Samuel Cykert, Fan Li, Paul J Rathouz
Stepped-wedge cluster randomized trials (SW-CRTs) are one-way crossover trials that randomize clusters (i.e., groups) of individuals to the time point (period) at which an intervention is introduced into the cluster. In these designs, the intervention under evaluation is introduced into all of the clusters by the end of the study in a series of "steps." Analysis of SW-CRTs using marginal models provides a population-averaged interpretation of the estimated intervention effect and flexible specification of the within-cluster, marginal pairwise association structure; the latter has practical application in reporting intraclass (i.e., pairwise) correlations and calculating power for CRTs. Despite these features, use of marginal modeling of SW-CRTs has been mostly limited to applications with working independence and simple exchangeable correlation structures that are suboptimal for multi-period CRTs when correlation among responses decays over time. However, there have been many methodological developments in marginal modeling of SW-CRTs over the past fifteen years, particularly on (i) multi-parameter, within-cluster correlation structures; (ii) paired generalized estimating equations (GEE) for simultaneous estimation of mean and correlation parameters with standard errors; and, when the number of clusters is small, (iii) corrections to reduce the bias of variance estimators, and that of correlation estimates using matrix-adjusted estimating equations (MAEE). The goal of the current tutorial is to survey these newer developments and to provide case studies to enable applied researchers to implement GEE/MAEE for marginal model analysis of SW-CRTs, with application to both cohorts and designs with repeated cross-sectional samples. The methods are also applicable to multi-period, parallel-arm and cluster-crossover CRTs.
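One of the multi-parameter within-cluster correlation structures referenced above can be written down directly. Below is a sketch of a nested-exchangeable working correlation matrix for a single cluster, with alpha0 for pairs in the same period and alpha1 for pairs in different periods; the decayed (block-exchangeable) variants mentioned in the text are omitted for brevity, and the function name is an assumption:

```python
def nested_exchangeable_corr(periods, per_period, alpha0, alpha1):
    """Working correlation for one cluster observed over `periods` periods
    with `per_period` individuals each: 1 on the diagonal, alpha0 within
    a period, alpha1 between periods."""
    n = periods * per_period
    R = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                R[i][j] = 1.0
            elif i // per_period == j // per_period:  # same period block
                R[i][j] = alpha0
            else:
                R[i][j] = alpha1
    return R
```

Setting alpha0 = alpha1 recovers the simple exchangeable structure the tutorial contrasts against.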
{"title":"Analysis of Stepped-Wedge Cluster Randomized Trials: A Tutorial Using Marginal Models.","authors":"Elizabeth L Turner, John S Preisser, Ying Zhang, Xueqi Wang, Mark Toles, Samuel Cykert, Fan Li, Paul J Rathouz","doi":"10.1002/sim.70393","DOIUrl":"10.1002/sim.70393","url":null,"abstract":"<p><p>Stepped-wedge cluster randomized trials (SW-CRTs) are one-way crossover trials that randomize clusters (i.e., groups) of individuals to the time point (period) at which an intervention is introduced into the cluster. In these designs, the intervention under evaluation is introduced into all of the clusters by the end of the study in a series of \"steps.\" Analysis of SW-CRTs using marginal models provides a population-averaged interpretation of the estimated intervention effect and flexible specification of the within-cluster, marginal pairwise association structure; the latter has practical application in reporting intraclass (i.e., pairwise) correlations and calculating power for CRTs. Despite these features, use of marginal modeling of SW-CRTs has been mostly limited to applications with working independence and simple exchangeable correlation structures that are suboptimal for multi-period CRTs when correlation among responses decays over time. However, there have been many methodological developments in marginal modeling of SW-CRTs over the past fifteen years, particularly on (i) multi-parameter, within-cluster correlation structures; (ii) paired generalized estimating equations (GEE) for simultaneous estimation of mean and correlation parameters with standard errors; and, when the number of clusters is small, (iii) corrections to reduce the bias of variance estimators, and that of correlation estimates using matrix-adjusted estimating equations (MAEE). 
The goal of the current tutorial is to survey these newer developments and to provide case studies to enable applied researchers to implement GEE/MAEE for marginal model analysis of SW-CRTs, with application to both cohorts and designs with repeated cross-sectional samples. The methods are also applicable to multi-period, parallel-arm and cluster-crossover CRTs.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70393"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC13003448/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147487240","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
One major source of bias in causal inference for clinical trials is unmeasured confounding. We propose an innovative, practical Bayesian modeling approach to adjust for unmeasured confounding effects and obtain precise causal average treatment effect estimates for two-arm randomized controlled clinical trials. This approach includes model reparameterization and an iterative algorithm, within a causal inference framework that incorporates unmeasured confounders and their statistical distributions. Model non-identifiability resulting from adjusting for unmeasured confounding is a major inferential problem. Reparameterization transforms one or multiple unmeasured confounders into a single reparameterized unmeasured confounder and can remove model non-identifiability from the model specification of unmeasured confounders. The iterative algorithm consists of detailed steps for inference after model reparameterization and can remove model non-identifiability arising from prior sensitivity to unmeasured confounders. It includes iterating the prior distribution of the reparameterized unmeasured confounder by certain rules, aggregating posterior means and variances over different prior choices, and obtaining posterior estimates for the average treatment effect. Its essential idea is to make unreliable prior information on unmeasured confounders as close to the data information as possible. Compared with usual methods, our approach produces robust effect estimates and reaches correct conclusions about statistical significance. In an example using real clinical data, this approach effectively adjusts for confounding effects even when measured confounders are not adjusted for. Our approach is also generalizable to other clinical study designs and may benefit applications where data collection is difficult for certain variables or causal relationships are not well understood.
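The aggregation step (pooling posterior means and variances over different prior choices) is not given in closed form in the abstract. One standard rule for combining them, used here as an assumed illustration, treats the per-prior posteriors as an equal-weight mixture and reports the mixture moments:

```python
def aggregate_posteriors(means, variances):
    """Mean and variance of an equal-weight mixture of per-prior posteriors:
    E = average of means; Var = average of (var + mean^2) minus E^2."""
    k = len(means)
    m = sum(means) / k
    v = sum(var + mu * mu for var, mu in zip(variances, means)) / k - m * m
    return m, v
```

The mixture variance exceeds the average of the component variances whenever the means disagree, so disagreement across priors is reflected as extra uncertainty rather than hidden.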
{"title":"A Bayesian Approach to Estimate Causal Average Treatment Effects Under Unmeasured Confounding.","authors":"Jinghong Zeng","doi":"10.1002/sim.70461","DOIUrl":"10.1002/sim.70461","url":null,"abstract":"<p><p>One major bias source in causal inference for clinical trials is unmeasured confounding. We propose an innovative, practical Bayesian modeling approach to adjust for unmeasured confounding effects and obtain precise causal average treatment effect estimates for two-arm randomized controlled clinical trials. This approach includes model reparameterization and an iterative algorithm, with a causal inference framework incorporated with unmeasured confounders and related statistical distributions. Model non-identifiability resulting from adjusting for unmeasured confounding is a major inferential problem. Reparameterization transforms one or multiple unmeasured confounders into a single reparameterized unmeasured confounder and can remove model non-identifiability from the model specification of unmeasured confounders. The iterative algorithm consists of detailed steps for inference after model reparameterization and can remove model non-identifiability from prior sensitivity to unmeasured confounders. It includes iterating the prior distribution of the reparameterized unmeasured confounder by certain rules, aggregating posterior means and variances over different prior choices, and obtaining posterior estimates for the average treatment effect. Its essential idea is to make unreliable prior information on unmeasured confounders as close to data information as possible. Compared with usual methods, our approach produces robust effect estimates and correctly concludes statistical significance. From an example using real clinical data, this approach effectively adjusts for confounding effects when we do not adjust for measured confounders. 
Our approach is also generalizable to other clinical study designs and may be beneficial to applications where data collection is difficult for certain variables or causal relationships are not well understood.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70461"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147318335","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Yurong Chen, Yingdong Feng, Michael Sonksen, Tuo Wang, Joon Jin Song
This paper proposes a propensity score (PS)-based stratified win ratio method to address challenges of small patient populations in clinical trials, especially for rare or pediatric diseases, by incorporating external control data. Our approach enhances traditional win ratio analysis by leveraging PS stratification to account for heterogeneity between the current and external studies. Additionally, down-weighting based on the overlapping coefficient of the PS distributions of the current treatment and external control groups further mitigates bias due to patient heterogeneity. Simulations show significant improvements in statistical power for detecting treatment effects within the composite endpoint combining continuous and time-to-event components, over nonborrowing and pooling methods, with Mantel-Haenszel (MH)-type weights achieving the highest power. The proposed methods are also applied to an amyotrophic lateral sclerosis (ALS) study incorporating the external control arm from a prior ALS trial. The proposed PS-based stratified win ratio method thus provides a rigorous framework for borrowing external data and analyzing composite endpoints with limited patient availability.
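The down-weighting step relies on the overlapping coefficient of two propensity score distributions. On a shared histogram discretization this is simply the sum of binwise minima; the sketch below assumes both inputs are probability mass vectors over identical bins (a continuous-density version would integrate the pointwise minimum instead):

```python
def overlap_coefficient(p, q):
    """Overlapping coefficient of two discretized densities given as
    histogram masses over the same bins: sum of binwise minima.
    Equals 1 for identical distributions and 0 for disjoint supports."""
    return sum(min(a, b) for a, b in zip(p, q))
```

A coefficient near 1 indicates the external controls are well matched on the propensity score and little down-weighting is needed; near 0, the external data would contribute almost nothing.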
{"title":"Propensity Score-Based Stratified Win Ratio for Augmented Control Designs.","authors":"Yurong Chen, Yingdong Feng, Michael Sonksen, Tuo Wang, Joon Jin Song","doi":"10.1002/sim.70487","DOIUrl":"https://doi.org/10.1002/sim.70487","url":null,"abstract":"<p><p>This paper proposes a propensity score (PS)-based stratified win ratio method to address challenges of small patient populations in clinical trials, especially for rare or pediatric diseases, by incorporating external control data. Our approach enhances traditional win ratio analysis by leveraging PS stratification to account for heterogeneity between the current and external studies. Additionally, down-weighting based on the overlapping coefficient of PS distributions of current treatment and external control groups further mitigates the patient bias due to heterogeneity. Simulations show significant improvements in statistical power for detecting treatment effects within the composite endpoint combining continuous and time-to-event components, over nonborrowing and pooling methods, with utilizing Mantel-Haenszel (MH)-type weights achieving the highest power. The proposed methods are also applied to an amyotrophic lateral sclerosis (ALS) study incorporating the external control arm from a prior ALS trial. 
The proposed PS-based stratified win ratio method thus provides a rigorous framework for borrowing external data and analyzing composite endpoints with limited patient availability.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 6-7","pages":"e70487"},"PeriodicalIF":1.8,"publicationDate":"2026-03-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"147459428","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A dynamic treatment regime (DTR) is a sequence of treatment decision rules tailored to an individual's evolving status over time. In precision medicine, much focus has been placed on finding an optimal DTR which, if followed by everyone in the population, would yield the best outcome on average; and extensive investigations have been conducted from both methodological and applied standpoints. The purpose of this tutorial is to provide readers who are interested in optimal DTRs with a systematic, detailed, but accessible introduction, including the formal definition and formulation of this topic within the framework of causal inference, identification assumptions required to link the causal quantity of interest to the observed data, existing statistical models and estimation methods for learning the optimal regime from the data, and application of these methods to both simulated and real data.
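At its core, an estimated optimal regime picks, at each decision point, the action maximizing the estimated value function. A minimal single-stage sketch follows; the dictionary representation of the Q-function is an assumption for illustration, not the tutorial's notation:

```python
def optimal_rule(q_values):
    """Optimal single-stage decision rule from an estimated Q-function,
    given as {state: {action: estimated value}}: for each state, choose
    the action with the highest estimated value."""
    return {state: max(actions, key=actions.get)
            for state, actions in q_values.items()}
```

Multi-stage regimes apply this argmax at the last stage first and then work backward, substituting the induced optimal values into the previous stage's Q-function.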
{"title":"A Tutorial on Optimal Dynamic Treatment Regimes.","authors":"Chunyu Wang, Brian D M Tom","doi":"10.1002/sim.70395","DOIUrl":"10.1002/sim.70395","url":null,"abstract":"<p><p>A dynamic treatment regime (DTR) is a sequence of treatment decision rules tailored to an individual's evolving status over time. In precision medicine, much focus has been placed on finding an optimal DTR which, if followed by everyone in the population, would yield the best outcome on average; and extensive investigations have been conducted from both methodological and applied standpoints. The purpose of this tutorial is to provide readers who are interested in optimal DTRs with a systematic, detailed, but accessible introduction, including the formal definition and formulation of this topic within the framework of causal inference, identification assumptions required to link the causal quantity of interest to the observed data, existing statistical models and estimation methods for learning the optimal regime from the data, and application of these methods to both simulated and real data.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70395"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12872042/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146120090","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"医学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Seasonality plays a crucial role in the transmission dynamics of many infectious diseases, contributing to periodic fluctuations in disease incidence. The previously developed geographically dependent individual-level model (GD-ILM) has been effective in modeling infectious diseases, but does not incorporate seasonal effects, limiting its ability to capture seasonal trends. In this study, we extend the GD-ILM by introducing a seasonally varying transmission component, allowing the model to account for periodic fluctuations in infection risk. Our approach integrates a seasonally forced infection kernel to model periodic changes in transmission rates over time, leading to a novel spatiotemporal kernel. To facilitate efficient and reliable parameter estimation in this high-dimensional setting, we employ the Monte Carlo expectation conditional maximization algorithm. We apply our model to individual-level influenza A data from Manitoba, Canada, examining spatial and seasonal infection patterns to identify high-risk regions and periods, and thus informing targeted intervention strategies. The proposed model's performance is further validated through comprehensive simulation studies. Simulation results confirm that models omitting seasonal components lead to biased spatial parameter estimates under various disease prevalence conditions. To support reproducibility and practical application, we developed the SeasEpi R package, publicly available on the Comprehensive R Archive Network (CRAN), which implements the seasonal GD-ILM framework and provides tools for model fitting, simulation, and evaluation.
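The specific form of the seasonally forced kernel is not given in the abstract. A standard sinusoidal forcing of the transmission rate, used here as an assumed illustration of what "seasonally varying transmission" means, is:

```python
from math import cos, pi

def seasonal_rate(t, beta0, amplitude, period=365.0, phase=0.0):
    """Sinusoidally forced transmission rate, a standard seasonal form:
    beta(t) = beta0 * (1 + amplitude * cos(2*pi*(t - phase)/period)).
    `amplitude` in [0, 1) keeps the rate positive; `phase` shifts the peak."""
    return beta0 * (1.0 + amplitude * cos(2.0 * pi * (t - phase) / period))
```

With beta0 = 0.5 and amplitude = 0.2, the rate peaks at 0.6 at t = phase and bottoms out at 0.4 half a period later, giving the periodic rise and fall in infection risk the model is designed to capture.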
{"title":"Spatial Individual-Level Models for Transmission Dynamics of Seasonal Infectious Diseases.","authors":"Amin Abed, Mahmoud Torabi, Zeinab Mashreghi","doi":"10.1002/sim.70384","DOIUrl":"10.1002/sim.70384","url":null,"abstract":"<p><p>Seasonality plays a crucial role in the transmission dynamics of many infectious diseases, contributing to periodic fluctuations in disease incidence. The previously developed geographically dependent individual-level model (GD-ILM) has been effective in modeling infectious diseases, but does not incorporate seasonal effects, limiting its ability to capture seasonal trends. In this study, we extend the GD-ILM by introducing a seasonally varying transmission component, allowing the model to account for periodic fluctuations in infection risk. Our approach integrates a seasonally forced infection kernel to model periodic changes in transmission rates over time, leading to a novel spatiotemporal kernel. To facilitate efficient and reliable parameter estimation in this high-dimensional setting, we employ the Monte Carlo expectation conditional maximization algorithm. We apply our model to individual-level influenza A data from Manitoba, Canada, examining spatial and seasonal infection patterns to identify high-risk regions and periods, and thus informing targeted intervention strategies. The proposed model's performance is further validated through comprehensive simulation studies. Simulation results confirm that models omitting seasonal components lead to biased spatial parameter estimates under various disease prevalence conditions. To support reproducibility and practical application, we developed the SeasEpi R package publicly available on the Comprehensive R Archive Network (CRAN), which implements the seasonal GD-ILM framework and provides tools for model fitting, simulation, and evaluation. The seasonal GD-ILM offers a more accurate framework for modeling infectious disease transmission by integrating both spatial and seasonal dynamics. It supports more accurate risk assessment and enhances public health responses by enabling timely and location-specific interventions based on seasonal transmission patterns.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70384"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12880204/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146133262","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
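The seasonally forced spatiotemporal kernel described in the abstract above can be sketched in miniature. The Python fragment below combines exponential spatial decay with sinusoidal seasonal forcing inside the standard individual-level infection probability 1 - exp(-rate); every function name, parameter name, and numeric value here is an illustrative assumption, not the paper's actual kernel or fitted estimates.

```python
import math

def seasonal_spatial_kernel(d, t, beta0=0.8, amp=0.5, period=365.0, phi=2.0):
    """Illustrative spatiotemporal kernel: exponential spatial decay in the
    distance d, multiplied by sinusoidal seasonal forcing at time t (days).
    beta0 is a baseline transmission rate, amp the seasonal amplitude, and
    phi the spatial range; all values are hypothetical."""
    seasonal = 1.0 + amp * math.sin(2.0 * math.pi * t / period)
    return beta0 * seasonal * math.exp(-d / phi)

def infection_prob(distances_to_infectious, t, **kernel_params):
    """P(susceptible individual infected at time t) = 1 - exp(-sum_j kernel),
    summing the kernel over that individual's currently infectious neighbors."""
    rate = sum(seasonal_spatial_kernel(d, t, **kernel_params)
               for d in distances_to_infectious)
    return 1.0 - math.exp(-rate)
```

Under this toy parameterization, nearby infectious neighbors contribute more risk than distant ones, and the same neighbor configuration yields higher risk near the seasonal peak than near the trough, which is the qualitative behavior the seasonal GD-ILM is built to capture.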
In biomedical research, understanding the dynamic relationships between two binary variables over time is crucial. Our study enhances this understanding by employing longitudinal analysis to introduce measures such as the bivariate time-varying odds ratio and relative risk. These metrics adeptly quantify evolving associations and effectively address the complexities involved in estimating variables recorded at disparate times. We have developed a nonparametric approach specifically designed for longitudinal samples that vary in their measurement timelines, demonstrating its applicability to both concurrent and nonconcurrent sampling scenarios. Additionally, in studies where end-of-life considerations are prevalent, missing data can significantly skew results. To mitigate this, we implemented a model that accounts for missingness and developed an inverse-probability weighting method that has been validated through simulation studies to correct biases effectively. By applying our methodology to the Framingham Heart Study, we investigated the temporal changes in the association of hypertension among mothers and daughters over a 45-year span. This application not only underscores the versatility of our approach but also provides valuable insights into long-term health trends within families.
{"title":"Comprehensive Analysis of Asynchronous Binary Variable Associations in Longitudinal End-of-Life Studies.","authors":"Zhuangzhuang Liu, Sanghee Kim, Hyunkeun Cho","doi":"10.1002/sim.70438","DOIUrl":"10.1002/sim.70438","url":null,"abstract":"<p><p>In biomedical research, understanding the dynamic relationships between two binary variables over time is crucial. Our study enhances this understanding by employing longitudinal analysis to introduce measures such as the bivariate time-varying odds ratio and relative risk. These metrics adeptly quantify evolving associations and effectively address the complexities involved in estimating variables recorded at disparate times. We have developed a nonparametric approach specifically designed for longitudinal samples that vary in their measurement timelines, demonstrating its applicability to both concurrent and nonconcurrent sampling scenarios. Additionally, in studies where end-of-life considerations are prevalent, missing data can significantly skew results. To mitigate this, we implemented a model that accounts for missingness and developed an inverse-probability weighting method that has been validated through simulation studies to correct biases effectively. By applying our methodology to the Framingham Heart Study, we investigated the temporal changes in the association of hypertension among mothers and daughters over a 45-year span. This application not only underscores the versatility of our approach but also provides valuable insights into long-term health trends within families.</p>","PeriodicalId":21879,"journal":{"name":"Statistics in Medicine","volume":"45 3-5","pages":"e70438"},"PeriodicalIF":1.8,"publicationDate":"2026-02-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146221409","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"Medicine","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
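The inverse-probability-weighted association measure described in the abstract above can be illustrated with a toy weighted odds ratio. The function name, data layout, and weights below are hypothetical sketches: in the paper's setting the weights would come from a fitted missingness model (inverse probabilities of a pair being fully observed), and the odds ratio would be estimated at each time point to trace the time-varying association.

```python
def ipw_odds_ratio(pairs, weights=None):
    """Weighted 2x2 odds ratio for binary pairs (x, y).

    pairs:   iterable of (x, y) with x, y in {0, 1}.
    weights: optional inverse-probability-of-observation weights (one per
             pair); unit weights recover the ordinary sample odds ratio.
    Returns (w11 * w00) / (w10 * w01), the cross-product ratio of the
    weighted cell totals.
    """
    if weights is None:
        weights = [1.0] * len(pairs)
    w11 = w10 = w01 = w00 = 0.0
    for (x, y), w in zip(pairs, weights):
        if x and y:
            w11 += w
        elif x:
            w10 += w
        elif y:
            w01 += w
        else:
            w00 += w
    return (w11 * w00) / (w10 * w01)
```

Note that rescaling all weights by a common factor leaves the cross-product ratio unchanged; the weights matter only when observation probabilities differ across cells, which is exactly the end-of-life missingness scenario the paper targets.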