Randomization-based inference using the Fisher randomization test allows for the computation of Fisher-exact P-values, making it an attractive option for the analysis of small, randomized experiments with non-normal outcomes. Two common test statistics used to perform Fisher randomization tests are the difference-in-means between the treatment and control groups and the covariate-adjusted version of the difference-in-means using analysis of covariance. Modern computing allows for fast computation of the Fisher-exact P-value, but confidence intervals have typically been obtained by inverting the Fisher randomization test over a range of possible effect sizes. The test inversion procedure is computationally expensive, limiting the use of randomization-based inference in applied work. A recent paper by Zhu and Liu developed a closed-form expression for the randomization-based confidence interval using the difference-in-means statistic. We develop an important extension of Zhu and Liu to obtain a closed-form expression for the randomization-based covariate-adjusted confidence interval and give practitioners a sufficient condition, checkable from observed data, that guarantees these confidence intervals have correct coverage. Simulations show that our procedure generates randomization-based covariate-adjusted confidence intervals that are robust to non-normality and that can be calculated in nearly the same time as the Fisher-exact P-value itself, thus removing the computational barrier to performing randomization-based inference when adjusting for covariates. We also demonstrate our method on a re-analysis of phase I clinical trial data.
{"title":"On exact randomization-based covariate-adjusted confidence intervals.","authors":"Jacob Fiksel","doi":"10.1093/biomtc/ujae051","DOIUrl":"https://doi.org/10.1093/biomtc/ujae051","url":null,"abstract":"<p><p>Randomization-based inference using the Fisher randomization test allows for the computation of Fisher-exact P-values, making it an attractive option for the analysis of small, randomized experiments with non-normal outcomes. Two common test statistics used to perform Fisher randomization tests are the difference-in-means between the treatment and control groups and the covariate-adjusted version of the difference-in-means using analysis of covariance. Modern computing allows for fast computation of the Fisher-exact P-value, but confidence intervals have typically been obtained by inverting the Fisher randomization test over a range of possible effect sizes. The test inversion procedure is computationally expensive, limiting the usage of randomization-based inference in applied work. A recent paper by Zhu and Liu developed a closed form expression for the randomization-based confidence interval using the difference-in-means statistic. We develop an important extension of Zhu and Liu to obtain a closed form expression for the randomization-based covariate-adjusted confidence interval and give practitioners a sufficiency condition that can be checked using observed data and that guarantees that these confidence intervals have correct coverage. Simulations show that our procedure generates randomization-based covariate-adjusted confidence intervals that are robust to non-normality and that can be calculated in nearly the same time as it takes to calculate the Fisher-exact P-value, thus removing the computational barrier to performing randomization-based inference when adjusting for covariates. We also demonstrate our method on a re-analysis of phase I clinical trial data.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141260413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Thanthirige Lakshika M Ruberu, Danielle Braun, Giovanni Parmigiani, Swati Biswas
Multi-gene panel testing allows many cancer susceptibility genes to be tested quickly at a lower cost, making such testing accessible to a broader population. Thus, more patients carrying pathogenic germline mutations in various cancer susceptibility genes are being identified. This creates a great opportunity, as well as an urgent need, to counsel these patients about appropriate risk-reducing management strategies. Counseling hinges on accurate estimates of age-specific risks of developing various cancers associated with mutations in a specific gene, i.e., penetrance estimation. We propose a meta-analysis approach based on a Bayesian hierarchical random-effects model to obtain penetrance estimates by integrating studies that report different types of risk measures (e.g., penetrance, relative risk, odds ratio) while accounting for the associated uncertainties. After estimating posterior distributions of the parameters via a Markov chain Monte Carlo algorithm, we estimate penetrance and credible intervals. We investigate the proposed method and compare it with an existing approach via simulations based on studies reporting risks for two moderate-risk breast cancer susceptibility genes, ATM and PALB2. Our proposed method is far superior in terms of coverage probability of credible intervals and mean square error of estimates. Finally, we apply our method to estimate the penetrance of breast cancer among carriers of pathogenic mutations in the ATM gene.
{"title":"Bayesian meta-analysis of penetrance for cancer risk.","authors":"Thanthirige Lakshika M Ruberu, Danielle Braun, Giovanni Parmigiani, Swati Biswas","doi":"10.1093/biomtc/ujae038","DOIUrl":"10.1093/biomtc/ujae038","url":null,"abstract":"<p><p>Multi-gene panel testing allows many cancer susceptibility genes to be tested quickly at a lower cost making such testing accessible to a broader population. Thus, more patients carrying pathogenic germline mutations in various cancer-susceptibility genes are being identified. This creates a great opportunity, as well as an urgent need, to counsel these patients about appropriate risk-reducing management strategies. Counseling hinges on accurate estimates of age-specific risks of developing various cancers associated with mutations in a specific gene, ie, penetrance estimation. We propose a meta-analysis approach based on a Bayesian hierarchical random-effects model to obtain penetrance estimates by integrating studies reporting different types of risk measures (eg, penetrance, relative risk, odds ratio) while accounting for the associated uncertainties. After estimating posterior distributions of the parameters via a Markov chain Monte Carlo algorithm, we estimate penetrance and credible intervals. We investigate the proposed method and compare with an existing approach via simulations based on studies reporting risks for two moderate-risk breast cancer susceptibility genes, ATM and PALB2. Our proposed method is far superior in terms of coverage probability of credible intervals and mean square error of estimates. Finally, we apply our method to estimate the penetrance of breast cancer among carriers of pathogenic mutations in the ATM gene.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11140851/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141178675","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Chan Park, David B Richardson, Eric J Tchetgen Tchetgen
Negative control variables are sometimes used in nonexperimental studies to detect the presence of confounding by hidden factors. A negative control outcome (NCO) is an outcome that is influenced by unobserved confounders of the exposure effects on the outcome in view, but is not causally impacted by the exposure. Tchetgen Tchetgen (2013) introduced the Control Outcome Calibration Approach (COCA) as a formal NCO counterfactual method to detect and correct for residual confounding bias. For identification, COCA treats the NCO as an error-prone proxy of the treatment-free counterfactual outcome of interest, and involves regressing the NCO on the treatment-free counterfactual, together with a rank-preserving structural model, which assumes a constant individual-level causal effect. In this work, we establish nonparametric COCA identification for the average causal effect for the treated, without requiring rank-preservation, therefore accommodating unrestricted effect heterogeneity across units. This nonparametric identification result has important practical implications, as it provides single-proxy confounding control, in contrast to recently proposed proximal causal inference, which relies for identification on a pair of confounding proxies. For COCA estimation we propose 3 separate strategies: (i) an extended propensity score approach, (ii) an outcome bridge function approach, and (iii) a doubly-robust approach. Finally, we illustrate the proposed methods in an application evaluating the causal impact of a Zika virus outbreak on birth rate in Brazil.
{"title":"Single proxy control.","authors":"Chan Park, David B Richardson, Eric J Tchetgen Tchetgen","doi":"10.1093/biomtc/ujae027","DOIUrl":"https://doi.org/10.1093/biomtc/ujae027","url":null,"abstract":"<p><p>Negative control variables are sometimes used in nonexperimental studies to detect the presence of confounding by hidden factors. A negative control outcome (NCO) is an outcome that is influenced by unobserved confounders of the exposure effects on the outcome in view, but is not causally impacted by the exposure. Tchetgen Tchetgen (2013) introduced the Control Outcome Calibration Approach (COCA) as a formal NCO counterfactual method to detect and correct for residual confounding bias. For identification, COCA treats the NCO as an error-prone proxy of the treatment-free counterfactual outcome of interest, and involves regressing the NCO on the treatment-free counterfactual, together with a rank-preserving structural model, which assumes a constant individual-level causal effect. In this work, we establish nonparametric COCA identification for the average causal effect for the treated, without requiring rank-preservation, therefore accommodating unrestricted effect heterogeneity across units. This nonparametric identification result has important practical implications, as it provides single-proxy confounding control, in contrast to recently proposed proximal causal inference, which relies for identification on a pair of confounding proxies. For COCA estimation we propose 3 separate strategies: (i) an extended propensity score approach, (ii) an outcome bridge function approach, and (iii) a doubly-robust approach. Finally, we illustrate the proposed methods in an application evaluating the causal impact of a Zika virus outbreak on birth rate in Brazil.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11033710/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140847741","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ruberu et al. (2023) introduce an elegant approach to fit a complicated meta-analysis problem with diverse reporting modalities into the framework of hierarchical Bayesian inference. We discuss issues related to some of the involved parametric model assumptions.
{"title":"Discussion on \"Bayesian meta-analysis of penetrance for cancer risk\" by Thanthirige Lakshika M. Ruberu, Danielle Braun, Giovanni Parmigiani, and Swati Biswas.","authors":"Peter Müller, Bernardo Flores","doi":"10.1093/biomtc/ujae042","DOIUrl":"https://doi.org/10.1093/biomtc/ujae042","url":null,"abstract":"<p><p>Ruberu et al. (2023) introduce an elegant approach to fit a complicated meta-analysis problem with diverse reporting modalities into the framework of hierarchical Bayesian inference. We discuss issues related to some of the involved parametric model assumptions.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141178723","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Dafne Zorzetto, Falco J Bargagli-Stoffi, Antonio Canale, Francesca Dominici
Several epidemiological studies have provided evidence that long-term exposure to fine particulate matter (PM2.5) increases the mortality rate. Furthermore, some population characteristics (e.g., age, race, and socioeconomic status) might play a crucial role in understanding vulnerability to air pollution. To inform policy, it is necessary to identify the groups of the population that are more or less vulnerable to air pollution. In the causal inference literature, the group average treatment effect (GATE) is a distinctive facet of the conditional average treatment effect. This widely employed metric serves to characterize the heterogeneity of a treatment effect based on some population characteristics. In this paper, we introduce a novel Confounder-Dependent Bayesian Mixture Model (CDBMM) to characterize causal effect heterogeneity. More specifically, our method leverages the flexibility of the dependent Dirichlet process to model the distribution of the potential outcomes conditionally on the covariates and the treatment levels, thus enabling us to: (i) identify heterogeneous and mutually exclusive population groups defined by similar GATEs in a data-driven way, and (ii) estimate and characterize the causal effects within each of the identified groups. Through simulations, we demonstrate the effectiveness of our method in uncovering key insights about treatment effect heterogeneity. We apply our method to claims data from Medicare enrollees in Texas and find six mutually exclusive groups in which the causal effects of PM2.5 on the mortality rate are heterogeneous.
{"title":"Confounder-dependent Bayesian mixture model: Characterizing heterogeneity of causal effects in air pollution epidemiology.","authors":"Dafne Zorzetto, Falco J Bargagli-Stoffi, Antonio Canale, Francesca Dominici","doi":"10.1093/biomtc/ujae025","DOIUrl":"https://doi.org/10.1093/biomtc/ujae025","url":null,"abstract":"<p><p>Several epidemiological studies have provided evidence that long-term exposure to fine particulate matter (pm2.5) increases mortality rate. Furthermore, some population characteristics (e.g., age, race, and socioeconomic status) might play a crucial role in understanding vulnerability to air pollution. To inform policy, it is necessary to identify groups of the population that are more or less vulnerable to air pollution. In causal inference literature, the group average treatment effect (GATE) is a distinctive facet of the conditional average treatment effect. This widely employed metric serves to characterize the heterogeneity of a treatment effect based on some population characteristics. In this paper, we introduce a novel Confounder-Dependent Bayesian Mixture Model (CDBMM) to characterize causal effect heterogeneity. More specifically, our method leverages the flexibility of the dependent Dirichlet process to model the distribution of the potential outcomes conditionally to the covariates and the treatment levels, thus enabling us to: (i) identify heterogeneous and mutually exclusive population groups defined by similar GATEs in a data-driven way, and (ii) estimate and characterize the causal effects within each of the identified groups. Through simulations, we demonstrate the effectiveness of our method in uncovering key insights about treatment effects heterogeneity. We apply our method to claims data from Medicare enrollees in Texas. We found six mutually exclusive groups where the causal effects of pm2.5 on mortality rate are heterogeneous.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-03-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11028589/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140874452","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Sarah C Lotspeich, Brian D Richardson, Pedro L Baldoni, Kimberly P Enders, Michael G Hudgens
People living with HIV on antiretroviral therapy often have undetectable virus levels by standard assays, but "latent" HIV still persists in viral reservoirs. Eliminating these reservoirs is the goal of HIV cure research. The quantitative viral outgrowth assay (QVOA) is commonly used to estimate the reservoir size, that is, the infectious units per million (IUPM) of HIV-persistent resting CD4+ T cells. A new variation of the QVOA, the ultra deep sequencing assay of the outgrowth virus (UDSA), was recently developed that further quantifies the number of viral lineages within a subset of infected wells. Performing the UDSA on a subset of wells provides additional information that can improve IUPM estimation. This paper considers statistical inference about the IUPM from combined dilution assay (QVOA) and deep viral sequencing (UDSA) data, even when some deep sequencing data are missing. Methods are proposed to accommodate assays with wells sequenced at multiple dilution levels and with imperfect sensitivity and specificity, and a novel bias-corrected estimator is included for small samples. The proposed methods are evaluated in a simulation study, applied to data from the University of North Carolina HIV Cure Center, and implemented in the open-source R package SLDeepAssay.
{"title":"Quantifying the HIV reservoir with dilution assays and deep viral sequencing.","authors":"Sarah C Lotspeich, Brian D Richardson, Pedro L Baldoni, Kimberly P Enders, Michael G Hudgens","doi":"10.1093/biomtc/ujad018","DOIUrl":"10.1093/biomtc/ujad018","url":null,"abstract":"<p><p>People living with HIV on antiretroviral therapy often have undetectable virus levels by standard assays, but \"latent\" HIV still persists in viral reservoirs. Eliminating these reservoirs is the goal of HIV cure research. The quantitative viral outgrowth assay (QVOA) is commonly used to estimate the reservoir size, that is, the infectious units per million (IUPM) of HIV-persistent resting CD4+ T cells. A new variation of the QVOA, the ultra deep sequencing assay of the outgrowth virus (UDSA), was recently developed that further quantifies the number of viral lineages within a subset of infected wells. Performing the UDSA on a subset of wells provides additional information that can improve IUPM estimation. This paper considers statistical inference about the IUPM from combined dilution assay (QVOA) and deep viral sequencing (UDSA) data, even when some deep sequencing data are missing. Methods are proposed to accommodate assays with wells sequenced at multiple dilution levels and with imperfect sensitivity and specificity, and a novel bias-corrected estimator is included for small samples. The proposed methods are evaluated in a simulation study, applied to data from the University of North Carolina HIV Cure Center, and implemented in the open-source R package SLDeepAssay.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10873562/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"139745915","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Epidemiological studies based on 2-phase designs help ensure efficient use of limited resources in situations where certain covariates are prohibitively expensive to measure for a full cohort. Typically, these designs involve 2 steps: In phase I, data on an outcome and inexpensive covariates are acquired, and in phase II, a subsample is chosen in which the costly variable of interest is measured. For right-censored data, 2-phase designs have been primarily based on the Cox model. We develop efficient 2-phase design strategies for settings involving a fraction of long-term survivors due to nonsusceptibility. Using mixture models accommodating a nonsusceptible fraction, we consider 3 regression frameworks, including (a) a logistic "cure" model, (b) a proportional hazards model for those who are susceptible, and (c) regression models for susceptibility and failure time in those susceptible. Importantly, we introduce a novel class of bivariate residual-dependent designs to address the unique challenges presented in scenario (c), which involves 2 parameters of interest. Extensive simulation studies demonstrate the superiority of our approach over various phase II subsampling schemes. We illustrate the method through applications to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.
{"title":"Two-phase designs with failure time processes subject to nonsusceptibility.","authors":"Fangya Mao, Li C Cheung, Richard J Cook","doi":"10.1093/biomtc/ujad038","DOIUrl":"10.1093/biomtc/ujad038","url":null,"abstract":"<p><p>Epidemiological studies based on 2-phase designs help ensure efficient use of limited resources in situations where certain covariates are prohibitively expensive to measure for a full cohort. Typically, these designs involve 2 steps: In phase I, data on an outcome and inexpensive covariates are acquired, and in phase II, a subsample is chosen in which the costly variable of interest is measured. For right-censored data, 2-phase designs have been primarily based on the Cox model. We develop efficient 2-phase design strategies for settings involving a fraction of long-term survivors due to nonsusceptibility. Using mixture models accommodating a nonsusceptible fraction, we consider 3 regression frameworks, including (a) a logistic \"cure\" model, (b) a proportional hazards model for those who are susceptible, and (c) regression models for susceptibility and failure time in those susceptible. Importantly, we introduce a novel class of bivariate residual-dependent designs to address the unique challenges presented in scenario (c), which involves 2 parameters of interest. Extensive simulation studies demonstrate the superiority of our approach over various phase II subsampling schemes. We illustrate the method through applications to the Prostate, Lung, Colorectal, and Ovarian Cancer Screening Trial.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140038631","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Serge Aleshin-Guendel, Mauricio Sadinle, Jon Wakefield
We organize the discussants' major comments into the following categories: sensitivity analyses, zero counts, model selection, the marginal no-highest-order interaction (NHOI) assumption, and the usefulness of our proposed framework.
{"title":"Rejoinder to the discussion on \"The central role of the identifying assumption in population size estimation\".","authors":"Serge Aleshin-Guendel, Mauricio Sadinle, Jon Wakefield","doi":"10.1093/biomtc/ujad033","DOIUrl":"10.1093/biomtc/ujad033","url":null,"abstract":"<p><p>We organize the discussants' major comments into the following categories: sensitivity analyses, zero counts, model selection, the marginal no-highest-order interaction (NHOI) assumption, and the usefulness of our proposed framework.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140058584","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Ce Yang, Benjamin Langworthy, Sharon Curhan, Kenneth I Vaden, Gary Curhan, Judy R Dubno, Molin Wang
Age-related hearing loss has a complex etiology. Researchers have made efforts to classify relevant audiometric phenotypes, aiming to enhance medical interventions and improve hearing health. We leveraged existing pattern analyses of age-related hearing loss and implemented the phenotype classification via quadratic discriminant analysis (QDA). We herein propose a method for analyzing exposure effects on the soft classification probabilities of the phenotypes via estimating equations. Under reasonable assumptions, the estimating equations are unbiased and lead to consistent estimators. The resulting estimator had good finite-sample performance in simulation studies. As an illustrative example, we applied our proposed methods to assess the association between a dietary intake pattern, assessed as adherence scores for the Dietary Approaches to Stop Hypertension (DASH) diet calculated using validated food-frequency questionnaires, and audiometric phenotypes (older-normal, metabolic, sensory, and metabolic plus sensory), determined from data obtained in the Audiology Assessment Arm of the Nurses' Health Study II Conservation of Hearing Study. Our findings suggested that participants with a more healthful dietary pattern were less likely to develop the metabolic plus sensory phenotype of age-related hearing loss.
{"title":"Soft classification and regression analysis of audiometric phenotypes of age-related hearing loss.","authors":"Ce Yang, Benjamin Langworthy, Sharon Curhan, Kenneth I Vaden, Gary Curhan, Judy R Dubno, Molin Wang","doi":"10.1093/biomtc/ujae013","DOIUrl":"10.1093/biomtc/ujae013","url":null,"abstract":"<p><p>Age-related hearing loss has a complex etiology. Researchers have made efforts to classify relevant audiometric phenotypes, aiming to enhance medical interventions and improve hearing health. We leveraged existing pattern analyses of age-related hearing loss and implemented the phenotype classification via quadratic discriminant analysis (QDA). We herein propose a method for analyzing the exposure effects on the soft classification probabilities of the phenotypes via estimating equations. Under reasonable assumptions, the estimating equations are unbiased and lead to consistent estimators. The resulting estimator had good finite sample performances in simulation studies. As an illustrative example, we applied our proposed methods to assess the association between a dietary intake pattern, assessed as adherence scores for the dietary approaches to stop hypertension diet calculated using validated food-frequency questionnaires, and audiometric phenotypes (older-normal, metabolic, sensory, and metabolic plus sensory), determined based on data obtained in the Nurses' Health Study II Conservation of Hearing Study, the Audiology Assessment Arm. Our findings suggested that participants with a more healthful dietary pattern were less likely to develop the metabolic plus sensory phenotype of age-related hearing loss.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.9,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10941322/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140130630","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hillary M Heiling, Naim U Rashid, Quefeng Li, Xianlu L Peng, Jen Jen Yeh, Joseph G Ibrahim
Modern biomedical datasets are increasingly high-dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the fixed and random effects in GLMMs is increasingly difficult in high dimensions, and computational complexity grows with increasing dimension of the random effects. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in high dimensions by reducing the latent space from a large number of random effects to a smaller set of latent factors. We also extend our prior work to estimate model parameters using a modified Monte Carlo Expectation Conditional Minimization algorithm, allowing us to perform variable selection on both the fixed and random effects simultaneously. We show through simulation that, with this factor model decomposition, our method can fit high-dimensional penalized GLMMs faster than comparable methods and scale more easily to dimensions beyond the reach of existing approaches.
{"title":"Efficient computation of high-dimensional penalized generalized linear mixed models by latent factor modeling of the random effects.","authors":"Hillary M Heiling, Naim U Rashid, Quefeng Li, Xianlu L Peng, Jen Jen Yeh, Joseph G Ibrahim","doi":"10.1093/biomtc/ujae016","DOIUrl":"10.1093/biomtc/ujae016","url":null,"abstract":"<p><p>Modern biomedical datasets are increasingly high-dimensional and exhibit complex correlation structures. Generalized linear mixed models (GLMMs) have long been employed to account for such dependencies. However, proper specification of the fixed and random effects in GLMMs is increasingly difficult in high dimensions, and computational complexity grows with increasing dimension of the random effects. We present a novel reformulation of the GLMM using a factor model decomposition of the random effects, enabling scalable computation of GLMMs in high dimensions by reducing the latent space from a large number of random effects to a smaller set of latent factors. We also extend our prior work to estimate model parameters using a modified Monte Carlo Expectation Conditional Minimization algorithm, allowing us to perform variable selection on both the fixed and random effects simultaneously. We show through simulation that through this factor model decomposition, our method can fit high-dimensional penalized GLMMs faster than comparable methods and more easily scale to larger dimensions not previously seen in existing approaches.</p>","PeriodicalId":8930,"journal":{"name":"Biometrics","volume":null,"pages":null},"PeriodicalIF":1.4,"publicationDate":"2024-01-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10946237/pdf/","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"140142743","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"数学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}