Biostatistics: Latest Articles

Estimation and inference for causal spillover effects in egocentric-network randomized trials in the presence of network membership misclassification.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxaf009
Ariel Chao, Donna Spiegelman, Ashley Buchanan, Laura Forastiere

To leverage peer influence and increase population behavioral changes, behavioral interventions often rely on peer-based strategies. A common study design that assesses such strategies is the egocentric-network randomized trial (ENRT), where index participants receive a behavioral training and are encouraged to disseminate information to their peers. Under this design, a crucial estimand of interest is the Average Spillover Effect (ASpE), which measures the impact of the intervention on participants who do not receive it, but whose outcomes may be affected by others who do. The assessment of the ASpE relies on assumptions about, and correct measurement of, interference sets within which individuals may influence one another's outcomes. It can be challenging to properly specify interference sets, such as networks in ENRTs, and when mismeasured, intervention effects estimated by existing methods will be biased. In studies where social networks play an important role in disease transmission or behavior change, correcting ASpE estimates for bias due to network misclassification is critical for accurately evaluating the full impact of interventions. We combined measurement error and causal inference methods to bias-correct the ASpE estimate for network misclassification in ENRTs, when surrogate networks are recorded in place of true ones, and validation data that relate the misclassified to the true networks are available. We investigated finite sample properties of our methods in an extensive simulation study and illustrated our methods in the HIV Prevention Trials Network (HPTN) 037 study.

Biostatistics, vol. 26, no. 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11955068/pdf/
Citations: 0
Sensitivity analysis for the probability of benefit in randomized controlled trials with a binary treatment and a binary outcome.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxaf011
Iuliana Ciocănea-Teodorescu, Erin E Gabriel, Arvid Sjölander

For a comprehensive understanding of the effect of a given treatment on an outcome of interest, quantification of individual treatment heterogeneity is essential, alongside estimation of the average causal effect. However, even in randomized controlled trials, quantities such as the probability of benefit or the probability of harm are not identifiable, since multiple potential outcomes cannot be observed simultaneously for the same individual. We propose a sensitivity analysis for the probability of benefit in randomized controlled trial settings with a binary treatment and a binary outcome, by quantifying the deviation from conditional independence of the two potential outcomes, given a set of measured prognostic baseline covariates. We do this using a marginal sensitivity analysis parameter that does not depend on the number or complexity of the measured covariates. We provide a guide to estimation and interpretation, and illustrate our method in simulations, as well as using a real data example from a randomized controlled trial studying the effect of umbilical vein oxytocin administration on the need for manual removal of the placenta during birth.
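
The non-identifiability described above can be made concrete with the classical Fréchet-Hoeffding bounds. The following is a minimal sketch, not the authors' sensitivity analysis: it ignores covariates, assumes that outcome 1 is the favourable one (so "benefit" means Y(1)=1 and Y(0)=0), and uses a hypothetical function name. It shows the range of values the probability of benefit can take given only the two arm-specific outcome probabilities, together with the value implied by independent potential outcomes.

```python
from typing import Tuple


def probability_of_benefit_bounds(p_treated: float, p_control: float) -> Tuple[float, float, float]:
    """Return (lower, independent, upper) for P(Y(1)=1, Y(0)=0).

    p_treated = P(Y=1 | treated) and p_control = P(Y=1 | control); under
    randomization these identify the marginals P(Y(1)=1) and P(Y(0)=1),
    but not the joint distribution of the two potential outcomes.
    """
    if not (0.0 <= p_treated <= 1.0 and 0.0 <= p_control <= 1.0):
        raise ValueError("probabilities must lie in [0, 1]")
    lower = max(0.0, p_treated - p_control)      # Frechet-Hoeffding lower bound
    upper = min(p_treated, 1.0 - p_control)      # Frechet-Hoeffding upper bound
    independent = p_treated * (1.0 - p_control)  # value if Y(1) and Y(0) are independent
    return lower, independent, upper
```

A sensitivity parameter of the kind the abstract mentions would move the estimate away from the independence value within these bounds; how the paper parameterizes that deviation is not stated in the abstract.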

Biostatistics, vol. 26, no. 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12129078/pdf/
Citations: 0
A filtering approach for statistical inference in a stochastic SIR model with an application to Covid-19 data.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxaf036
Katia Colaneri, Camilla Damian, Rüdiger Frey

In this paper, we consider a discrete-time stochastic SIR model, where the transmission rate and the number of infectious individuals are random and unobservable. This model accounts for random fluctuations in infectiousness and for non-detected infections. Thus, statistical inference has to be performed in a partial information setting. We adopt a Bayesian approach and use nested particle filtering to estimate the state of the system and the parameters. Moreover, we discuss forecasts and model tests based on the posterior predictive distribution. As a case study, we apply our methodology to Austrian Covid-19 infection data.
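
To illustrate the partial-information setting, here is a minimal simulation sketch of a discrete-time stochastic SIR model in which the transmission rate fluctuates randomly and only some new infections are detected. It is not the authors' model or their nested particle filter: the log-normal transmission rate, the chain-binomial transitions, the function name, and every parameter value are assumptions chosen for illustration.

```python
import math
import random


def simulate_sir(n_steps, population, initial_infected, seed=0,
                 mean_log_beta=math.log(0.3), sd_log_beta=0.2,
                 recovery_prob=0.1, detection_prob=0.5):
    """Simulate a discrete-time stochastic SIR model with partial observation.

    All parameter values are illustrative assumptions, not taken from the paper.
    Returns (states, observed): states is a list of (S, I, R) tuples and
    observed is the list of detected new infections per step.
    """
    rng = random.Random(seed)

    def binomial(n, p):
        # Sum of Bernoulli draws; adequate for small illustrative populations.
        return sum(1 for _ in range(n) if rng.random() < p)

    s, i, r = population - initial_infected, initial_infected, 0
    states, observed = [(s, i, r)], []
    for _ in range(n_steps):
        # Random, unobserved transmission rate (log-normal noise around a mean).
        beta = math.exp(rng.gauss(mean_log_beta, sd_log_beta))
        infection_prob = 1.0 - math.exp(-beta * i / population)
        new_infections = binomial(s, infection_prob)
        new_recoveries = binomial(i, recovery_prob)
        s, i, r = s - new_infections, i + new_infections - new_recoveries, r + new_recoveries
        states.append((s, i, r))
        # Only a fraction of new infections is detected and reported.
        observed.append(binomial(new_infections, detection_prob))
    return states, observed
```

An analyst sees only `observed`; the hidden `states` and the per-step transmission rate are what a filtering method would have to reconstruct.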

Biostatistics, vol. 26, no. 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12554006/pdf/
Citations: 0
Determining vaccine responders in the presence of baseline immunity using single-cell assays and paired control samples.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxaf045
Zhe Chen, Siyu Heng, Asa Tapley, Stephen De Rosa, Bo Zhang

A key objective in vaccine studies is to evaluate vaccine-induced immunogenicity and determine whether participants have mounted a response to the vaccine. Cellular immune responses are essential for assessing vaccine-induced immunogenicity, and single-cell assays, such as intracellular cytokine staining (ICS) and B-cell phenotyping (BCP), are commonly employed to profile individual immune cell phenotypes and the cytokines they produce after stimulation. In this article, we introduce a novel statistical framework for identifying vaccine responders using ICS data collected before and after vaccination. This framework incorporates paired control data to account for potential unintended variations between assay runs, such as batch effects, that could lead to misclassification of participants as vaccine responders or non-responders. To formally integrate paired control data to account for assay variation across different time points (i.e., before and after vaccination), our proposed framework calculates and reports two $P$-values, both adjusting for paired control data but in distinct ways: (i) the maximally adjusted $P$-value, which applies the most conservative adjustment to the unadjusted $P$-value, ensuring validity over all plausible batch effects consistent with the paired control samples' data, and (ii) the minimally adjusted $P$-value, which imposes only the minimal adjustment to the unadjusted $P$-value, such that the adjusted $P$-value cannot be falsified by the paired control samples' data. Minimally and maximally adjusted $P$-values offer a balanced approach to managing Type I error rates and statistical power in the presence of batch effects. We apply this framework to analyze ICS data collected at baseline and 4 weeks post-vaccination from the COVID-19 Prevention Network (CoVPN) 3008 study. Our analysis helps address two clinical questions: (i) which participants exhibited evidence of an incident Omicron infection between baseline and 4 weeks after receiving the final dose of the primary vaccination series, and (ii) which participants showed vaccine-induced T cell responses against the Omicron BA.4/5 Spike protein.

Biostatistics, vol. 26, no. 1. Journal Article.
Citations: 0
Decomposition of longitudinal disparities: an application to the fetal growth-singletons study.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxaf044
Sang Kyu Lee, Seonjin Kim, Mi-Ok Kim, Katherine L Grantz, Hyokyoung G Hong

Addressing health disparities across demographic groups remains a critical challenge in public health, with significant gaps in understanding how these disparities evolve over time. This paper extends the traditional Peters-Belson decomposition to a longitudinal setting, focusing on the role of a single explanatory variable, referred to as a modifier, that captures complex interactions with other covariates. The proposed method partitions disparities into 3 components: (i) the portion associated with differences in the conditional distribution of covariates, evaluated under a common distribution of the modifier across groups; (ii) the portion arising from differences in the distribution of the modifier and its interactions with other covariates; and (iii) the unexplained disparity not accounted for by observed covariates. Rather than aggregating the first 2 components into one "explained disparity," the proposed method allows for a separate characterization of temporal patterns in disparities, distinguishing those that are unassociated with the modifier from those that are associated with it. We illustrate the method using a fetal growth study, examining disparities in fetal development trajectories across racial and ethnic groups during pregnancy.
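
For orientation, the traditional cross-sectional Peters-Belson decomposition that the paper extends can be sketched as follows. This is a simplified illustration with one covariate and a straight-line model, not the longitudinal three-component method of the paper; the function names are hypothetical. A model is fitted in the reference group only, applied to the comparison group's covariates, and the mean outcome gap is split into an explained and an unexplained part.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def peters_belson(x_ref, y_ref, x_cmp, y_cmp):
    """Split the mean outcome gap (reference minus comparison group).

    The model is fitted in the reference group only and then applied to the
    comparison group's covariates.
    """
    intercept, slope = fit_line(x_ref, y_ref)
    mean_ref = sum(y_ref) / len(y_ref)
    mean_cmp = sum(y_cmp) / len(y_cmp)
    # Expected comparison-group mean if it followed the reference-group model.
    counterfactual = sum(intercept + slope * xi for xi in x_cmp) / len(x_cmp)
    explained = mean_ref - counterfactual    # due to covariate differences
    unexplained = counterfactual - mean_cmp  # not accounted for by covariates
    return explained, unexplained
```

The two parts always sum to the raw gap in group means; the paper's contribution, per the abstract, is to split the explained part further according to a modifier and to track all components over time.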

Biostatistics, vol. 26, no. 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12701353/pdf/
Citations: 0
Functional quantile principal component analysis.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxae040
Álvaro Méndez-Civieta, Ying Wei, Keith M Diaz, Jeff Goldsmith

This paper introduces functional quantile principal component analysis (FQPCA), a dimensionality reduction technique that extends the concept of functional principal components analysis (FPCA) to the examination of participant-specific quantile curves. Our approach borrows strength across participants to estimate patterns in quantiles, and uses participant-level data to estimate loadings on those patterns. As a result, FQPCA is able to capture shifts in the scale and distribution of data that affect participant-level quantile curves, and is also a robust methodology suitable for dealing with outliers, heteroscedastic data, or skewed data. The need for such methodology is exemplified by physical activity data collected using wearable devices. Participants often differ in the timing and intensity of physical activity behaviors, and capturing information beyond the participant-level expected value curves produced by FPCA is necessary for a robust quantification of diurnal patterns of activity. We illustrate our methods using accelerometer data from the National Health and Nutrition Examination Survey, and produce participant-level 10%, 50%, and 90% quantile curves over 24 h of activity. The proposed methodology is supported by simulation results, and is available as an R package.
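
Quantile estimation is commonly framed as minimizing the check (pinball) loss, which is one reason quantile-based summaries are less sensitive to outliers than means. The sketch below illustrates that general idea for a single sample; it is not the FQPCA algorithm or the API of its R package, the abstract does not state how FQPCA is fitted, and the function names are hypothetical.

```python
def check_loss(residual, tau):
    """Quantile (pinball) loss for a single residual at level tau."""
    return residual * (tau - (1.0 if residual < 0 else 0.0))


def empirical_quantile(values, tau):
    """Return the sample value that minimizes the total check loss."""
    return min(values, key=lambda q: sum(check_loss(v - q, tau) for v in values))
```

For the sample 1, 2, ..., 10, 100, `empirical_quantile(values, 0.5)` returns the median 6, unaffected by the outlier, whereas the sample mean is pulled up to about 14.1.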

Biostatistics. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823270/pdf/
Citations: 0
Understanding the opioid syndemic in North Carolina: A novel approach to modeling and identifying factors.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxae052
Eva Murphy, David Kline, Kathleen L Egan, Kathryn E Lancaster, William C Miller, Lance A Waller, Staci A Hepler

The opioid epidemic is a significant public health challenge in North Carolina, but limited data restrict our understanding of its complexity. Examining trends and relationships among different outcomes believed to reflect opioid misuse provides an alternative perspective to understand the opioid epidemic. We use a Bayesian dynamic spatial factor model to capture the interrelated dynamics within six different county-level outcomes, such as illicit opioid overdose deaths, emergency department visits related to drug overdose, treatment counts for opioid use disorder, patients receiving prescriptions for buprenorphine, and newly diagnosed cases of acute and chronic hepatitis C virus and human immunodeficiency virus. We design the factor model to yield meaningful interactions among predefined subsets of these outcomes, causing a departure from the conventional lower triangular structure in the loadings matrix and leading to familiar identifiability issues. To address this challenge, we propose a novel approach that involves decomposing the loadings matrix within a Markov chain Monte Carlo algorithm, allowing us to estimate the loadings and factors uniquely. As a result, we gain a better understanding of the spatio-temporal dynamics of the opioid epidemic in North Carolina.

Biostatistics, vol. 26, no. 1. Journal Article. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823283/pdf/
Citations: 0
Probabilistic clustering using shared latent variable model for assessing Alzheimer's disease biomarkers.
IF 2, Zone 3 (Mathematics), Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY, Pub Date: 2025-12-31, DOI: 10.1093/biostatistics/kxaf010
Yizhen Xu, Scott Zeger, Zheyu Wang

The preclinical stage of many neurodegenerative diseases can span decades before symptoms become apparent. Understanding the sequence of preclinical biomarker changes provides a critical opportunity for early diagnosis and effective intervention prior to significant loss of patients' brain functions. The main challenge to early detection lies in the absence of direct observation of the disease state and the considerable variability in both biomarkers and disease dynamics among individuals. Recent research hypothesized the existence of subgroups with distinct biomarker patterns due to co-morbidities and degrees of brain resilience. Our ability to diagnose early and intervene during the preclinical stage of neurodegenerative diseases will be enhanced by further insights into heterogeneity in the biomarker-disease relationship. In this article, we focus on Alzheimer's disease (AD) and attempt to identify the systematic patterns within the heterogeneous AD biomarker-disease cascade. Specifically, we quantify the disease progression using a dynamic latent variable whose mixture distribution represents patient subgroups. Model estimation uses Hamiltonian Monte Carlo with the number of clusters determined by the Bayesian Information Criterion. We report simulation studies that investigate the performance of the proposed model in finite sample settings that are similar to our motivating application. We apply the proposed model to the Biomarkers of Cognitive Decline Among Normal Individuals data, a longitudinal study that was conducted over 2 decades among individuals who were initially cognitively normal. Our application yields evidence consistent with the hypothetical model of biomarker dynamics presented in Jack Jr et al. In addition, our analysis identified 2 subgroups with distinct disease-onset patterns. Finally, we develop a dynamic prediction approach to improve the precision of prognoses.

Distributed lag interaction model with index modification.
IF 2 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-12-31 DOI: 10.1093/biostatistics/kxaf017
Danielle Demateis, Sandra India-Aldana, Robert O Wright, Rosalind J Wright, Andrea Baccarelli, Elena Colicino, Ander Wilson, Kayleigh P Keller

Epidemiological evidence supports an association between exposure to air pollution during pregnancy and birth and child health outcomes. Typically, such associations are estimated by regressing an outcome on daily or weekly measures of exposure during pregnancy using a distributed lag model. However, these associations may be modified by multiple factors. We propose a distributed lag interaction model with index modification that allows for effect modification of a functional predictor by a weighted average of multiple modifiers. Our model allows for simultaneous estimation of modifier index weights and the exposure-time-response function via a spline cross-basis in a Bayesian hierarchical framework. Through simulations, we showed that our model outperforms competing methods when there are multiple modifiers of unknown importance. We applied our proposed method to a Colorado birth cohort to estimate the association between birth weight and air pollution modified by a neighborhood-vulnerability index and to a Mexican birth cohort to estimate the association between birthing-parent cardio-metabolic endpoints and air pollution modified by a birthing-parent lifetime stress index.
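
A distributed lag model of the kind described above can be written schematically as follows. The first display is the standard linear form. The second is only a rough sketch of how a weighted index of modifiers might enter; all symbols ($x_{it}$, $\theta_t$, $m_{ik}$, $w_k$) and the constraints on the weights are assumptions made here for illustration, not the authors' specification.

```latex
% Standard distributed lag model: outcome regressed on exposures at lags t = 1..T
Y_i = \alpha + \sum_{t=1}^{T} x_{it}\,\theta_t
      + \mathbf{z}_i^{\top}\boldsymbol{\gamma} + \varepsilon_i

% Schematic index modification (assumed form): the lag effects depend on a
% weighted average m_i^* of K candidate modifiers
Y_i = \alpha + \sum_{t=1}^{T} x_{it}\,\theta_t(m_i^{*})
      + \mathbf{z}_i^{\top}\boldsymbol{\gamma} + \varepsilon_i,
\qquad
m_i^{*} = \sum_{k=1}^{K} w_k\, m_{ik},
\quad w_k \ge 0,\ \sum_{k=1}^{K} w_k = 1
```

In this notation, $x_{it}$ is the exposure of subject $i$ at time $t$, $\mathbf{z}_i$ collects covariates, and the weights $w_k$ express the relative importance of each modifier in the index.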

Addressing the mean-variance relationship in spatially resolved transcriptomics data with spoon.
IF 2 Zone 3 Mathematics Q3 MATHEMATICAL & COMPUTATIONAL BIOLOGY Pub Date: 2025-12-31 DOI: 10.1093/biostatistics/kxaf012
Kinnary Shah, Boyi Guo, Stephanie C Hicks

An important task in the analysis of spatially resolved transcriptomics (SRT) data is to identify spatially variable genes (SVGs), or genes that vary in a 2D space. Current approaches rank SVGs based on either $P$-values or an effect size, such as the proportion of spatial variance. However, previous work in the analysis of RNA-sequencing data identified a technical bias with log-transformation, violating the "mean-variance relationship" of gene counts, where highly expressed genes are more likely to have a higher variance in counts but lower variance after log-transformation. Here, we demonstrate the mean-variance relationship in SRT data. Furthermore, we propose spoon, a statistical framework using empirical Bayes techniques to remove this bias, leading to more accurate prioritization of SVGs. We demonstrate the performance of spoon in both simulated and real SRT data. A software implementation of our method is available at https://bioconductor.org/packages/spoon.
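
The mean-variance relationship described above can be illustrated with a small simulation. This is a minimal sketch and not the spoon method: it assumes Poisson-distributed counts (real SRT counts are typically overdispersed), and the function names, the two mean expression levels, and the number of spots are hypothetical choices made for illustration.

```python
import math
import random
import statistics


def sample_poisson(lam, rng):
    """Draw one Poisson(lam) variate using Knuth's multiplication method."""
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def count_and_log_variance(mean_expr, n_spots, rng):
    """Return (variance of raw counts, variance of log2(count + 1))."""
    counts = [sample_poisson(mean_expr, rng) for _ in range(n_spots)]
    logged = [math.log2(c + 1) for c in counts]
    return statistics.variance(counts), statistics.variance(logged)


# Hypothetical low- and high-expression genes, 2000 simulated spots each.
rng = random.Random(0)
low_raw, low_log = count_and_log_variance(2.0, 2000, rng)
high_raw, high_log = count_and_log_variance(200.0, 2000, rng)

print(f"low-expression gene:  raw var={low_raw:.2f}, log var={low_log:.4f}")
print(f"high-expression gene: raw var={high_raw:.2f}, log var={high_log:.4f}")
```

Under these assumptions, the highly expressed gene shows the larger variance on the raw count scale and the smaller variance after the log transformation, which is the pattern the abstract describes.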
