
Annals of Applied Statistics: Latest Articles

AN OMNIBUS TEST FOR DETECTION OF SUBGROUP TREATMENT EFFECTS VIA DATA PARTITIONING.
IF 1.8; Zone 4 (Mathematics); Q1 Mathematics; Pub Date: 2022-12-01; Epub Date: 2022-09-26; DOI: 10.1214/21-AOAS1589
Yifei Sun, Xuming He, Jianhua Hu

Late-stage clinical trials have been conducted primarily to establish the efficacy of a new treatment in an intended population. A corollary of population heterogeneity in clinical trials is that a treatment might be effective for one or more subgroups, rather than for the whole population of interest. As an example, the phase III clinical trial of panitumumab in metastatic colorectal cancer patients failed to demonstrate its efficacy in the overall population, but a subgroup associated with tumor KRAS status was found to be promising (Peeters et al. (Am. J. Clin. Oncol. 28 (2010) 4706-4713)). As we search for such subgroups via data partitioning based on a large number of biomarkers, we need to guard against inflated type I error rates due to multiple testing. Commonly-used multiplicity adjustments tend to lose power for the detection of subgroup treatment effects. We develop an effective omnibus test to detect the existence of, at least, one subgroup treatment effect, allowing a large number of possible subgroups to be considered and possibly censored outcomes. Applied to the panitumumab trial data, the proposed test would confirm a significant subgroup treatment effect. Empirical studies also show that the proposed test is applicable to a variety of outcome variables and maintains robust statistical power.
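To make the contrast with per-subgroup multiplicity adjustments concrete, the following is a minimal sketch of a generic max-statistic permutation test over biomarker-defined subgroups. It is not the test proposed in the paper, which also accommodates censored outcomes. The function names, the partitioning on binary biomarkers, the uncensored continuous outcome, and the simulated data are all assumptions made for illustration.

```python
import math
import random


def welch_z(treated, control):
    """Absolute standardized mean difference; 0.0 if a group is too small."""
    if len(treated) < 2 or len(control) < 2:
        return 0.0
    mt = sum(treated) / len(treated)
    mc = sum(control) / len(control)
    vt = sum((y - mt) ** 2 for y in treated) / (len(treated) - 1)
    vc = sum((y - mc) ** 2 for y in control) / (len(control) - 1)
    se = math.sqrt(vt / len(treated) + vc / len(control))
    return abs(mt - mc) / se if se > 0 else 0.0


def max_subgroup_stat(y, trt, markers):
    """Largest |z| over the subgroups {marker_j == 0} and {marker_j == 1}."""
    best = 0.0
    for j in range(len(markers[0])):
        for level in (0, 1):
            idx = [i for i in range(len(y)) if markers[i][j] == level]
            treated = [y[i] for i in idx if trt[i] == 1]
            control = [y[i] for i in idx if trt[i] == 0]
            best = max(best, welch_z(treated, control))
    return best


def omnibus_permutation_pvalue(y, trt, markers, n_perm=200, seed=0):
    """One p-value for 'some subgroup has a treatment effect'.

    Treatment labels are permuted, so the null distribution of the maximum
    already accounts for searching over all candidate subgroups.
    """
    rng = random.Random(seed)
    observed = max_subgroup_stat(y, trt, markers)
    perm = list(trt)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(perm)
        if max_subgroup_stat(y, perm, markers) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)


def simulate(n, effect, seed):
    """Hypothetical trial: the treatment only helps when biomarker 0 equals 1."""
    rng = random.Random(seed)
    markers = [[rng.randint(0, 1) for _ in range(5)] for _ in range(n)]
    trt = [i % 2 for i in range(n)]
    y = [rng.gauss(0.0, 1.0) + effect * trt[i] * markers[i][0] for i in range(n)]
    return y, trt, markers


if __name__ == "__main__":
    y, trt, markers = simulate(200, 2.0, seed=1)
    print(omnibus_permutation_pvalue(y, trt, markers))
```

Because the reference distribution is that of the maximum statistic, a single p-value is reported and no Bonferroni-style correction across subgroups is applied afterwards.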

Citations: 0
CRITICAL WINDOW VARIABLE SELECTION FOR MIXTURES: ESTIMATING THE IMPACT OF MULTIPLE AIR POLLUTANTS ON STILLBIRTH.
IF 1.8; Zone 4 (Mathematics); Q1 Mathematics; Pub Date: 2022-09-01; DOI: 10.1214/21-aoas1560
Joshua L Warren, Howard H Chang, Lauren K Warren, Matthew J Strickland, Lyndsey A Darrow, James A Mulholland

Understanding the role of time-varying pollution mixtures on human health is critical as people are simultaneously exposed to multiple pollutants during their lives. For vulnerable subpopulations who have well-defined exposure periods (e.g., pregnant women), questions regarding critical windows of exposure to these mixtures are important for mitigating harm. We extend critical window variable selection (CWVS) to the multipollutant setting by introducing CWVS for mixtures (CWVSmix), a hierarchical Bayesian method that combines smoothed variable selection and temporally correlated weight parameters to: (i) identify critical windows of exposure to mixtures of time-varying pollutants, (ii) estimate the time-varying relative importance of each individual pollutant and their first order interactions within the mixture, and (iii) quantify the impact of the mixtures on health. Through simulation we show that CWVSmix offers the best balance of performance in each of these categories in comparison to competing methods. Using these approaches, we investigate the impact of exposure to multiple ambient air pollutants on the risk of stillbirth in New Jersey, 2005-2014. We find consistent elevated risk in gestational weeks 2, 16-17, and 20 for non-Hispanic Black mothers, with pollution mixtures dominated by ammonium (weeks 2, 17, 20), nitrate (weeks 2, 17), nitrogen oxides (weeks 2, 16), PM2.5 (week 2), and sulfate (week 20). The method is available in the R package CWVSmix.

Citations: 0
LARGE-SCALE MULTIVARIATE SPARSE REGRESSION WITH APPLICATIONS TO UK BIOBANK.
IF 1.8; Zone 4 (Mathematics); Q1 Mathematics; Pub Date: 2022-09-01; DOI: 10.1214/21-aoas1575
Junyang Qian, Yosuke Tanigawa, Ruilin Li, Robert Tibshirani, Manuel A Rivas, Trevor Hastie

In high-dimensional regression problems, often a relatively small subset of the features is relevant for predicting the outcome, and methods that impose sparsity on the solution are popular. When multiple correlated outcomes are available (multitask), reduced rank regression is an effective way to borrow strength and capture latent structures that underlie the data. Our proposal is motivated by the UK Biobank population-based cohort study, where we are faced with large-scale, ultrahigh-dimensional features, and have access to a large number of outcomes (phenotypes): lifestyle measures, biomarkers, and disease outcomes. We are hence led to fit sparse reduced-rank regression models, using computational strategies that allow us to scale to problems of this size. We use a scheme that alternates between solving the sparse regression problem and solving the reduced rank decomposition. For the sparse regression component we propose a scalable iterative algorithm based on adaptive screening that leverages the sparsity assumption and enables us to focus on solving much smaller subproblems. The full solution is reconstructed and tested via an optimality condition to make sure it is a valid solution for the original problem. We further extend the method to cope with practical issues, such as the inclusion of confounding variables and imputation of missing values among the phenotypes. Experiments on both synthetic data and the UK Biobank data demonstrate the effectiveness of the method and the algorithm. We present the multiSnpnet package, available at http://github.com/junyangq/multiSnpnet, which works on top of PLINK2 files and which we anticipate will be a valuable tool for generating polygenic risk scores from human genetic studies.

Citations: 3
A BAYESIAN HIERARCHICAL MODEL FOR COMBINING MULTIPLE DATA SOURCES IN POPULATION SIZE ESTIMATION.
IF 1.3; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2022-09-01; Epub Date: 2022-07-19; DOI: 10.1214/21-AOAS1556
Jacob Parsons, Xiaoyue Niu, Le Bao

To combat the HIV/AIDS pandemic effectively, targeted interventions among certain key populations play a critical role. Examples of such key populations include sex workers, people who inject drugs, and men who have sex with men. While having accurate estimates for the size of these key populations is important, any attempt to directly contact or count members of these populations is difficult. As a result, indirect methods are used to produce size estimates. Multiple approaches for estimating the size of such populations have been suggested but often give conflicting results. It is, therefore, necessary to have a principled way to combine and reconcile these estimates. To this end, we present a Bayesian hierarchical model for estimating the size of key populations that combines multiple estimates from different sources of information. The proposed model makes use of multiple years of data and explicitly models the systematic error in the data sources used. We use the model to estimate the size of the population of people who inject drugs in Ukraine. We evaluate the appropriateness of the model and compare the contribution of each data source to the final estimates.

Citations: 0
BAYESIAN SEMIPARAMETRIC LONG MEMORY MODELS FOR DISCRETIZED EVENT DATA.
IF 1.8; Zone 4 (Mathematics); Q1 Mathematics; Pub Date: 2022-09-01; Epub Date: 2022-07-19; DOI: 10.1214/21-aoas1546
Antik Chakraborty, Otso Ovaskainen, David B Dunson

We introduce a new class of semiparametric latent variable models for long memory discretized event data. The proposed methodology is motivated by a study of bird vocalizations in the Amazon rain forest; the timings of vocalizations exhibit self-similarity and long range dependence. This rules out Poisson process based models where the rate function itself is not long range dependent. The proposed class of FRActional Probit (FRAP) models is based on thresholding a latent process. This latent process is modeled by a smooth Gaussian process and a fractional Brownian motion by assuming an additive structure. We develop a Bayesian approach to inference using Markov chain Monte Carlo and show good performance in simulation studies. Applying the methods to the Amazon bird vocalization data, we find substantial evidence for self-similarity and non-Markovian/Poisson dynamics. To accommodate the bird vocalization data in which there are many different species of birds exhibiting their own vocalization dynamics, a hierarchical expansion of FRAP is provided in the Supplementary Material.
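The construction described above, a binary event series obtained by thresholding the sum of a smooth component and a fractional Brownian motion, can be sketched by forward simulation. This is only an illustration of the data-generating idea, not the FRAP model or its MCMC sampler. The sine curve standing in for a smooth Gaussian process draw, the threshold at zero, the unit time grid, and all function names are assumptions.

```python
import math
import random


def fbm_cov(s, t, hurst):
    """Covariance of fractional Brownian motion at times s and t."""
    h2 = 2 * hurst
    return 0.5 * (s ** h2 + t ** h2 - abs(t - s) ** h2)


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def simulate_thresholded_series(n, hurst, seed=0):
    """Binary series 1{smooth(t) + fBm(t) > 0} on the grid t = 1, ..., n."""
    rng = random.Random(seed)
    times = [float(i + 1) for i in range(n)]
    cov = [[fbm_cov(s, t, hurst) for t in times] for s in times]
    low = cholesky(cov)
    white = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # Correlated Gaussian draw: rough = L @ white has covariance L L^T = cov.
    rough = [sum(low[i][k] * white[k] for k in range(i + 1)) for i in range(n)]
    # Assumed stand-in for a smooth Gaussian process realization.
    smooth = [math.sin(2 * math.pi * t / n) for t in times]
    return [1 if smooth[i] + rough[i] > 0 else 0 for i in range(n)]


if __name__ == "__main__":
    print(simulate_thresholded_series(40, hurst=0.7, seed=1))
```

A Hurst index above 0.5 gives positively correlated increments, which is the long range dependence the abstract refers to; at exactly 0.5 the covariance reduces to that of ordinary Brownian motion, min(s, t).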

Citations: 0
SENSITIVITY ANALYSIS FOR EVALUATING PRINCIPAL SURROGATE ENDPOINTS RELAXING THE EQUAL EARLY CLINICAL RISK ASSUMPTION.
IF 1.8; Zone 4 (Mathematics); Q1 Mathematics; Pub Date: 2022-09-01; Epub Date: 2022-07-19; DOI: 10.1214/21-aoas1566
Ying Huang, Yingying Zhuang, Peter Gilbert

This article addresses the evaluation of post-randomization immune response biomarkers as principal surrogate endpoints of a vaccine's protective effect, based on data from randomized vaccine trials. An important metric for quantifying a biomarker's principal surrogacy in vaccine research is the vaccine efficacy curve, which shows a vaccine's efficacy as a function of potential biomarker values if receiving vaccine, among an 'early-always-at-risk' principal stratum of trial participants who remain disease-free at the time of biomarker measurement whether having received vaccine or placebo. Earlier work in principal surrogate evaluation relied on an 'equal-early-clinical-risk' assumption for identifiability of the vaccine curve, based on observed disease status at the time of biomarker measurement. This assumption is violated in the common setting that the vaccine has an early effect on the clinical endpoint before the biomarker is measured. In particular, a vaccine's early protective effect observed in two phase III dengue vaccine trials (CYD14/CYD15) has motivated our current research development. We relax the 'equal-early-clinical-risk' assumption and propose a new sensitivity analysis framework for principal surrogate evaluation allowing for early vaccine efficacy. Under this framework, we develop inference procedures for vaccine efficacy curve estimators based on the estimated maximum likelihood approach. We then use the proposed methodology to assess the surrogacy of post-randomization neutralization titer in the motivating dengue application.

Citations: 0
BAYESIAN FUNCTIONAL REGISTRATION OF FMRI ACTIVATION MAPS.
IF 1.3; Zone 4 (Mathematics); Q2 STATISTICS & PROBABILITY; Pub Date: 2022-09-01; Epub Date: 2022-07-19; DOI: 10.1214/21-aoas1562
Guoqing Wang, Abhirup Datta, Martin A Lindquist

Functional magnetic resonance imaging (fMRI) has provided invaluable insight into our understanding of human behavior. However, large inter-individual differences in both brain anatomy and functional localization after anatomical alignment remain a major limitation in conducting group analyses and performing population level inference. This paper addresses this problem by developing and validating a new computational technique for reducing misalignment across individuals in functional brain systems by spatially transforming each subject's functional data to a common reference map. Our proposed Bayesian functional registration approach allows us to assess differences in brain function across subjects and individual differences in activation topology. It combines intensity-based and feature-based information into an integrated framework, and allows inference to be performed on the transformation via the posterior samples. We evaluate the method in a simulation study and apply it to data from a study of thermal pain. We find that the proposed approach provides increased sensitivity for group-level inference.

Citations: 0
DIRICHLET-TREE MULTINOMIAL MIXTURES FOR CLUSTERING MICROBIOME COMPOSITIONS.
IF 1.8; Zone 4 (Mathematics); Q1 Mathematics; Pub Date: 2022-09-01; Epub Date: 2022-07-19; DOI: 10.1214/21-aoas1552
Jialiang Mao, Li Ma

Studying the human microbiome has gained substantial interest in recent years, and a common task in the analysis of these data is to cluster microbiome compositions into subtypes. This subdivision of samples into subgroups serves as an intermediary step in achieving personalized diagnosis and treatment. In applying existing clustering methods to modern microbiome studies, including the American Gut Project (AGP) data, we found that this seemingly standard task is very challenging in the microbiome composition context due to several key features of such data. Standard distance-based clustering algorithms generally do not produce reliable results because they do not take into account the heterogeneity of the cross-sample variability among the bacterial taxa, while existing model-based approaches do not allow sufficient flexibility for distinguishing complex within-cluster variation from cross-cluster variation. Direct applications of such methods generally lead to overly dispersed clusters in the AGP data, and this phenomenon is common in other microbiome data. To overcome these challenges, we introduce Dirichlet-tree multinomial mixtures (DTMM) as a Bayesian generative model for clustering amplicon sequencing data in microbiome studies. DTMM models the microbiome population with a mixture of Dirichlet-tree kernels that utilize the phylogenetic tree to offer a more flexible covariance structure in characterizing within-cluster variation, and it provides a means for identifying a subset of signature taxa that distinguish the clusters. We perform extensive simulation studies to evaluate the performance of DTMM and compare it to state-of-the-art model-based and distance-based clustering methods in the microbiome context, and carry out a validation study on a publicly available longitudinal data set to confirm the biological relevance of the clusters. Finally, we report a case study on the fecal data from the AGP to identify compositional clusters among individuals with inflammatory bowel disease and diabetes. Among our most interesting findings is that enterotypes (i.e., gut microbiome clusters) are not always defined by the most dominant species, as previous analyses had assumed, but can involve a number of less abundant OTUs, which cannot be identified with existing distance-based and model-based approaches.
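As a rough illustration of the generative idea described in the abstract, the following sketch draws read counts from a Dirichlet-tree multinomial and then from a two-component mixture of such kernels. It is a minimal toy under stated assumptions, not the authors' implementation: the tree `TREE`, the parameter tables `ALPHA_1` and `ALPHA_2`, the mixture weights, and all function names are hypothetical, and no inference (clustering or taxa selection) is attempted.

```python
import random
from collections import Counter

# Hypothetical toy phylogeny: each internal node maps to its children.
# Names that never appear as keys are leaves and stand in for OTUs.
TREE = {
    "root": ["A", "B"],
    "A": ["otu1", "otu2"],
    "B": ["otu3", "otu4", "otu5"],
}

# Hypothetical Dirichlet parameters, one vector per internal node.
# Each mixture component (cluster) gets its own table.
ALPHA_1 = {"root": [2.0, 2.0], "A": [5.0, 1.0], "B": [1.0, 1.0, 1.0]}
ALPHA_2 = {"root": [1.0, 6.0], "A": [1.0, 1.0], "B": [4.0, 1.0, 1.0]}


def sample_dirichlet(alpha, rng):
    """Draw a probability vector from Dirichlet(alpha) via normalized gammas."""
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def sample_dtm_counts(n_reads, tree, alpha, rng):
    """Draw leaf counts for one sample from a Dirichlet-tree multinomial."""
    counts = {"root": n_reads}
    leaf_counts = {}
    stack = ["root"]
    while stack:
        node = stack.pop()
        children = tree.get(node)
        if children is None:  # leaf: record the reads that reached it
            leaf_counts[node] = counts[node]
            continue
        # Node-specific branch probabilities, then a multinomial split
        # of this node's reads among its children.
        probs = sample_dirichlet(alpha[node], rng)
        picks = Counter(
            rng.choices(range(len(children)), weights=probs, k=counts[node])
        )
        for idx, child in enumerate(children):
            counts[child] = picks.get(idx, 0)
            stack.append(child)
    return leaf_counts


def sample_dtmm(n_reads, tree, cluster_alphas, cluster_weights, rng):
    """Pick a cluster by its mixture weight, then sample counts from its kernel."""
    k = rng.choices(range(len(cluster_alphas)), weights=cluster_weights, k=1)[0]
    return k, sample_dtm_counts(n_reads, tree, cluster_alphas[k], rng)


if __name__ == "__main__":
    rng = random.Random(0)
    cluster, sample = sample_dtmm(1000, TREE, [ALPHA_1, ALPHA_2], [0.5, 0.5], rng)
    print(cluster, sample)
```

Because each internal node has its own Dirichlet vector, the variability of a split can differ across the tree, which is the kind of flexibility in within-cluster variation that the abstract attributes to Dirichlet-tree kernels.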

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9484567/pdf/nihms-1814687.pdf
Citations: 3
MEASURING PERFORMANCE FOR END-OF-LIFE CARE.
IF 1.8 Zone 4 (Mathematics) Q1 Mathematics Pub Date: 2022-09-01 DOI: 10.1214/21-aoas1558
Sebastien Haneuse, Deborah Schrag, Francesca Dominici, Sharon-Lise Normand, Kyu Ha Lee

Although not without controversy, readmission is entrenched as a hospital quality metric, with statistical analyses generally based on fitting a logistic-Normal generalized linear mixed model. Such analyses, however, ignore death as a competing risk, although doing so for clinical conditions with high mortality can have profound effects; a hospital's seemingly good performance for readmission may be an artifact of its poor performance for mortality. In this paper we propose novel multivariate hospital-level performance measures for readmission and mortality that derive from framing the analysis as one of cluster-correlated semi-competing risks data. We also consider a number of profiling-related goals, including the identification of extreme performers and a bivariate classification of whether the hospital has higher-/lower-than-expected readmission and mortality rates via a Bayesian decision-theoretic approach that characterizes hospitals on the basis of minimizing the posterior expected loss for an appropriate loss function. In some settings, particularly if the number of hospitals is large, the computational burden may be prohibitive. To resolve this, we propose a series of analysis strategies that will be useful in practice. Throughout, the methods are illustrated with data from CMS on N = 17,685 patients diagnosed with pancreatic cancer between 2000 and 2012 at one of J = 264 hospitals in California.
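For orientation, the logistic-Normal generalized linear mixed model that the abstract names as the conventional readmission analysis is commonly written in the following form. The notation is an assumption introduced here for illustration (it is not taken from the paper): Y_{ji} is the readmission indicator for patient i in hospital j, X_{ji} is a covariate vector, and V_j is a hospital-specific random intercept.

```latex
\begin{align*}
\operatorname{logit}\,\Pr(Y_{ji} = 1 \mid X_{ji}, V_j) &= X_{ji}^{\top}\beta + V_j,
  \qquad i = 1,\dots,n_j,\quad j = 1,\dots,J, \\
V_j &\sim \mathcal{N}(0, \sigma^2).
\end{align*}
```

In this form a patient who dies before readmission is simply recorded as not readmitted, which is why a hospital with high mortality can appear to perform well on readmission.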

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9728673/pdf/nihms-1842846.pdf
Citations: 1
KERNEL MACHINE AND DISTRIBUTED LAG MODELS FOR ASSESSING WINDOWS OF SUSCEPTIBILITY TO ENVIRONMENTAL MIXTURES IN CHILDREN'S HEALTH STUDIES.
IF 1.8 Zone 4 (Mathematics) Q1 Mathematics Pub Date: 2022-06-01 Epub Date: 2022-06-13 DOI: 10.1214/21-aoas1533
Ander Wilson, Hsiao-Hsien Leon Hsu, Yueh-Hsiu Mathilda Chiu, Robert O Wright, Rosalind J Wright, Brent A Coull

Exposures to environmental chemicals during gestation can alter health status later in life. Most studies of maternal exposure to chemicals during pregnancy have focused on a single chemical exposure observed at high temporal resolution. Recent research has turned to exposure to mixtures of multiple chemicals, generally observed at a single time point. We consider statistical methods for analyzing data on chemical mixtures that are observed at a high temporal resolution. As motivation, we analyze the association between exposure to four ambient air pollutants observed weekly throughout gestation and birth weight in a Boston-area prospective birth cohort. To explore patterns in the data, we first apply methods for analyzing data on (1) a single chemical observed at high temporal resolution, and (2) a mixture measured at a single point in time. We highlight the shortcomings of these approaches for temporally resolved data on exposure to chemical mixtures. Second, we propose a novel method, a Bayesian kernel machine regression distributed lag model (BKMR-DLM), that simultaneously accounts for nonlinear associations and interactions among time-varying measures of exposure to mixtures. BKMR-DLM uses a functional weight for each exposure that parameterizes the window of susceptibility corresponding to that exposure within a kernel machine framework that captures nonlinear and interaction effects of the multivariate exposure on the outcome. In a simulation study, we show that the proposed method can better estimate the exposure-response function and, in high-signal settings, can identify critical windows in time during which exposure has an increased association with the outcome. Applying the proposed method to the Boston birth cohort data, we find evidence of a negative association between organic carbon and birth weight and that nitrate modifies the organic carbon, elemental carbon, and sulfate exposure-response functions.
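The two ingredients named in the abstract, a per-exposure weight function over time and a kernel on the resulting weighted exposures, can be sketched as below. This is only an illustration under assumptions: the Gaussian kernel form, the fixed weight vectors, and the names `weighted_exposure` and `gaussian_kernel` are hypothetical choices made here, whereas in the model described above the weight functions are estimated from the data rather than fixed.

```python
import math


def weighted_exposure(series, weights):
    """Collapse one exposure's weekly series into a scalar using lag weights.

    A weight vector concentrated on a few weeks plays the role of a
    window of susceptibility for that exposure.
    """
    if len(series) != len(weights):
        raise ValueError("series and weights must have the same length")
    return sum(w * x for w, x in zip(weights, series))


def gaussian_kernel(e_i, e_j, rho):
    """Gaussian kernel between two subjects' vectors of weighted exposures.

    rho holds one non-negative scale per exposure (assumed form).
    """
    return math.exp(-sum(r * (a - b) ** 2 for r, a, b in zip(rho, e_i, e_j)))


if __name__ == "__main__":
    # Two hypothetical pollutants observed over four weeks for two subjects.
    w_pollutant_1 = [0.0, 0.5, 0.5, 0.0]      # window on weeks 2 and 3
    w_pollutant_2 = [0.25, 0.25, 0.25, 0.25]  # no specific window

    subject_a = [
        weighted_exposure([1.0, 2.0, 4.0, 8.0], w_pollutant_1),
        weighted_exposure([3.0, 3.0, 3.0, 3.0], w_pollutant_2),
    ]
    subject_b = [
        weighted_exposure([1.0, 1.0, 1.0, 8.0], w_pollutant_1),
        weighted_exposure([2.0, 2.0, 2.0, 2.0], w_pollutant_2),
    ]
    print(subject_a, subject_b, gaussian_kernel(subject_a, subject_b, [0.1, 0.1]))
```

With the first weight vector, differences in exposure outside weeks 2 and 3 do not change the weighted exposure, so they have no effect on the similarity between subjects.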

Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC9603732/pdf/nihms-1807733.pdf
Citations: 7