Widemberg S. Nobre, A. M. Schmidt, E. Moodie, D. Stephens
We propose and discuss a Bayesian procedure to estimate causal effects for multilevel observations in the presence of confounding. This work is motivated by an interest in determining the causal impact of directly observed therapy on the successful treatment of Tuberculosis. We focus on propensity score regression and covariate adjustment to balance the treatment allocation. We discuss the need to include latent local-level random effects in the propensity score model to reduce bias in the estimation of causal effects. A simulation study suggests that accounting for the multilevel nature of the data with latent structures in both the outcome and propensity score models has the potential to reduce bias in the estimation of causal effects.
The impact of directly observed therapy on the efficacy of Tuberculosis treatment: a Bayesian multilevel approach. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-04-25. https://doi.org/10.1093/jrsssc/qlad034
The central idea of this paper is to compare mean responses of several subjects in the presence of censoring and subject-specific variation. We develop a semiparametric mixed model for fitting subject-specific hazard curves to a set of censored failure times. A spline-based model and a mixed effects framework for smoothing are used. Efficient estimators of fixed parameters and predictors of the random components are derived and their asymptotic properties studied. This is a generalization of the method proposed by [Cai, T., Hyndman, R. J., & Wand, M. P. (2002). Mixed model-based hazard estimation. Journal of Computational and Graphical Statistics, 11(4), 784–798. https://doi.org/10.1198/106186002862] to incorporate additional subject-specific variation of the hazard function. The results are illustrated using two motivating examples.
Estimating subject-specific hazard functions. Moumita Chatterjee, B. Ganguli, Sugata Sen Roy. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-04-24. https://doi.org/10.1093/jrsssc/qlad030
Yang Li, Haoyu Yang, Haochen Yu, Hanwen Huang, Ye Shen
Considering the inevitable correlation among different datasets within the same subject, we propose a framework of variable selection on multiply imputed data with penalized weighted least squares (PWLS–MI). The methodological development is motivated by an epidemiological study of A/H7N9 patients from Zhejiang province in China, where nearly half of the variables are not fully observed. Multiple imputation is commonly adopted as a missing data processing method. However, it generates correlations among imputed values within the same subject across datasets. Recent work on variable selection for multiply imputed data does not fully address such correlations. We propose PWLS–MI to incorporate the correlation when performing the variable selection. PWLS–MI can be considered a framework for variable selection on multiply imputed data since it allows various penalties. We use the adaptive LASSO as an illustrative example. Extensive simulation studies are conducted to compare PWLS–MI with recently developed methods, and the results suggest that the proposed approach outperforms them in terms of both selection accuracy and deletion accuracy. PWLS–MI is shown to select variables with clinical relevance when applied to the A/H7N9 database.
Penalized weighted least-squares estimate for variable selection on correlated multiply imputed data. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-04-24. https://doi.org/10.1093/jrsssc/qlad028
In this manuscript, we develop a method for performing estimation and inference for the reproduction number of an epidemiological outbreak, focusing on the COVID-19 epidemic. The estimator is time-dependent and uses spline modelling to adapt to changes in the outbreak. This is accomplished by directly modelling the series of new infections as a function of time and subsequently using the derivative of that function to define a time-varying reproduction number, which is then used to assess the evolution of the epidemic in several countries.
A spline-based time-varying reproduction number for modelling epidemiological outbreaks. Eugen Pircalabelu. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-04-05. https://doi.org/10.1093/jrsssc/qlad027
Ordinal endpoints are common in clinical studies. For example, many clinical trials for evaluating COVID-19 infection therapies have adopted an ordinal scale as recommended by the World Health Organization. Despite their importance in clinical studies, design methods for ordinal endpoints are limited; in practice, a dichotomized approach is often used for simplicity. Here, we introduce a Bayesian group sequential scheme to assess ordinal endpoints, which considers a proportional-odds (PO) model, a nonproportional-odds (NPO) model, and a PO/NPO-switch model to handle various scenarios. Extensive simulations are conducted to demonstrate desirable performance, and the R package BayesOrdDesign has been made publicly available.
A Bayesian two-stage group sequential scheme for ordinal endpoints. Chengxue Zhong, Hongyu Miao, H. Pan. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-04-01. https://doi.org/10.1093/jrsssc/qlad026
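The proportional-odds (PO) model named in the abstract above can be illustrated with a short sketch. It is not taken from the paper or from the BayesOrdDesign package; it only assumes the textbook cumulative-logit form P(Y <= k | x) = logistic(c_k - beta * x), and the cutpoints, the effect size beta and the function names are hypothetical values chosen for illustration.

```python
import math


def logistic(z: float) -> float:
    """Standard logistic function 1 / (1 + exp(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def po_category_probs(cutpoints, beta, x):
    """Category probabilities under a proportional-odds model.

    Assumes P(Y <= k | x) = logistic(cutpoints[k] - beta * x) with
    strictly increasing cutpoints, so one beta is shared by every
    cumulative split of the ordinal scale.
    """
    cumulative = [logistic(c - beta * x) for c in cutpoints] + [1.0]
    probs = []
    previous = 0.0
    for value in cumulative:
        probs.append(value - previous)  # P(Y = k) as a difference of CDF values
        previous = value
    return probs


# Hypothetical 4-category scale; x = 0 is control, x = 1 is treatment.
cutpoints = [-1.0, 0.0, 1.5]
beta = 0.7
control = po_category_probs(cutpoints, beta, 0)
treated = po_category_probs(cutpoints, beta, 1)
print([round(p, 3) for p in control])
print([round(p, 3) for p in treated])
```

Under this parameterisation a positive beta shifts probability towards the higher categories, and the cumulative odds ratio between arms equals exp(-beta) at every cutpoint. A nonproportional-odds model would relax exactly that restriction by letting the effect differ across cutpoints.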
Henrik Wiechers, Benjamin Eltzner, Kanti V Mardia, Stephan F Huckemann
Three-dimensional RNA structures frequently contain atomic clashes. Usually, corrections approximate the biophysical chemistry, which is computationally intensive and often does not correct all clashes. We propose fast, data-driven reconstructions from clash-free benchmark data with two-scale shape analysis: microscopic (suites) dihedral backbone angles, mesoscopic sugar ring centre landmarks. Our analysis relates concentrated mesoscopic scale neighbourhoods to microscopic scale clusters, correcting within-suite-backbone-to-backbone clashes exploiting angular shape and size-and-shape Fréchet means. Validation shows that learned classes correspond closely to literature clusters and reconstructions are well within physical resolution. We illustrate the power of our method using cutting-edge SARS-CoV-2 RNA.
Learning torus PCA-based classification for multiscale RNA correction with application to SARS-CoV-2. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-03-24. https://doi.org/10.1093/jrsssc/qlad004
Stochastic models are appealing for mortality forecasting because they can generate intervals that quantify the uncertainties underlying the forecasts. We present a fully Bayesian implementation of the age-period-cohort-improvement (APCI) model with overdispersion, which is compared with the Lee–Carter model with cohorts. We show that naive prior specification can yield misleading inferences, and we propose a Laplace prior as an elegant solution. We also perform model averaging to incorporate model uncertainty. Our findings indicate that the APCI model offers a better fit and better forecasts for England and Wales data spanning 1961–2002. Our approach also allows coherent inclusion of multiple sources of uncertainty, producing well-calibrated probabilistic intervals.
Bayesian model comparison for mortality forecasting. Jackie S. T. Wong, J. Forster, Peter W. F. Smith. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-03-22. https://doi.org/10.1093/jrsssc/qlad021
High-resolution circumference dendrometers measure the irreversible growth and the reversible shrinking and swelling due to the water content of a tree stem. We propose a novel statistical method to decompose these measurements into a permanent and a temporary component, while explaining differences between the trees and years by covariates. Our model embeds Gaussian processes with parametric mean and covariance functions as response structures in a distributional regression framework with structured additive predictors. We discuss different mean and covariance functions, connections with other model classes, Markov chain Monte Carlo inference, and the efficiency of our sampling scheme.
Modelling intra-annual tree stem growth with a distributional regression approach for Gaussian process responses. Hannes Riebl, N. Klein, T. Kneib. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-03-22. https://doi.org/10.1093/jrsssc/qlad015
Existing integer-valued generalised autoregressive conditional heteroskedasticity (INGARCH) models for spatio-temporal counts do not allow for negative parameter and autocorrelation values. Using approximately linear INGARCH models, the unified and flexible spatio-temporal (B)INGARCH framework for modelling unbounded (bounded) counts is proposed. These models combine negative dependencies with a kind of long memory. They are easily adapted to special marginal features or cross-dependencies: when modelling precipitation data (counts of rainy hours), we account for zero-inflation, while for cloud-coverage data (counts of okta), we deal with missing data and additional cross-correlation. A copula related to the spatial error model shows appealing performance.
Approximately linear INGARCH models for spatio-temporal counts. Malte Jahn, C. Weiß, Hee-Young Kim. Journal of the Royal Statistical Society Series C-Applied Statistics, published 2023-03-16. https://doi.org/10.1093/jrsssc/qlad018
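To make the restriction mentioned in the abstract above concrete, here is a minimal sketch of the classical linear Poisson INGARCH(1,1) recursion for a single count series. It is not the approximately linear spatio-temporal model of the paper; the function names, parameter values and the use of Knuth's Poisson sampler are assumptions made for illustration. The conditional mean follows lam[t] = omega + alpha * y[t-1] + beta * lam[t-1], so the coefficients must be non-negative to keep lam[t] positive, which is why negative dependence is ruled out in this basic form.

```python
import math
import random


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw one Poisson variate with Knuth's multiplication method."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_ingarch(omega, alpha, beta, n, seed=0):
    """Simulate a linear Poisson INGARCH(1,1) count series.

    lam[t] = omega + alpha * y[t-1] + beta * lam[t-1]
    y[t] | past ~ Poisson(lam[t])
    """
    if omega <= 0 or alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    rng = random.Random(seed)
    lam = omega / (1.0 - alpha - beta)  # start at the stationary mean
    y = sample_poisson(lam, rng)
    counts = []
    for _ in range(n):
        lam = omega + alpha * y + beta * lam  # update the conditional mean
        y = sample_poisson(lam, rng)
        counts.append(y)
    return counts


# Hypothetical parameter values, chosen only for illustration.
counts = simulate_ingarch(omega=1.0, alpha=0.3, beta=0.4, n=5000, seed=1)
print(sum(counts) / len(counts))  # should be near omega / (1 - alpha - beta)
```

The sample mean of a long simulated series should lie close to the stationary mean omega / (1 - alpha - beta), and the parameter check at the top of `simulate_ingarch` is where a negative alpha or beta would be rejected.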
Pub Date: 2023-03-15. eCollection Date: 2023-05-01. DOI: 10.1093/jrsssc/qlac005
Bora Jin, David B Dunson, Julia E Rager, David M Reif, Stephanie M Engel, Amy H Herring
We aim to infer the bioactivity of each chemical-by-assay-endpoint combination, addressing the sparsity of toxicology data. We propose a Bayesian hierarchical framework which borrows information across different chemicals and assay endpoints, facilitates out-of-sample prediction of activity for chemicals not yet assayed, quantifies uncertainty of predicted activity, and adjusts for multiplicity in hypothesis testing. Furthermore, this paper makes a novel attempt in toxicology to simultaneously model heteroscedastic errors and a nonparametric mean function, leading to a broader definition of activity whose need has been suggested by toxicologists. An application to real data identifies chemicals most likely to be active for neurodevelopmental disorders and obesity.
Bayesian matrix completion for hypothesis testing. Journal of the Royal Statistical Society Series C-Applied Statistics, vol. 72, no. 2, pp. 254-270. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10184491/pdf/