Converging evidence indicates that the heterogeneity of cognitive profiles may arise through detectable alterations in brain functional connectivity. Despite an unprecedented opportunity to uncover neurobiological subtypes through clustering or subtyping analyses on multi-state functional connectivity, few existing approaches accommodate the network topology and unique biological architecture of these data. To address this issue, we propose an innovative Bayesian nonparametric network-variate clustering analysis to uncover subgroups of individuals with homogeneous brain functional network patterns under multiple cognitive states. In light of the existing neuroscience literature, we assume there are unknown state-specific modular structures within functional connectivity. Concurrently, we identify informative network features essential for defining subtypes. To further facilitate practical use, we develop a computationally efficient variational inference algorithm to approximate posterior inference with satisfactory estimation accuracy. Extensive simulations show the superiority of our method. We apply the method to the Adolescent Brain Cognitive Development (ABCD) study, and identify neurodevelopmental subtypes and brain sub-network phenotypes under each state to signal neurobiological heterogeneity, suggesting promising directions for further exploration and investigation in neuroscience.
Title: Bayesian subtyping for multi-state brain functional connectome with application on preadolescent brain cognition.
Authors: Tianqi Chen, Hongyu Zhao, Chichun Tan, Todd Constable, Sarah Yip, Yize Zhao
Journal: Biostatistics
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxae045
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823269/pdf/
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxae025
Yihan Bao, Lauren Bell, Elizabeth Williamson, Claire Garnett, Tianchen Qian
Micro-randomized trials are commonly conducted for optimizing mobile health interventions such as push notifications for behavior change. In analyzing such trials, causal excursion effects are often of primary interest, and their estimation typically involves inverse probability weighting (IPW). However, in a micro-randomized trial, additional treatments can often occur during the time window over which an outcome is defined, and this can greatly inflate the variance of the causal effect estimator because IPW would involve a product of numerous weights. To reduce variance and improve estimation efficiency, we propose two new estimators using a modified version of IPW, which we call "per-decision IPW." The second estimator further improves efficiency using the projection idea from the semiparametric efficiency theory. These estimators are applicable when the outcome is binary and can be expressed as the maximum of a series of sub-outcomes defined over sub-intervals of time. We establish the estimators' consistency and asymptotic normality. Through simulation studies and real data applications, we demonstrate substantial efficiency improvement of the proposed estimator over existing estimators. The new estimators can be used to improve the precision of primary and secondary analyses for micro-randomized trials with binary outcomes.
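The variance inflation described above can be made concrete with a minimal sketch. It rests on simplifying assumptions that are not taken from the paper: treatment indicators inside the outcome window are independent with a constant randomization probability `p`, and the excursion of interest is "no treatment" at each of `delta` decision points. Under those assumptions the standard IPW weight W = prod_j 1{A_j = 0} / (1 - p) has mean 1 and variance (1 - p)^(-delta) - 1, which grows geometrically with the window length. The function names are hypothetical, and the sketch does not implement the per-decision estimators themselves.

```python
import random


def ipw_weight_variance(p, delta):
    """Exact variance of W = prod_j 1{A_j = 0} / (1 - p) over `delta` decision points.

    Assumes independent treatment indicators with constant probability `p`
    (an illustrative assumption, not the paper's setting).
    """
    return (1.0 - p) ** (-delta) - 1.0


def simulate_ipw_weight_variance(p, delta, n_draws=200_000, seed=0):
    """Monte Carlo check of the same variance."""
    rng = random.Random(seed)
    keep = 1.0 - p  # probability of "no treatment" at one decision point
    total = 0.0
    total_sq = 0.0
    for _ in range(n_draws):
        w = 1.0
        for _j in range(delta):
            if rng.random() < p:  # a treatment occurred, so the weight is zero
                w = 0.0
                break
            w /= keep
        total += w
        total_sq += w * w
    mean = total / n_draws
    return total_sq / n_draws - mean ** 2
```

With `p = 0.5`, the exact variance is 1 for a single decision point and 7 for three, so even short windows multiply the noise carried by the weights.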
Title: Estimating causal effects for binary outcomes using per-decision inverse probability weighting.
Journal: Biostatistics
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxaf022
Amanda F Mejia, David Bolin, Daniel A Spencer, Ani Eloyan
Brain functional connectivity (FC), the temporal synchrony between brain networks, is essential to understand the functional organization of the brain and to identify changes due to neurological disorders, development, treatment, and other phenomena. Independent component analysis (ICA) is a matrix decomposition method used extensively for simultaneous estimation of functional brain topography and connectivity. However, estimation of FC via ICA is often sub-optimal due to the use of ad hoc estimation methods or temporal dimension reduction prior to ICA. Bayesian ICA can avoid dimension reduction, estimate latent variables and model parameters more accurately, and facilitate posterior inference. In this article, we develop a novel, computationally feasible Bayesian ICA method with population-derived priors on both the spatial ICs and their temporal correlation (that is, their FC). For the latter, we consider two priors: the inverse-Wishart, which is conjugate but is not ideally suited for modeling correlation matrices; and a novel informative prior for correlation matrices. For each prior, we derive a variational Bayes algorithm to estimate the model variables and facilitate posterior inference. Through extensive simulation studies, we evaluate the performance of the proposed methods and benchmark against existing approaches. We also analyze fMRI data from over 400 healthy adults in the Human Connectome Project. We find that our Bayesian ICA model and algorithms result in more accurate measures of functional connectivity and spatial brain features. Our novel prior for correlation matrices is more computationally intensive than the inverse-Wishart but provides improved accuracy and inference. The proposed framework is applicable to single-subject analysis, making it potentially clinically viable.
Title: Leveraging population information in brain connectivity via Bayesian ICA with a novel informative prior for correlation matrices.
Journal: Biostatistics, vol. 26, no. 1
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12372588/pdf/
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxae013
Qianhan Zeng, Jing Zhou, Ying Ji, Hansheng Wang
Computed tomography (CT) has been a powerful diagnostic tool since its emergence in the 1970s. Using CT data, 3D structures of human internal organs and tissues, such as blood vessels, can be reconstructed using professional software. This 3D reconstruction is crucial for surgical operations and can serve as a vivid medical teaching example. However, traditional 3D reconstruction relies heavily on manual operations, which are time-consuming, subjective, and require substantial experience. To address this problem, we develop a novel semiparametric Gaussian mixture model tailored for the 3D reconstruction of blood vessels. This model extends the classical Gaussian mixture model by enabling nonparametric variations in the component-wise parameters of interest according to voxel positions. We develop a kernel-based expectation-maximization algorithm for estimating the model parameters, accompanied by a supporting asymptotic theory. Furthermore, we propose a novel regression method for optimal bandwidth selection. Compared to the conventional cross-validation (CV) method, the regression method is more efficient both computationally and statistically. In application, this methodology facilitates the fully automated reconstruction of 3D blood vessel structures with remarkable accuracy.
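For readers unfamiliar with the classical model being extended, the following is a minimal sketch of expectation-maximization for a two-component, one-dimensional Gaussian mixture. It is the textbook algorithm, not the paper's kernel-based version: here the mixing weights, means, and standard deviations are global constants, whereas the semiparametric model lets them vary with voxel position. The function names and the initialization rule are illustrative assumptions.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def em_two_component_gmm(xs, n_iter=200):
    """Classical EM for a two-component 1D Gaussian mixture (illustrative sketch)."""
    n = len(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    # Simple initialization (an assumption): components start at the data extremes.
    pi = [0.5, 0.5]
    mu = [min(xs), max(xs)]
    sigma = [sd, sd]
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to each component.
        resp = []
        for x in xs:
            d = [pi[k] * normal_pdf(x, mu[k], sigma[k]) for k in range(2)]
            s = d[0] + d[1]
            resp.append([d[0] / s, d[1] / s] if s > 0 else [0.5, 0.5])
        # M-step: responsibility-weighted updates of the parameters.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            pi[k] = nk / n
            mu[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigma[k] = max(math.sqrt(var), 1e-6)  # guard against a collapsed component
    return pi, mu, sigma
```

A position-dependent variant would, roughly speaking, replace the global sums in the M-step with kernel-weighted sums centered at each voxel, which is where a bandwidth choice enters.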
Title: A semiparametric Gaussian mixture model for chest CT-based 3D blood vessel reconstruction.
Journal: Biostatistics
Progress in neuroscience has provided unprecedented opportunities to advance our understanding of brain alterations and their correspondence to phenotypic profiles. With data collected from various imaging techniques, studies have integrated different types of information spanning brain structure, function, and metabolism. More recently, an emerging way to categorize imaging traits is through a metric hierarchy, including localized node-level measurements and interactive network-level metrics. However, limited research has been conducted to integrate these different hierarchies and achieve a better understanding of the neurobiological mechanisms and communications. In this work, we address this literature gap by proposing a Bayesian regression model under both vector-variate and matrix-variate predictors. To characterize the interplay between different predicting components, we propose a set of biologically plausible prior models centered on an innovative joint thresholded prior. This captures the coupling and grouping effect of signal patterns, as well as their spatial contiguity across brain anatomy. By developing a posterior inference procedure, we can identify and quantify the uncertainty of signaling node- and network-level neuromarkers, as well as their predictive mechanism for phenotypic outcomes. Through extensive simulations, we demonstrate that our proposed method substantially outperforms the alternative approaches in both out-of-sample prediction and feature selection. By applying the model to study children's general mental abilities, we establish a powerful predictive mechanism based on the identified task contrast traits and resting-state sub-networks.
Title: Bayesian thresholded modeling for integrating brain node and network predictors.
Authors: Zhe Sun, Wanwan Xu, Tianxi Li, Jian Kang, Gregorio Alanis-Lobato, Yize Zhao
Journal: Biostatistics, vol. 26, no. 1
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxae048
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11823287/pdf/
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxaf021
Ethan M Alt, Xiuya Chang, Qing Liu, Xun Jiang, May Mo, H Amy Xia, Joseph G Ibrahim
In clinical trials, it is often valuable to borrow information from external data sources. Unfortunately, when the external data are fully or partially incompatible with the current trial data, type I error rates can be highly inflated under traditional blanket discounting schemes such as power priors, commensurate priors, and meta-analytic predictive priors. However, such inflation of the probability of a false positive can be necessary, as the alternative is to have an underpowered study. For clinical trials with time-to-event (TTE) outcomes, this problem is exacerbated since many observations are censored. In this paper, we develop the latent exchangeability prior for TTE data. We also present a novel framework to borrow information about the treatment effect between groups as well as incorporate information from external controls. Simulation results suggest that, although efficiency gains can be achieved by borrowing information among external controls, operating characteristics in general can be quite poor under a lack of exchangeability. We apply our approach to a real clinical trial in second-line metastatic colorectal cancer.
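As a reminder of what a blanket discounting scheme looks like, here is a minimal sketch of a power prior in the simplest conjugate case. A binary outcome is used purely for illustration (the paper concerns time-to-event data), and the function names and default Beta(1, 1) initial prior are assumptions. The external likelihood is raised to a fixed power `a0` in [0, 1], so every external observation is down-weighted by the same factor regardless of how compatible it is with the current trial.

```python
def power_prior_posterior(y, n, y0, n0, a0, a=1.0, b=1.0):
    """Beta posterior parameters under a power prior with a binomial likelihood.

    y, n   : events and sample size in the current trial
    y0, n0 : events and sample size in the external data
    a0     : fixed discounting parameter in [0, 1] (0 = ignore, 1 = pool)
    a, b   : parameters of the initial Beta prior (illustrative default)
    """
    if not 0.0 <= a0 <= 1.0:
        raise ValueError("a0 must lie in [0, 1]")
    alpha = a + y + a0 * y0
    beta = b + (n - y) + a0 * (n0 - y0)
    return alpha, beta


def beta_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)
```

For example, with 12 events in 40 current patients and 30 events in 60 external patients, `a0 = 0.5` gives a Beta(28, 44) posterior. The posterior mean is pulled toward the external rate by the same amount whether or not the two data sources are actually exchangeable, which is the behavior the abstract identifies as a source of type I error inflation.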
Title: Control arm augmentation and hierarchical modeling in time-to-event trials: advantages and pitfalls.
Journal: Biostatistics, vol. 26, no. 1
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxaf043
Haodong Tian, Ashish Patel, Stephen Burgess
Nonlinear causal effects are prevalent in many research scenarios involving continuous exposures, and instrumental variables (IVs) can be employed to investigate such effects, particularly in the presence of unmeasured confounders. However, common IV methods for nonlinear effect analysis, such as IV regression or the control-function method, have inherent limitations, leading to either low statistical power or potentially misleading conclusions. In this work, we propose an alternative IV framework for nonlinear effect analysis, which has recently emerged in genetic epidemiology and addresses many of the drawbacks of existing IV methods. The proposed IV framework consists of up to three key "S" elements: (i) the Stratification approach, which constructs multiple strata that are sub-samples of the population in which the IV core assumptions remain valid, (ii) the Scalar-on-function model and Scalar-on-scalar model, which connect local stratum-specific information to global effect estimation, and (iii) the Sum-of-single-effects method for effect estimation. This framework enables study of the effect function while avoiding unnecessary model assumptions. In particular, it facilitates the identification of change points or threshold values in causal effects. Through a wide variety of simulations, we demonstrate that our framework outperforms other representative nonlinear IV methods in predicting the effect shape when the instrument is weak and can accurately estimate the effect function as well as identify the change point and predict its value under various structural model and effect shape scenarios. We further apply our framework to assess the nonlinear effect of alcohol consumption on systolic blood pressure using a genetic instrument (i.e., Mendelian randomization) with UK Biobank data. Our analysis detects a threshold beyond which alcohol intake exhibits a clear causal effect on the outcome. Our results are consistent with published medical guidelines.
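The "local stratum-specific information" mentioned in element (ii) can be pictured with a minimal sketch. It assumes the strata have already been constructed so that the IV assumptions hold within each one; how to build such strata is the paper's contribution and is not shown here (naively splitting the sample on the observed exposure is not a safe substitute). The Wald ratio used below, cov(Z, Y) / cov(Z, X), is a common single-instrument estimator chosen for illustration, and the function names are hypothetical.

```python
def covariance(u, v):
    """Sample covariance (divisor n) of two equal-length sequences."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / n


def wald_ratio(z, x, y):
    """Single-instrument IV estimate: cov(Z, Y) / cov(Z, X)."""
    return covariance(z, y) / covariance(z, x)


def stratum_specific_estimates(strata):
    """Apply the Wald ratio within each pre-built stratum.

    `strata` is a list of (z, x, y) tuples, one per stratum, holding the
    instrument, exposure, and outcome values for that sub-sample.
    """
    return [wald_ratio(z, x, y) for z, x, y in strata]
```

If the stratum-specific estimates differ systematically across strata ordered by exposure level, that pattern is the raw material from which an effect function, and any change point in it, would then be estimated.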
Title: Stratification-based instrumental variable analysis framework for nonlinear effect analysis.
Journal: Biostatistics, vol. 26, no. 1
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12665183/pdf/
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxaf050
Huaqing Jin, Fei Jiang
Alzheimer's disease (AD) is a progressive, chronic neurodegenerative disorder affecting millions worldwide. A new clinical magnetoencephalography (MEG) study was conducted to identify neural activity biomarkers and key brain regions in AD. Traditional methods for analyzing MEG data, which typically extract features from power spectral density, suffer from information loss. Furthermore, functional regression with variable selection tends to produce non-robust results, making it less ideal for drawing reliable scientific conclusions. To address these challenges, we propose a high-dimensional hypothesis testing (HDHT) framework for functional covariates and introduce a rigorous inference process to support scientific conclusions. We establish the theoretical properties of the HDHT framework and validate its performance through simulation studies. Applying the HDHT framework to the AD MEG data, we identify 19 important regions associated with cognitive functions that align with established AD pathophysiology. These findings suggest that the non-invasive MEG can be a potential low-risk and low-toxicity modality for monitoring neurodegenerative progression.
Title: High-dimensional inference for functional regression with an application to the Alzheimer's disease magnetoencephalography study.
Journal: Biostatistics, vol. 26, no. 1
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC12728160/pdf/
Pub Date: 2024-12-31; DOI: 10.1093/biostatistics/kxae015
Gen Li, Miaoyan Wang
Gaussian graphical models are widely used to study the dependence structure among variables. When samples are obtained from multiple conditions or populations, joint analysis of multiple graphical models is desirable because it can borrow strength across populations. Nonetheless, existing methods often overlook the varying levels of similarity between populations, leading to unsatisfactory results. Moreover, in many applications, learning the population-level clustering structure itself is of particular interest. In this article, we develop a novel method, called Simultaneous Clustering and Estimation of Networks via Tensor decomposition (SCENT), that simultaneously clusters and estimates graphical models from multiple populations. Precision matrices from different populations are uniquely organized as a three-way tensor array, and a low-rank sparse model is proposed for joint population clustering and network estimation. We develop a penalized likelihood method and an augmented Lagrangian algorithm for model fitting. We also establish the clustering accuracy and norm consistency of the estimated precision matrices. We demonstrate the efficacy of the proposed method with comprehensive simulation studies. The application to the Genotype-Tissue Expression multi-tissue gene expression data provides important insights into tissue clustering and gene coexpression patterns in multiple brain tissues.
Title: Simultaneous clustering and estimation of networks in multiple graphical models.
Journal: Biostatistics
Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11826093/pdf/