Xingche Guo, Bin Yang, Ji Meng Loh, Qinxia Wang, Yuanjia Wang
Mental disorders present challenges in diagnosis and treatment due to their complex and heterogeneous nature. Electroencephalogram (EEG) has shown promise as a source of potential biomarkers for these disorders. However, existing methods for analyzing EEG signals have limitations in addressing heterogeneity and capturing complex brain activity patterns between regions. This paper proposes a novel random effects state-space model (RESSM) for analyzing large-scale multi-channel resting-state EEG signals, accounting for the heterogeneity of brain connectivity between groups and individual subjects. We incorporate multi-level random effects for the temporal dynamical and spatial mapping matrices and address non-stationarity so that brain connectivity patterns can vary over time. The model is fitted under a Bayesian hierarchical framework coupled with a Gibbs sampler. Compared to previous mixed-effects state-space models, we directly model high-dimensional random effects matrices of interest without structural constraints and tackle the challenge of identifiability. Through extensive simulation studies, we demonstrate that our approach yields valid estimation and inference. We apply RESSM to a multi-site clinical trial of major depressive disorder (MDD). Our analysis uncovers significant differences in resting-state brain temporal dynamics between MDD patients and healthy individuals. In addition, we show that the subject-level EEG features derived from RESSM exhibit superior predictive value for heterogeneous treatment effects compared to EEG frequency band power, suggesting the potential of EEG as a valuable biomarker for MDD.
"A hierarchical random effects state-space model for modeling brain activities from electroencephalogram data." Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae130. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11540184/pdf/
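The multi-level random-effects idea can be illustrated with a deliberately minimal linear-Gaussian sketch. This is not the authors' RESSM (which is non-stationary and fitted by Gibbs sampling): here each subject's temporal dynamics matrix is a group-level matrix plus a subject-specific random perturbation, and a spatial mapping matrix projects the latent states to channels. All dimensions and noise scales are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_subject(A_group, sigma_re=0.1, T=200, obs_dim=8, state_dim=3):
    """Simulate one subject: latent AR(1) states with a subject-specific
    transition matrix A_i = A_group + random effect, mapped to channels."""
    A_i = A_group + sigma_re * rng.standard_normal(A_group.shape)
    C = rng.standard_normal((obs_dim, state_dim))           # spatial mapping
    x = np.zeros(state_dim)
    Y = np.empty((T, obs_dim))
    for t in range(T):
        x = A_i @ x + 0.5 * rng.standard_normal(state_dim)  # state noise
        Y[t] = C @ x + 0.1 * rng.standard_normal(obs_dim)   # channel noise
    return Y, A_i

A_group = 0.5 * np.eye(3)        # group-level temporal dynamics matrix
Y, A_i = simulate_subject(A_group)
print(Y.shape)   # (200, 8)
```

Estimating `A_group`, the subject-level `A_i`, and `C` jointly from many such `Y` matrices, without constraining their structure, is the identifiability challenge the abstract refers to.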
Samuel Perreault, Gracia Y Dong, Alex Stringer, Hwashin Shin, Patrick E Brown
Over the last three decades, case-crossover designs have found many applications in health sciences, especially in air pollution epidemiology. They are typically used, in combination with partial likelihood techniques, to define a conditional logistic model for the responses, usually health outcomes, conditional on the exposures. Despite the fact that conditional logistic models have been shown equivalent, in typical air pollution epidemiology setups, to specific instances of the well-known Poisson time series model, it is often claimed that they cannot allow for overdispersion. This paper clarifies the relationship between case-crossover designs, the models that ensue from their use, and overdispersion. In particular, we propose to relax the assumption of independence between individuals traditionally made in case-crossover analyses, in order to explicitly introduce overdispersion in the conditional logistic model. As we show, the resulting overdispersed conditional logistic model coincides with the overdispersed, conditional Poisson model, in the sense that their likelihoods are simple re-expressions of one another. We further provide the technical details of a Bayesian implementation of the proposed case-crossover model, which we use to demonstrate, by means of a large simulation study, that standard case-crossover models can lead to dramatically underestimated coverage probabilities, while the proposed models do not. We also perform an illustrative analysis of the association between air pollution and morbidity in Toronto, Canada, which shows that the proposed models are more robust than standard ones to outliers such as those associated with public holidays.
"Case-crossover designs and overdispersion with application to air pollution epidemiology." Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae117.
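The equivalence the abstract alludes to, that conditioning Poisson counts on their total yields a multinomial (conditional logistic) likelihood, can be checked numerically in a toy example; the rates and counts below are arbitrary.

```python
import numpy as np
from scipy.stats import poisson, multinomial

# Daily counts y_t ~ Poisson(lam_t). Conditioning on the total N turns the
# joint Poisson likelihood into a multinomial with p_t = lam_t / sum(lam):
# this is the sense in which conditional logistic and Poisson time-series
# likelihoods re-express one another.
lam = np.array([2.0, 5.0, 3.0])
y = np.array([1, 4, 2])
N = y.sum()

joint = poisson.logpmf(y, lam).sum()        # log P(y)
total = poisson.logpmf(N, lam.sum())        # log P(sum of y equals N)
conditional = joint - total                 # log P(y | N)

multi = multinomial.logpmf(y, N, lam / lam.sum())
print(np.isclose(conditional, multi))   # True
```

Overdispersion breaks the plain Poisson assumption on the left-hand side, which is why the paper relaxes the between-individual independence assumption instead.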
Network deconvolution (ND) is a method to reconstruct a direct-effect network, describing direct (or conditional) effects (or associations) between any two nodes, from a given network depicting total (or marginal) effects (or associations). Its key idea is that, in a directed graph, a total effect can be decomposed into the sum of a direct and an indirect effect, with the latter further decomposed as the sum of various products of direct effects. This yields a simple closed-form solution for the direct-effect network, facilitating important applications in distinguishing direct from indirect effects. Although ND has also been applied to undirected graphs, it is not well understood why the method works there, leaving it open to skepticism. We first clarify the implicit linear model assumption underlying ND, then derive a surprisingly simple result on the equivalence between ND and the use of precision matrices, offering an insightful justification and interpretation for the application of ND to undirected graphs. We also establish a formal result to characterize the effect of scaling a total-effect graph. Finally, leveraging large-scale genome-wide association study data, we show a novel application of ND to contrast marginal versus conditional genetic correlations between body height and risk of coronary artery disease; the results align with a causal directed graph inferred using ND. We conclude that ND is a promising approach given its easy and wide applicability to both directed and undirected graphs.
"On network deconvolution for undirected graphs." Zhaotong Lin, Isaac Pan, Wei Pan. Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae112. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11459367/pdf/
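The closed-form solution mentioned in the abstract can be sketched as follows, assuming the usual series expansion of total effects into sums of products of direct effects (the standard ND formulation; the paper's exact conventions may differ):

```python
import numpy as np

# If the total-effect matrix aggregates all indirect paths,
#   G_total = G_dir + G_dir^2 + ... = G_dir (I - G_dir)^{-1},
# then the direct-effect matrix is recovered in closed form as
#   G_dir = G_total (I + G_total)^{-1}.
rng = np.random.default_rng(1)
n = 5
G_dir = 0.1 * rng.standard_normal((n, n))      # spectral radius < 1
I = np.eye(n)
G_total = G_dir @ np.linalg.inv(I - G_dir)     # sum of all path products
G_rec = G_total @ np.linalg.inv(I + G_total)   # deconvolution step
print(np.allclose(G_rec, G_dir))   # True
```

The algebra is exact: since I + G_total = (I - G_dir)^{-1}, the two inverses cancel, which is why a single matrix inversion suffices.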
The health and care of people are being revolutionized. An important component of that revolution is disease prevention and health improvement from home. A natural approach to the health problem is monitoring changes in people's behavior or activities. These changes can be indicators of potential health problems. However, due to a person's daily pattern, changes will be observed throughout each day, with, e.g., an increase in events around meal times and fewer events during the night. We do not wish to detect such within-day changes but rather changes in the daily behavior pattern from one day to the next. To this end, we treat the set of event times within a given day as a single observation. We model this observation as the realization of an inhomogeneous Poisson process whose rate function can vary with the time of day. Then, we propose to detect changes in the sequence of inhomogeneous Poisson processes. This approach is appropriate for many phenomena, particularly for home activity data. Our methodology is evaluated on simulated data. Overall, our approach uses local change information to detect changes across days.
"Changepoint detection on daily home activity pattern: a sliced Poisson process method." Israel Martínez-Hernández, Rebecca Killick. Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae114.
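A minimal sketch of the data-generating view described above: each day's event times form one realization of an inhomogeneous Poisson process with a time-of-day rate, simulated by thinning, with the standard Poisson-process log-likelihood as the day-level score. The rate function here is an invented toy with meal-time peaks, not the paper's model.

```python
import numpy as np

rng = np.random.default_rng(2)

def rate(t):
    """Toy within-day rate (events per hour) with peaks near meal times."""
    return 1.0 + 3.0 * np.exp(-0.5 * (t - 8.0) ** 2) + 3.0 * np.exp(-0.5 * (t - 18.0) ** 2)

def simulate_day(lam_max=5.0, horizon=24.0):
    """Sample one day's event times by thinning a homogeneous process
    of rate lam_max (which must dominate rate(t))."""
    t, events = 0.0, []
    while True:
        t += rng.exponential(1.0 / lam_max)
        if t > horizon:
            return np.array(events)
        if rng.uniform() < rate(t) / lam_max:
            events.append(t)

def day_loglik(events, horizon=24.0, grid=2048):
    """Log-likelihood of one day treated as a single observation:
    sum of log rate(t_i) minus the integral of rate over the day."""
    ts = np.linspace(0.0, horizon, grid)
    vals = rate(ts)
    integral = np.sum((vals[:-1] + vals[1:]) / 2.0) * (ts[1] - ts[0])  # trapezoid rule
    return np.sum(np.log(rate(events))) - integral

events = simulate_day()
print(len(events), day_loglik(events))
```

Comparing such day-level log-likelihoods across a sequence of days is one natural ingredient for detecting day-to-day changes while ignoring the within-day pattern.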
In this paper, we introduce functional generalized canonical correlation analysis, a new framework for exploring associations between multiple random processes observed jointly. The framework is based on the multiblock regularized generalized canonical correlation analysis framework. It is robust to sparsely and irregularly observed data, making it applicable in many settings. We establish the monotonic property of the solving procedure and introduce a Bayesian approach for estimating canonical components. We propose an extension of the framework that allows the integration of a univariate or multivariate response into the analysis, paving the way for predictive applications. We evaluate the method's efficiency in simulation studies and present a use case on a longitudinal dataset.
"Functional generalized canonical correlation analysis for studying multiple longitudinal variables." Lucas Sort, Laurent Le Brusquet, Arthur Tenenhaus. Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae113.
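As background, the fully observed, unregularized two-block special case of this framework reduces to classical canonical correlation analysis, computable via an SVD of the whitened cross-covariance. The sketch below is that classical baseline, not the proposed functional estimator; data dimensions and the shared-signal construction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500
z = rng.standard_normal(n)                                 # shared latent signal
X = np.c_[z + 0.5 * rng.standard_normal(n), rng.standard_normal(n)]
Y = np.c_[rng.standard_normal(n), z + 0.5 * rng.standard_normal(n)]
X -= X.mean(axis=0)
Y -= Y.mean(axis=0)

Sxx, Syy, Sxy = X.T @ X / n, Y.T @ Y / n, X.T @ Y / n

def inv_sqrt(S):
    """Inverse square root of a symmetric positive-definite matrix."""
    w, V = np.linalg.eigh(S)
    return V @ np.diag(w ** -0.5) @ V.T

# Canonical correlations are the singular values of the whitened cross-covariance.
U, s, Vt = np.linalg.svd(inv_sqrt(Sxx) @ Sxy @ inv_sqrt(Syy))
print(s)   # leading value reflects the shared signal (about 0.8 in population)
```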
In regression-based analyses of group-level neuroimage data, researchers typically fit a series of marginal general linear models to image outcomes at each spatially referenced pixel. Spatial regularization of effects of interest is usually induced indirectly by applying spatial smoothing to the data during preprocessing. While this procedure often works well, the resulting inference can be poorly calibrated. Spatial modeling of effects of interest leads to more powerful analyses; however, the number of locations in a typical neuroimage can preclude standard computing methods in this setting. Here, we contribute a Bayesian spatial regression model for group-level neuroimaging analyses. We induce regularization of spatially varying regression coefficient functions through Gaussian process priors. When combined with a simple non-stationary model for the error process, our prior hierarchy can lead to more data-adaptive smoothing than standard methods. We achieve computational tractability through a Vecchia-type approximation of our prior that retains full spatial rank and can be constructed for a wide class of spatial correlation functions. We outline several ways to work with our model in practice and compare performance against standard vertex-wise analyses and several alternatives. Finally, we illustrate our methods in an analysis of cortical surface functional magnetic resonance imaging task contrast data from a large cohort of children enrolled in the Adolescent Brain Cognitive Development (ABCD) Study.
"Bayesian inference for group-level cortical surface image-on-scalar regression with Gaussian process priors." Andrew S Whiteman, Timothy D Johnson, Jian Kang. Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae116. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11518852/pdf/
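The Vecchia-type idea, replacing the joint Gaussian density with a product of conditionals, each given a small set of nearby points, can be sketched in one dimension. The exponential (Markov) kernel and problem sizes below are illustrative assumptions, not the paper's construction; with sorted 1-D locations and a Markov kernel the approximation happens to be essentially exact, which makes the sketch easy to sanity-check.

```python
import numpy as np

rng = np.random.default_rng(4)

def exp_cov(x, ell=0.3):
    """Exponential (Ornstein-Uhlenbeck) correlation on 1-D locations."""
    return np.exp(-np.abs(x[:, None] - x[None, :]) / ell)

n, m = 50, 5
x = np.sort(rng.uniform(size=n))
K = exp_cov(x) + 1e-8 * np.eye(n)                  # tiny nugget for stability
y = np.linalg.cholesky(K) @ rng.standard_normal(n)

def vecchia_loglik(y, x, m):
    """Vecchia factorization: log p(y) is approximated by the sum of
    log p(y_i | its m nearest preceding points)."""
    ll = 0.0
    for i in range(len(y)):
        c = np.argsort(np.abs(x[:i] - x[i]))[:m]   # conditioning set
        if len(c):
            Kcc = exp_cov(x[c]) + 1e-8 * np.eye(len(c))
            kic = exp_cov(np.r_[x[i], x[c]])[0, 1:]
            w = np.linalg.solve(Kcc, kic)
            mu, var = w @ y[c], 1.0 + 1e-8 - w @ kic
        else:
            mu, var = 0.0, 1.0 + 1e-8
        ll += -0.5 * (np.log(2 * np.pi * var) + (y[i] - mu) ** 2 / var)
    return ll

sign, logdet = np.linalg.slogdet(K)
exact = -0.5 * (n * np.log(2 * np.pi) + logdet + y @ np.linalg.solve(K, y))
print(vecchia_loglik(y, x, m), exact)   # nearly identical for this Markov kernel
```

The payoff is computational: each conditional involves only an m-by-m solve, so the cost scales linearly in the number of locations instead of cubically.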
Population-based cancer registry databases are critical resources to bridge the information gap that results from a lack of sufficient statistical power from primary cohort data with small to moderate sample size. Although comprehensive data associated with tumor biomarkers often remain either unavailable or inconsistently measured in these registry databases, aggregate survival information sourced from these repositories has been well documented and publicly accessible. An appealing option is to integrate the aggregate survival information from the registry data with the primary cohort to enhance the evaluation of treatment impacts or prediction of survival outcomes across distinct tumor subtypes. Nevertheless, for rare types of cancer, even the sample sizes of cancer registries remain modest. The variability linked to the aggregated statistics could be non-negligible compared with the sample variation of the primary cohort. In response, we propose an externally informed likelihood approach, which facilitates the linkage between the primary cohort and external aggregate data, with consideration of the variation from aggregate information. We establish the asymptotic properties of the estimators and evaluate the finite sample performance via simulation studies. Through the application of our proposed method, we integrate data from the cohort of inflammatory breast cancer (IBC) patients at the University of Texas MD Anderson Cancer Center with aggregate survival data from the National Cancer Data Base, enabling us to appraise the effect of tri-modality treatment on survival across various tumor subtypes of IBC.
"Likelihood adaptively incorporated external aggregate information with uncertainty for survival data." Ziqi Chen, Yu Shen, Jing Qin, Jing Ning. Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae120. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11518850/pdf/
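The idea of weighting external aggregate information by its own uncertainty can be caricatured in a one-parameter toy model: the external estimate enters the log-likelihood as a normal penalty with its sampling variance, rather than as a fixed constraint. The exponential survival model, sample sizes, and external summary below are invented for illustration.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(5)
y = rng.exponential(scale=2.0, size=80)    # small primary cohort (no censoring)
ext_mean, ext_se = 2.05, 0.05              # external aggregate estimate and SE

def neg_loglik(scale, use_external=True):
    ll = -len(y) * np.log(scale) - y.sum() / scale          # exponential likelihood
    if use_external:
        ll += -0.5 * (scale - ext_mean) ** 2 / ext_se ** 2  # uncertain external info
    return -ll

internal = minimize_scalar(neg_loglik, bounds=(0.1, 10), args=(False,), method="bounded").x
combined = minimize_scalar(neg_loglik, bounds=(0.1, 10), method="bounded").x
print(internal, combined)   # combined estimate is pulled toward the external value
```

Letting `ext_se` grow recovers the internal-only analysis, while `ext_se` near zero approaches treating the aggregate value as known, which is the non-negligible-variability point the abstract makes about rare cancers.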
Shuqi Wang, Peter F Thall, Kentaro Takeda, Ying Yuan
Optimizing doses for multiple indications is challenging. The pooled approach of finding a single optimal biological dose (OBD) for all indications ignores that dose-response or dose-toxicity curves may differ between indications, resulting in varying OBDs. Conversely, indication-specific dose optimization often requires a large sample size. To address this challenge, we propose a Randomized two-stage basket trial design that Optimizes doses in Multiple Indications (ROMI). In stage 1, for each indication, response and toxicity are evaluated for a high dose, which may be a previously obtained maximum tolerated dose, with a rule that stops accrual to indications where the high dose is unsafe or ineffective. Indications not terminated proceed to stage 2, where patients are randomized between the high dose and a specified lower dose. A latent-cluster Bayesian hierarchical model is employed to borrow information between indications, while considering the potential heterogeneity of the OBD across indications. Indication-specific utilities are used to quantify response-toxicity trade-offs. At the end of stage 2, for each indication with at least one acceptable dose, the dose with the highest posterior mean utility is selected as optimal. Two versions of ROMI are presented, one using only stage 2 data for dose optimization and the other optimizing doses using data from both stages.
"ROMI: a randomized two-stage basket trial design to optimize doses for multiple indications." Biometrics 80(4), 2024-10-03. DOI: 10.1093/biomtc/ujae105. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11447723/pdf/
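The utility step can be sketched with illustrative numbers: Beta posteriors for response and toxicity at each dose, a utility table over the four (response, toxicity) outcome pairs, and selection of the dose with the larger posterior mean utility. Everything here (utility weights, priors, counts, and the independence working model) is an assumption for illustration, not the paper's specification.

```python
import numpy as np

# Utility table scoring the four joint outcomes (rows: no response / response,
# columns: no toxicity / toxicity); values are invented.
U = np.array([[60, 0],
              [100, 40]])

def post_mean_utility(n_resp, n, n_tox, a=0.5, b=0.5):
    """Posterior mean utility at one dose under Beta(a, b) priors,
    treating response and toxicity as independent (working model)."""
    pR = (n_resp + a) / (n + a + b)     # posterior mean response probability
    pT = (n_tox + a) / (n + a + b)      # posterior mean toxicity probability
    probs = np.array([[(1 - pR) * (1 - pT), (1 - pR) * pT],
                      [pR * (1 - pT), pR * pT]])
    return float((U * probs).sum())

# Hypothetical stage 2 data: high dose responds a bit more but is more toxic.
high = post_mean_utility(n_resp=14, n=30, n_tox=9)
low = post_mean_utility(n_resp=12, n=30, n_tox=3)
best = "low" if low > high else "high"
print(round(high, 1), round(low, 1), best)
```

With these numbers the lower dose wins: its small loss in response probability is outweighed by the reduction in toxicity, which is exactly the trade-off the indication-specific utilities are meant to quantify.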
Razieh Nabi, Matteo Bonvini, Edward H Kennedy, Ming-Yueh Huang, Marcela Smid, Daniel O Scharfstein
Establishing cause-effect relationships from observational data often relies on untestable assumptions. It is crucial to know whether, and to what extent, the conclusions drawn from non-experimental studies are robust to potential unmeasured confounding. In this paper, we focus on the average causal effect (ACE) as our target of inference. We generalize the sensitivity analysis approach developed by Robins et al., Franks et al., and Zhou and Yao. We use semiparametric theory to derive the non-parametric efficient influence function of the ACE, for fixed sensitivity parameters. We use this influence function to construct a one-step, split sample, truncated estimator of the ACE. Our estimator depends on semiparametric models for the distribution of the observed data; importantly, these models do not impose any restrictions on the values of sensitivity analysis parameters. We establish sufficient conditions ensuring that our estimator has $\sqrt{n}$ asymptotics. We use our methodology to evaluate the causal effect of smoking during pregnancy on birth weight. We also evaluate the performance of the estimation procedure in a simulation study.
"Semiparametric sensitivity analysis: unmeasured confounding in observational studies." Razieh Nabi, Matteo Bonvini, Edward H Kennedy, Ming-Yueh Huang, Marcela Smid, Daniel O Scharfstein. Biometrics 80(4), 2024-10-03. doi:10.1093/biomtc/ujae106.
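When the sensitivity parameters are fixed at zero (no unmeasured confounding), a one-step, split-sample, truncated estimator of the ACE of the kind described above reduces to the familiar cross-fitted AIPW estimator. The sketch below illustrates that special case on simulated data; the specific nuisance models (logistic propensity, linear outcome regressions), two folds, and truncation at 0.05 are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 2))
# simulation ground truth: propensity and outcome models; true ACE = 1
e = 1 / (1 + np.exp(-(0.5 * X[:, 0] - 0.25 * X[:, 1])))
A = rng.binomial(1, e)
Y = 1.0 * A + X @ np.array([1.0, -0.5]) + rng.normal(size=n)

def fit_logistic(X, y, iters=25):
    """Logistic regression fit by Newton's method (with intercept)."""
    Z = np.column_stack([np.ones(len(X)), X])
    b = np.zeros(Z.shape[1])
    for _ in range(iters):
        p = 1 / (1 + np.exp(-Z @ b))
        W = p * (1 - p)
        b += np.linalg.solve(Z.T @ (W[:, None] * Z), Z.T @ (y - p))
    return lambda Xn: 1 / (1 + np.exp(-np.column_stack([np.ones(len(Xn)), Xn]) @ b))

def fit_linear(X, y):
    """Ordinary least squares (with intercept)."""
    Z = np.column_stack([np.ones(len(X)), X])
    b, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return lambda Xn: np.column_stack([np.ones(len(Xn)), Xn]) @ b

# split-sample (cross-fitted) one-step estimator:
# fit nuisances on one fold, evaluate the influence function on the other
folds = np.array_split(rng.permutation(n), 2)
psi = []
for test in folds:
    train = np.setdiff1d(np.arange(n), test)
    e_hat = fit_logistic(X[train], A[train])(X[test])
    e_hat = np.clip(e_hat, 0.05, 0.95)  # truncation of extreme propensities
    m1 = fit_linear(X[train][A[train] == 1], Y[train][A[train] == 1])(X[test])
    m0 = fit_linear(X[train][A[train] == 0], Y[train][A[train] == 0])(X[test])
    a, y = A[test], Y[test]
    phi = m1 - m0 + a * (y - m1) / e_hat - (1 - a) * (y - m0) / (1 - e_hat)
    psi.append(phi.mean())
ace_hat = float(np.mean(psi))
print(round(ace_hat, 2))
```

With correctly specified nuisance models and this sample size, the estimate lands close to the true ACE of 1; nonzero sensitivity parameters would shift the influence function, which this sketch does not attempt.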
Xinyuan Tian, Fan Li, Li Shen, Denise Esserman, Yize Zhao
Technological advancements in noninvasive imaging facilitate the construction of whole brain interconnected networks, known as brain connectivity. Existing approaches to analyze brain connectivity frequently disaggregate the entire network into a vector of unique edges or summary measures, leading to a substantial loss of information. Motivated by the need to explore the effect mechanism among genetic exposure, brain connectivity, and time to disease onset with maximum information extraction, we propose a Bayesian approach to model the effect pathway between each of these components while quantifying the mediating role of brain networks. To accommodate the biological architectures of brain connectivity constructed along white matter fiber tracts, we develop a structural model which includes a symmetric matrix-variate accelerated failure time model for disease onset and a symmetric matrix response regression for the network-variate mediator. We further impose within-graph sparsity and between-graph shrinkage to identify informative network configurations and eliminate the interference of noisy components. Simulations are carried out to confirm the advantages of our proposed method over existing alternatives. By applying the proposed method to the landmark Alzheimer's Disease Neuroimaging Initiative study, we obtain neurobiologically plausible insights that may inform future intervention strategies.
"Bayesian pathway analysis over brain network mediators for survival data." Xinyuan Tian, Fan Li, Li Shen, Denise Esserman, Yize Zhao. Biometrics 80(4), 2024-10-03. doi:10.1093/biomtc/ujae132. Open access: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11555425/pdf/
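The full model above involves shrinkage priors and posterior sampling, which this listing cannot reproduce; but the key data structure, a symmetric matrix-valued response in which each unique edge enters the regression exactly once, can be sketched. Below, per-edge least squares on the upper triangle of simulated connectivity matrices is an illustrative stand-in (all variable names, the exposure score, and the noise model are assumptions, not the authors' specification).

```python
import numpy as np

rng = np.random.default_rng(1)
V, n = 6, 200                       # V brain regions, n subjects
g = rng.normal(size=n)              # scalar exposure score (illustrative)

# true symmetric effect matrix with a few nonzero edges
B_true = np.zeros((V, V))
B_true[0, 1] = B_true[1, 0] = 1.5
B_true[2, 4] = B_true[4, 2] = -1.0

# simulate symmetric connectivity responses M_i = g_i * B + symmetric noise
M = np.empty((n, V, V))
for i in range(n):
    E = rng.normal(scale=0.5, size=(V, V))
    M[i] = g[i] * B_true + (E + E.T) / 2

# regress each unique upper-triangular edge on g (OLS, no intercept),
# then mirror the estimates back into a symmetric coefficient matrix
iu = np.triu_indices(V, k=1)
y_edges = M[:, iu[0], iu[1]]        # n x (V choose 2) edge responses
b_edges = (g @ y_edges) / (g @ g)   # per-edge least-squares slope
B_hat = np.zeros((V, V))
B_hat[iu] = b_edges
B_hat = B_hat + B_hat.T             # symmetry restored by construction
```

Working on the upper triangle avoids double-counting the symmetric edges; the proposed method goes further by placing within-graph sparsity and between-graph shrinkage on these edge effects rather than fitting them independently.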