
Journal of the Royal Statistical Society Series B-Statistical Methodology: Latest Articles

Mode-wise principal subspace pursuit and matrix spiked covariance model.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-09-02 eCollection Date: 2025-02-01 DOI: 10.1093/jrsssb/qkae088
Runshi Tang, Ming Yuan, Anru R Zhang

This paper introduces a novel framework called Mode-wise Principal Subspace Pursuit (MOP-UP) to extract hidden variations in both the row and column dimensions for matrix data. To enhance the understanding of the framework, we introduce a class of matrix-variate spiked covariance models that serve as inspiration for the development of the MOP-UP algorithm. The MOP-UP algorithm consists of two steps: Average Subspace Capture (ASC) and Alternating Projection. These steps are specifically designed to capture the row-wise and column-wise dimension-reduced subspaces which contain the most informative features of the data. ASC utilizes a novel average projection operator as initialization and achieves exact recovery in the noiseless setting. We analyse the convergence and non-asymptotic error bounds of MOP-UP, introducing a blockwise matrix eigenvalue perturbation bound that proves the desired bound, where classic perturbation bounds fail. The effectiveness and practical merits of the proposed framework are demonstrated through experiments on both simulated and real datasets. Lastly, we discuss generalizations of our approach to higher-order data.
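The idea of alternating between a row-side and a column-side subspace can be sketched roughly as follows. This is a generic alternating eigen-decomposition scheme written under stated assumptions, not the paper's MOP-UP algorithm: the initialisation below simply uses the averaged second-moment matrices and is not the ASC step, and the names `top_eigvecs` and `modewise_subspaces` are hypothetical.

```python
import numpy as np

def top_eigvecs(sym, r):
    """Eigenvectors of a symmetric matrix for its r largest eigenvalues."""
    _, vecs = np.linalg.eigh(sym)  # eigh returns eigenvalues in ascending order
    return vecs[:, -r:]

def modewise_subspaces(xs, r1, r2, n_iter=10):
    """xs has shape (n, p1, p2); returns (U, V) with orthonormal columns."""
    xs = xs - xs.mean(axis=0)  # centre the sample of matrices
    # Assumed initialisation (not the paper's ASC): averaged second moments.
    u = top_eigvecs(np.einsum("nij,nkj->ik", xs, xs), r1)  # sum of X X^T
    v = top_eigvecs(np.einsum("nji,njk->ik", xs, xs), r2)  # sum of X^T X
    for _ in range(n_iter):
        xv = xs @ v  # project columns onto V, shape (n, p1, r2)
        u = top_eigvecs(np.einsum("nij,nkj->ik", xv, xv), r1)
        xu = np.swapaxes(xs, 1, 2) @ u  # project rows onto U, shape (n, p2, r1)
        v = top_eigvecs(np.einsum("nij,nkj->ik", xu, xu), r2)
    return u, v
```

On noiseless inputs of the form `U @ A_i @ V.T`, this sketch returns bases spanning the same row and column subspaces as `U` and `V`; how it behaves under noise is not something the sketch itself establishes.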

Volume 87, Issue 1, pp. 232-255. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809223/pdf/
Citations: 0
Extended fiducial inference: toward an automated process of statistical inference.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-08-05 eCollection Date: 2025-02-01 DOI: 10.1093/jrsssb/qkae082
Faming Liang, Sehwan Kim, Yan Sun

While fiducial inference was widely considered a big blunder by R.A. Fisher, the goal he initially set, 'inferring the uncertainty of model parameters on the basis of observations', has been continually pursued by many statisticians. To this end, we develop a new statistical inference method called extended fiducial inference (EFI). The new method achieves the goal of fiducial inference by leveraging advanced statistical computing techniques while remaining scalable for big data. EFI involves jointly imputing random errors realized in observations using stochastic gradient Markov chain Monte Carlo and estimating the inverse function using a sparse deep neural network (DNN). The consistency of the sparse DNN estimator ensures that the uncertainty embedded in observations is properly propagated to model parameters through the estimated inverse function, thereby validating downstream statistical inference. Compared to frequentist and Bayesian methods, EFI offers significant advantages in parameter estimation and hypothesis testing. Specifically, EFI provides higher fidelity in parameter estimation, especially when outliers are present in the observations, and eliminates the need for theoretical reference distributions in hypothesis testing, thereby automating the statistical inference process. EFI also provides an innovative framework for semisupervised learning.

Volume 87, Issue 1, pp. 98-131. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809222/pdf/
Citations: 0
Catch me if you can: signal localization with knockoff e-values.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-06-14 eCollection Date: 2025-02-01 DOI: 10.1093/jrsssb/qkae042
Paula Gablenz, Chiara Sabatti

We consider problems where many, somewhat redundant, hypotheses are tested and we are interested in reporting the most precise rejections, with false discovery rate (FDR) control. This is the case, for example, when researchers are interested both in individual hypotheses as well as group hypotheses corresponding to intersections of sets of the original hypotheses, at several resolution levels. A concrete application is in genome-wide association studies, where, depending on the signal strengths, it might be possible to resolve the influence of individual genetic variants on a phenotype with greater or lower precision. To adapt to the unknown signal strength, analyses are conducted at multiple resolutions and researchers are most interested in the more precise discoveries. Assuring FDR control on the reported findings with these adaptive searches is, however, often impossible. To design a multiple comparison procedure that allows for an adaptive choice of resolution with FDR control, we leverage e-values and linear programming. We adapt this approach to problems where knockoffs and group knockoffs have been successfully applied to test conditional independence hypotheses. We demonstrate its efficacy by analysing data from the UK Biobank.
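For readers unfamiliar with e-values, the basic e-BH procedure gives a feel for how they can be turned into rejections with FDR control. The sketch below is that standard procedure only; it does not include the multi-resolution linear program or the knockoff-based e-values described in the abstract, and the function name `ebh` and the example numbers are illustrative assumptions.

```python
def ebh(e_values, alpha):
    """Return the sorted indices rejected by the e-BH procedure at level alpha."""
    m = len(e_values)
    # Order hypotheses from the largest e-value to the smallest.
    order = sorted(range(m), key=lambda j: e_values[j], reverse=True)
    k_star = 0
    for k, j in enumerate(order, start=1):
        # The k-th largest e-value must reach the threshold m / (alpha * k).
        if e_values[j] >= m / (alpha * k):
            k_star = k
    # Reject the hypotheses with the k_star largest e-values.
    return sorted(order[:k_star])

rejected = ebh([50.0, 0.5, 30.0, 1.2, 8.0], alpha=0.2)
```

In this toy call the thresholds are 25, 12.5, 8.33, and so on, so only the two largest e-values (indices 0 and 2) are rejected.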

Volume 87, Issue 1, pp. 56-73. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11809227/pdf/
Citations: 0
Model-assisted sensitivity analysis for treatment effects under unmeasured confounding via regularized calibrated estimation.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-05-03 eCollection Date: 2024-11-01 DOI: 10.1093/jrsssb/qkae034
Zhiqiang Tan

Consider sensitivity analysis for estimating average treatment effects under unmeasured confounding, assumed to satisfy a marginal sensitivity model. At the population level, we provide new representations for the sharp population bounds and doubly robust estimating functions. We also derive new, relaxed population bounds, depending on weighted linear outcome quantile regression. At the sample level, we develop new methods and theory for obtaining not only doubly robust point estimators for the relaxed population bounds with respect to misspecification of a propensity score model or an outcome mean regression model, but also model-assisted confidence intervals which are valid if the propensity score model is correctly specified, but the outcome quantile and mean regression models may be misspecified. The relaxed population bounds reduce to the sharp bounds if outcome quantile regression is correctly specified. For a linear outcome mean regression model, the confidence intervals are also doubly robust. Our methods involve regularized calibrated estimation, with Lasso penalties but carefully chosen loss functions, for fitting propensity score and outcome mean and quantile regression models. We present a simulation study and an empirical application to an observational study on the effects of right-heart catheterization. The proposed method is implemented in the R package RCALsa.

Volume 86, Issue 5, pp. 1339-1363. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11558804/pdf/
Citations: 0
Interpretable discriminant analysis for functional data supported on random nonlinear domains with an application to Alzheimer's disease.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-03-22 eCollection Date: 2024-09-01 DOI: 10.1093/jrsssb/qkae023
Eardi Lila, Wenbo Zhang, Swati Rane Levendovszky

We introduce a novel framework for the classification of functional data supported on nonlinear, and possibly random, manifold domains. The motivating application is the identification of subjects with Alzheimer's disease from their cortical surface geometry and associated cortical thickness map. The proposed model is based upon a reformulation of the classification problem as a regularized multivariate functional linear regression model. This allows us to adopt a direct approach to the estimation of the most discriminant direction while controlling for its complexity with appropriate differential regularization. Our approach does not require prior estimation of the covariance structure of the functional predictors, which is computationally prohibitive in our application setting. We provide a theoretical analysis of the out-of-sample prediction error of the proposed model and explore the finite sample performance in a simulation setting. We apply the proposed method to a pooled dataset from Alzheimer's Disease Neuroimaging Initiative and Parkinson's Progression Markers Initiative. Through this application, we identify discriminant directions that capture both cortical geometric and thickness predictive features of Alzheimer's disease that are consistent with the existing neuroscience literature.

Volume 86, Issue 4, pp. 1013-1044. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398888/pdf/
Citations: 0
GENIUS-MAWII: for robust Mendelian randomization with many weak invalid instruments.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-03-14 eCollection Date: 2024-09-01 DOI: 10.1093/jrsssb/qkae024
Ting Ye, Zhonghua Liu, Baoluo Sun, Eric Tchetgen Tchetgen

Mendelian randomization (MR) addresses causal questions using genetic variants as instrumental variables. We propose a new MR method, G-Estimation under No Interaction with Unmeasured Selection (GENIUS)-MAny Weak Invalid IV, which simultaneously addresses the two salient challenges in MR: many weak instruments and widespread horizontal pleiotropy. Similar to MR-GENIUS, we use heteroscedasticity of the exposure to identify the treatment effect. We derive influence functions of the treatment effect, then construct a continuous updating estimator and establish its asymptotic properties under a many-weak-invalid-instruments asymptotic regime by developing novel semiparametric theory. We also provide a measure of weak identification, an overidentification test, and a graphical diagnostic tool.

Volume 86, Issue 4, pp. 1045-1067. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398887/pdf/
Citations: 0
Doubly robust calibration of prediction sets under covariate shift.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-03-04 eCollection Date: 2024-09-01 DOI: 10.1093/jrsssb/qkae009
Yachong Yang, Arun Kumar Kuchibhotla, Eric Tchetgen Tchetgen

Conformal prediction has received tremendous attention in recent years and has offered new solutions to problems in missing data and causal inference; yet these advances have not leveraged modern semi-parametric efficiency theory for more efficient uncertainty quantification. We consider the problem of obtaining well-calibrated prediction regions that can data-adaptively account for a shift in the distribution of covariates between training and test data. Under a covariate shift assumption analogous to the standard missing at random assumption, we propose a general framework based on efficient influence functions to construct well-calibrated prediction regions for the unobserved outcome in the test sample without compromising coverage.
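As background, the plain split conformal interval (valid when calibration and test points are exchangeable, that is, without covariate shift) can be sketched as below. It is not the doubly robust construction proposed in the paper; the function name and the choice of absolute residuals as the conformity score are assumptions made for illustration.

```python
import math

def split_conformal_interval(cal_residuals, prediction, alpha):
    """Prediction interval for a new point from absolute calibration residuals."""
    n = len(cal_residuals)
    # Rank of the conformal quantile among the n calibration scores.
    k = math.ceil((n + 1) * (1 - alpha))
    if k > n:
        # Too few calibration points for this level: the interval is unbounded.
        return (-math.inf, math.inf)
    q = sorted(cal_residuals)[k - 1]
    return (prediction - q, prediction + q)
```

With nine calibration residuals and `alpha=0.25`, the rank is `ceil(10 * 0.75) = 8`, so the half-width is the eighth smallest residual. Handling covariate shift requires reweighting or, as in the abstract, an influence-function-based correction that this sketch omits.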

Volume 86, Issue 4, pp. 943-965. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11398884/pdf/
Citations: 0
Gradient synchronization for multivariate functional data, with application to brain connectivity.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2024-01-22 eCollection Date: 2024-07-01 DOI: 10.1093/jrsssb/qkad140
Yaqing Chen, Shu-Chin Lin, Yang Zhou, Owen Carmichael, Hans-Georg Müller, Jane-Ling Wang

Quantifying the association between components of multivariate random curves is of general interest and is a ubiquitous and basic problem that can be addressed with functional data analysis. An important application is the problem of assessing functional connectivity based on functional magnetic resonance imaging (fMRI), where one aims to determine the similarity of fMRI time courses that are recorded on anatomically separated brain regions. In the functional brain connectivity literature, the static temporal Pearson correlation has been the prevailing measure for functional connectivity. However, recent research has revealed temporally changing patterns of functional connectivity, leading to the study of dynamic functional connectivity. This motivates new similarity measures for pairs of random curves that reflect the dynamic features of functional similarity. Specifically, we introduce gradient synchronization measures in a general setting. These similarity measures are based on the concordance and discordance of the gradients between paired smooth random functions. Asymptotic normality of the proposed estimates is obtained under regularity conditions. We illustrate the proposed synchronization measures via simulations and an application to resting-state fMRI signals from the Alzheimer's Disease Neuroimaging Initiative and they are found to improve discrimination between subjects with different disease status.
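To make the notion of concordant and discordant gradients concrete, here is a deliberately simplified, Kendall-style score for two curves observed on a common grid. It is an illustrative assumption and not the paper's definition of gradient synchronization, which is formulated for smooth random functions; the function name is hypothetical.

```python
def gradient_concordance(x, y):
    """Share of concordant minus share of discordant slope signs on a common grid."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("curves must share a grid with at least two points")
    # Finite-difference slopes between consecutive grid points.
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    # Slopes are concordant when they share a sign and discordant when they differ.
    concordant = sum(1 for u, v in zip(dx, dy) if u * v > 0)
    discordant = sum(1 for u, v in zip(dx, dy) if u * v < 0)
    return (concordant - discordant) / len(dx)
```

The score is 1 when the two curves always move in the same direction, -1 when they always move in opposite directions, and near 0 when there is no systematic pattern. Noisy signals such as fMRI time courses would need smoothing before differencing, which this sketch leaves out.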

Volume 86, Issue 3, pp. 694-713. PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11239314/pdf/
Citations: 0
Identification and estimation of causal peer effects using double negative controls for unmeasured network confounding.
IF 3.1 Zone 1 (Mathematics) Q1 STATISTICS & PROBABILITY Pub Date: 2023-12-15 eCollection Date: 2024-04-01 DOI: 10.1093/jrsssb/qkad132
Naoki Egami, Eric J Tchetgen Tchetgen

Identification and estimation of causal peer effects are challenging in observational studies for two reasons. The first is the identification challenge due to unmeasured network confounding, for example, homophily bias and contextual confounding. The second is network dependence of observations. We establish a framework that leverages a pair of negative control outcome and exposure variables (double negative controls) to non-parametrically identify causal peer effects in the presence of unmeasured network confounding. We then propose a generalised method of moments estimator and establish its consistency and asymptotic normality under an assumption about ψ-network dependence. Finally, we provide a consistent variance estimator.

在观察性研究中,因果同伴效应的识别和估计具有挑战性,原因有二。首先是由于未测量的网络混杂因素(如同质性偏差和背景混杂因素)造成的识别挑战。其次是观察结果的网络依赖性。我们建立了一个框架,利用一对负控制结果和暴露变量(双负控制),在存在未测量网络混杂的情况下,非参数地识别因果同伴效应。然后,我们提出了一种广义矩估计方法,并在ψ网络依赖性假设下确定了其一致性和渐近正态性。最后,我们提供了一个一致的方差估计器。
Journal of the Royal Statistical Society Series B-Statistical Methodology, vol. 86, no. 2, pp. 487-511. Published 2023-12-15. Open-access PDF: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC11009281/pdf/
Citations: 0
Controlling the false discovery rate in transformational sparsity: Split Knockoffs
Tier 1 Mathematics Q1 STATISTICS & PROBABILITY Pub Date : 2023-11-14 DOI: 10.1093/jrsssb/qkad126
Yang Cao, Xinwei Sun, Yuan Yao
Controlling the False Discovery Rate (FDR) in a variable selection procedure is critical for reproducible discoveries, and it has been extensively studied in sparse linear models. However, it remains largely open in scenarios where the sparsity constraint is not directly imposed on the parameters but on a linear transformation of the parameters to be estimated. Examples of such scenarios include total variations, wavelet transforms, fused LASSO, and trend filtering. In this paper, we propose a data-adaptive FDR control method, called the Split Knockoff method, for this transformational sparsity setting. The proposed method exploits both variable and data splitting. The linear transformation constraint is relaxed to its Euclidean proximity in a lifted parameter space, which yields an orthogonal design that enables the orthogonal Split Knockoff construction. To overcome the challenge that exchangeability fails due to the heterogeneous noise brought by the transformation, new inverse supermartingale structures are developed via data splitting for provable FDR control without sacrificing power. Simulation experiments demonstrate that the proposed methodology achieves the desired FDR and power. We also provide an application to an Alzheimer’s Disease study, where atrophied brain regions and their abnormal connections can be discovered based on a structural Magnetic Resonance Imaging dataset.
Citations: 2