
arXiv - STAT - Statistics Theory: Latest Papers

Personalized and uncertainty-aware coronary hemodynamics simulations: From Bayesian estimation to improved multi-fidelity uncertainty quantification
Pub Date : 2024-09-03 DOI: arxiv-2409.02247
Karthik Menon, Andrea Zanoni, Owais Khan, Gianluca Geraci, Koen Nieman, Daniele E. Schiavazzi, Alison L. Marsden
Simulations of coronary hemodynamics have improved non-invasive clinical risk stratification and treatment outcomes for coronary artery disease, compared to relying on anatomical imaging alone. However, simulations typically use empirical approaches to distribute total coronary flow amongst the arteries in the coronary tree. This ignores patient variability, the presence of disease, and other clinical factors. Further, uncertainty in the clinical data often remains unaccounted for in the modeling pipeline. We present an end-to-end uncertainty-aware pipeline to (1) personalize coronary flow simulations by incorporating branch-specific coronary flows as well as cardiac function; and (2) predict clinical and biomechanical quantities of interest with improved precision, while accounting for uncertainty in the clinical data. We assimilate patient-specific measurements of myocardial blood flow from CT myocardial perfusion imaging to estimate branch-specific coronary flows. We use adaptive Markov Chain Monte Carlo sampling to estimate the joint posterior distributions of model parameters with simulated noise in the clinical data. Additionally, we determine the posterior predictive distribution for relevant quantities of interest using a new approach combining multi-fidelity Monte Carlo estimation with non-linear, data-driven dimensionality reduction. Our framework recapitulates clinically measured cardiac function as well as branch-specific coronary flows under measurement uncertainty. We substantially shrink the confidence intervals for estimated quantities of interest compared to single-fidelity and state-of-the-art multi-fidelity Monte Carlo methods. This is especially true for quantities that showed limited correlation between the low- and high-fidelity model predictions. Moreover, the proposed estimators are significantly cheaper to compute for a specified confidence level or variance.
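The multi-fidelity Monte Carlo idea mentioned in the abstract can be illustrated with a standard two-fidelity control-variate estimator. This is only a minimal sketch of the classical technique, not the paper's new estimator with dimensionality reduction; the toy models `high_fidelity` and `low_fidelity`, the uniform input distribution, and the sample sizes are all assumptions made for illustration.

```python
import math
import random


def high_fidelity(x):
    # Hypothetical "expensive" model, a toy stand-in for a full simulation.
    return math.exp(x)


def low_fidelity(x):
    # Hypothetical cheap surrogate: second-order Taylor expansion of exp(x).
    return 1.0 + x + 0.5 * x * x


def mean(values):
    return sum(values) / len(values)


def two_fidelity_estimate(n_high, n_low, rng):
    """Control-variate estimate of E[high_fidelity(X)] for X ~ Uniform(0, 1).

    The expensive model is run on n_high inputs; the cheap model is run on
    all n_low >= n_high inputs, which include the n_high shared ones.
    """
    xs = [rng.random() for _ in range(n_low)]
    hf = [high_fidelity(x) for x in xs[:n_high]]
    lf_all = [low_fidelity(x) for x in xs]
    lf_paired = lf_all[:n_high]

    # Estimate the control-variate weight cov(HF, LF) / var(LF) from the pairs.
    hf_bar, lf_bar = mean(hf), mean(lf_paired)
    cov = sum((h - hf_bar) * (l - lf_bar) for h, l in zip(hf, lf_paired))
    var = sum((l - lf_bar) ** 2 for l in lf_paired)
    weight = cov / var

    # Correct the small-sample HF mean with the better-resolved LF mean.
    return hf_bar + weight * (mean(lf_all) - lf_bar)


if __name__ == "__main__":
    estimate = two_fidelity_estimate(50, 5000, random.Random(0))
    print(estimate, "exact:", math.e - 1.0)
```

The correction term helps most when the two models are strongly correlated, which is consistent with the abstract's remark that weakly correlated quantities are the hard case for standard multi-fidelity estimators.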
Citations: 0
Convergence of Noise-Free Sampling Algorithms with Regularized Wasserstein Proximals
Pub Date : 2024-09-03 DOI: arxiv-2409.01567
Fuqun Han, Stanley Osher, Wuchen Li
In this work, we investigate the convergence properties of the backward regularized Wasserstein proximal (BRWP) method for sampling a target distribution. The BRWP approach can be shown as a semi-implicit time discretization for a probability flow ODE with the score function whose density satisfies the Fokker-Planck equation of the overdamped Langevin dynamics. Specifically, the evolution of the score function is computed using a kernel formula derived from the regularized Wasserstein proximal operator. By applying the Laplace method to obtain the asymptotic expansion of this kernel formula, we establish guaranteed convergence in terms of the Kullback-Leibler divergence for the BRWP method towards a strongly log-concave target distribution. Our analysis also identifies the optimal and maximum step sizes for convergence. Furthermore, we demonstrate that the deterministic and semi-implicit BRWP scheme outperforms many classical Langevin Monte Carlo methods, such as the Unadjusted Langevin Algorithm (ULA), by offering faster convergence and reduced bias. Numerical experiments further validate the convergence analysis of the BRWP method.
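For context on the baseline named in the abstract, the following is a minimal sketch of the Unadjusted Langevin Algorithm (ULA), not of BRWP itself. It assumes a one-dimensional standard Gaussian target, for which the discretization bias of ULA can be written in closed form; the function names and step size are illustrative choices.

```python
import math
import random


def ula_chain(grad_log_density, x0, step, n_steps, rng):
    """Run ULA: x <- x + step * grad log pi(x) + sqrt(2 * step) * N(0, 1)."""
    x = x0
    samples = []
    for _ in range(n_steps):
        noise = rng.gauss(0.0, 1.0)
        x = x + step * grad_log_density(x) + math.sqrt(2.0 * step) * noise
        samples.append(x)
    return samples


def ula_stationary_variance(step):
    """Stationary variance of ULA for the target N(0, 1), valid for 0 < step < 2.

    The update becomes x <- (1 - step) * x + sqrt(2 * step) * noise, so the
    variance v solves v = (1 - step)^2 * v + 2 * step, i.e. v = 1 / (1 - step / 2).
    """
    return 1.0 / (1.0 - step / 2.0)


if __name__ == "__main__":
    chain = ula_chain(lambda x: -x, 0.0, 0.5, 50000, random.Random(1))
    m = sum(chain) / len(chain)
    v = sum((s - m) ** 2 for s in chain) / len(chain)
    print("empirical variance:", v, "predicted:", ula_stationary_variance(0.5))
```

The target variance is 1, so any positive step size leaves ULA with an inflated stationary variance. This is the kind of step-size bias that the abstract says a deterministic scheme can reduce.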
Citations: 0
Multivariate Inference of Network Moments by Subsampling
Pub Date : 2024-09-03 DOI: arxiv-2409.01599
Mingyu Qi, Tianxi Li, Wen Zhou
In this paper, we study the characterization of a network population by analyzing a single observed network, focusing on the counts of multiple network motifs or their corresponding multivariate network moments. We introduce an algorithm based on node subsampling to approximate the nontrivial joint distribution of the network moments, and prove its asymptotic accuracy. By examining the joint distribution of these moments, our approach captures complex dependencies among network motifs, making a significant advancement over earlier methods that rely on individual motifs marginally. This enables more accurate and robust network inference. Through real-world applications, such as comparing coexpression networks of distinct gene sets and analyzing collaboration patterns within the statistical community, we demonstrate that the multivariate inference of network moments provides deeper insights than marginal approaches, thereby enhancing our understanding of network mechanisms.
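The node-subsampling idea can be sketched as follows: repeatedly draw a random subset of nodes, compute several motif densities on the induced subgraph, and treat the resulting vectors as draws from an approximate joint distribution. This is an illustrative sketch only. The choice of motifs (edges and triangles), the normalization, and the names `motif_densities` and `subsample_moments` are assumptions, and the paper's actual algorithm and centering may differ.

```python
import itertools
import random


def motif_densities(adj, nodes):
    """Edge and triangle densities of the subgraph induced by `nodes`.

    `adj` maps each node to the set of its neighbours; needs at least 3 nodes.
    """
    nodes = list(nodes)
    b = len(nodes)
    edges = sum(1 for u, v in itertools.combinations(nodes, 2) if v in adj[u])
    triangles = sum(
        1
        for u, v, w in itertools.combinations(nodes, 3)
        if v in adj[u] and w in adj[u] and w in adj[v]
    )
    n_pairs = b * (b - 1) / 2
    n_triples = b * (b - 1) * (b - 2) / 6
    return edges / n_pairs, triangles / n_triples


def subsample_moments(adj, b, n_rep, rng):
    """Joint draws of (edge density, triangle density) over random b-node subsets."""
    all_nodes = list(adj)
    return [motif_densities(adj, rng.sample(all_nodes, b)) for _ in range(n_rep)]


if __name__ == "__main__":
    ring = {i: {(i - 1) % 8, (i + 1) % 8} for i in range(8)}
    for pair in subsample_moments(ring, 5, 3, random.Random(0)):
        print(pair)
```

Keeping the two densities together in one tuple per subsample is what allows their dependence to be studied jointly rather than motif by motif.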
Citations: 0
Simulation-calibration testing for inference in Lasso regressions
Pub Date : 2024-09-03 DOI: arxiv-2409.02269
Matthieu Pluntz, Cyril Dalmasso, Pascale Tubert-Bitter, Ismail Ahmed
We propose a test of the significance of a variable appearing on the Lasso path and use it in a procedure for selecting one of the models of the Lasso path, controlling the Family-Wise Error Rate. Our null hypothesis depends on a set A of already selected variables and states that it contains all the active variables. We focus on the regularization parameter value from which a first variable outside A is selected. As the test statistic, we use this quantity's conditional p-value, which we define conditional on the non-penalized estimated coefficients of the model restricted to A. We estimate this by simulating outcome vectors and then calibrating them on the observed outcome's estimated coefficients. We adapt the calibration heuristically to the case of generalized linear models in which it turns into an iterative stochastic procedure. We prove that the test controls the risk of selecting a false positive in linear models, both under the null hypothesis and, under a correlation condition, when A does not contain all active variables. We assess the performance of our procedure through extensive simulation studies. We also illustrate it in the detection of exposures associated with drug-induced liver injuries in the French pharmacovigilance database.
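The quantity the abstract focuses on, the regularization value at which a first variable is selected, has a simple closed form in the simplest case. The sketch below assumes an empty set A, a linear model with no intercept, and the objective scaling (1 / (2n)) * ||y - Xb||^2 + lambda * ||b||_1; these are assumptions for illustration and are not taken from the paper, and the function name `first_entry_lambda` is hypothetical. The simulation-calibration test itself is not reproduced here.

```python
def first_entry_lambda(X, y):
    """Return (lambda_max, j) for the Lasso with objective
    (1 / (2 n)) * ||y - X b||^2 + lambda * ||b||_1 and no intercept.

    For lambda >= lambda_max every coefficient is zero; just below it,
    column j is the first variable to enter the path.
    X is a list of rows, y a list of outcomes.
    """
    n = len(y)
    n_cols = len(X[0])
    # Absolute inner product of each column with y, scaled by 1 / n.
    scores = [
        abs(sum(X[i][j] * y[i] for i in range(n))) / n for j in range(n_cols)
    ]
    j_star = max(range(n_cols), key=scores.__getitem__)
    return scores[j_star], j_star


if __name__ == "__main__":
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [1.0, 2.0, 3.0]
    print(first_entry_lambda(X, y))
```

With a nonempty A, one natural variant would replace y by its residual after a least-squares fit on the columns in A, but whether that matches the paper's construction is not stated in the abstract.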
Citations: 0
On the Pinsker bound of inner product kernel regression in large dimensions
Pub Date : 2024-09-02 DOI: arxiv-2409.00915
Weihao Lu, Jialin Ding, Haobo Zhang, Qian Lin
Building on recent studies of large-dimensional kernel regression, particularly those involving inner product kernels on the sphere $\mathbb{S}^{d}$, we investigate the Pinsker bound for inner product kernel regression in such settings. Specifically, we address the scenario where the sample size $n$ is given by $\alpha d^{\gamma}(1+o_{d}(1))$ for some $\alpha, \gamma>0$. We have determined the exact minimax risk for kernel regression in this setting, not only identifying the minimax rate but also the exact constant, known as the Pinsker constant, associated with the excess risk.
Citations: 0
On tail inference in iid settings with nonnegative extreme value index
Pub Date : 2024-09-02 DOI: arxiv-2409.00906
Taku Moriyama
In extreme value inference it is a fundamental problem how the target value is required to be extreme by the extreme value theory. In iid settings this study both theoretically and numerically compares tail estimators, which are based on either or both of the extreme value theory and the nonparametric smoothing. This study considers tail probability estimation and mean excess function estimation. This study assumes that the extreme value index of the underlying distribution is nonnegative. Specifically, the Hall class or the Weibull class of distributions is supposed in order to obtain the convergence rates of the estimators. This study investigates the nonparametric kernel type estimators, the fitting estimators to the generalized Pareto distribution and the plug-in estimators of the Hall distribution, which was proposed by Hall and Weissman (1997). In simulation studies the mean squared errors of the estimators in some finite sample cases are compared.
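Two of the objects discussed in the abstract have simple textbook estimators, sketched below: the empirical mean excess function and the Hill estimator of a positive extreme value index. These are standard stand-ins chosen for illustration, not the kernel-type, generalized Pareto, or Hall plug-in estimators the study compares; the function names are hypothetical.

```python
import math


def mean_excess(sample, u):
    """Empirical mean excess e(u): average of (x - u) over observations x > u."""
    excesses = [x - u for x in sample if x > u]
    if not excesses:
        raise ValueError("no observations exceed the threshold u")
    return sum(excesses) / len(excesses)


def hill_estimator(sample, k):
    """Hill estimate of a positive extreme value index from the k largest values.

    Uses the (k+1)-th largest observation as the threshold, which must be positive.
    """
    ordered = sorted(sample, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    threshold = ordered[k]
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    return sum(math.log(x / threshold) for x in ordered[:k]) / k


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(mean_excess(data, 3.0))
    print(hill_estimator(data, 2))
```

Both estimators depend on a tuning choice (the threshold u or the number k of upper order statistics), which relates to the abstract's opening question of how extreme the target must be for extreme value theory to apply.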
Citations: 0
Confidence regions for the multidimensional density in the uniform norm based on the recursive Wolverton-Wagner estimation
Pub Date : 2024-09-02 DOI: arxiv-2409.01451
Maria Rosaria Formica, Eugeny Ostrovsky, Leonid Sirota
We construct an optimal exponential tail decreasing confidence region for an unknown density of distribution in the Lebesgue-Riesz as well as in the uniform norm, built on the sample of the random vectors based on the famous recursive Wolverton-Wagner density estimation.
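The recursive Wolverton-Wagner estimator updates a kernel density estimate one observation at a time, so that f_n(x) = ((n - 1) / n) * f_{n-1}(x) + K((x - X_n) / h_n) / (n * h_n^d). The sketch below is a one-dimensional illustration (d = 1) on a fixed grid. The Gaussian kernel and the bandwidth schedule h_i = i^(-1/5) are assumptions, since the abstract specifies neither, and the confidence-region construction is not reproduced.

```python
import math


def gaussian_kernel(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def bandwidth(i):
    # Hypothetical bandwidth schedule h_i = i^(-1/5); not taken from the paper.
    return i ** (-0.2)


class RecursiveDensity:
    """One-dimensional recursive kernel density estimate on a fixed grid."""

    def __init__(self, grid):
        self.grid = list(grid)
        self.values = [0.0] * len(self.grid)
        self.n = 0

    def update(self, x_new):
        """Fold in one observation without revisiting earlier data."""
        self.n += 1
        n, h = self.n, bandwidth(self.n)
        self.values = [
            (n - 1) / n * f + gaussian_kernel((g - x_new) / h) / (n * h)
            for f, g in zip(self.values, self.grid)
        ]


def direct_estimate(data, x):
    """Non-recursive form: (1 / n) * sum_i K((x - X_i) / h_i) / h_i."""
    total = sum(
        gaussian_kernel((x - xi) / bandwidth(i)) / bandwidth(i)
        for i, xi in enumerate(data, start=1)
    )
    return total / len(data)


if __name__ == "__main__":
    data = [0.3, -1.2, 0.8, 2.0]
    est = RecursiveDensity([0.0, 1.0])
    for x in data:
        est.update(x)
    print(est.values, [direct_estimate(data, g) for g in est.grid])
```

Because each observation keeps the bandwidth it was assigned on arrival, the recursive and direct forms agree exactly, and the estimate can be maintained on streaming data with constant memory per grid point.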
Citations: 0
A computational transition for detecting correlated stochastic block models by low-degree polynomials
Pub Date : 2024-09-02 DOI: arxiv-2409.00966
Guanyi Chen, Jian Ding, Shuyang Gong, Zhangsong Li
Detection of correlation in a pair of random graphs is a fundamental statistical and computational problem that has been extensively studied in recent years. In this work, we consider a pair of correlated (sparse) stochastic block models $\mathcal{S}(n,\tfrac{\lambda}{n};k,\epsilon;s)$ that are subsampled from a common parent stochastic block model $\mathcal{S}(n,\tfrac{\lambda}{n};k,\epsilon)$ with $k=O(1)$ symmetric communities, average degree $\lambda=O(1)$, divergence parameter $\epsilon$, and subsampling probability $s$. For the detection problem of distinguishing this model from a pair of independent Erdős-Rényi graphs with the same edge density $\mathcal{G}(n,\tfrac{\lambda s}{n})$, we focus on tests based on low-degree polynomials of the entries of the adjacency matrices, and we determine the threshold that separates the easy and hard regimes. More precisely, we show that this class of tests can distinguish these two models if and only if $s > \min\{\sqrt{\alpha}, \frac{1}{\lambda\epsilon^2}\}$, where $\alpha \approx 0.338$ is Otter's constant and $\frac{1}{\lambda\epsilon^2}$ is the Kesten-Stigum threshold. Our proof of low-degree hardness is based on a conditional variant of the low-degree likelihood calculation.
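The threshold stated in the abstract is easy to evaluate for given parameters. The helper below simply encodes the condition $s > \min\{\sqrt{\alpha}, 1/(\lambda\epsilon^2)\}$ with the approximate value of Otter's constant quoted above. The function names are hypothetical, and the result is an asymptotic statement about low-degree tests, so the code says nothing about finite graphs or the boundary case.

```python
import math

# Approximate value of Otter's constant, as quoted in the abstract.
OTTER_ALPHA = 0.338


def kesten_stigum_threshold(lam, eps):
    """The quantity 1 / (lambda * epsilon^2) from the abstract."""
    return 1.0 / (lam * eps ** 2)


def low_degree_detectable(s, lam, eps):
    """True when s exceeds min(sqrt(alpha), 1 / (lambda * epsilon^2))."""
    return s > min(math.sqrt(OTTER_ALPHA), kesten_stigum_threshold(lam, eps))


if __name__ == "__main__":
    print(low_degree_detectable(0.6, 1.0, 1.0))
    print(low_degree_detectable(0.5, 1.0, 1.0))
    print(low_degree_detectable(0.2, 10.0, 1.0))
```

When $\lambda\epsilon^2$ is large the Kesten-Stigum term is the smaller of the two, so community structure lowers the subsampling probability needed for detection below $\sqrt{\alpha}$.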
Citations: 0
Simultaneous Inference for Non-Stationary Random Fields, with Application to Gridded Data Analysis
Pub Date : 2024-09-02 DOI: arxiv-2409.01220
Yunyi Zhang, Zhou Zhou
Current statistics literature on statistical inference of random fields typically assumes that the fields are stationary or focuses on models of non-stationary Gaussian fields with parametric/semiparametric covariance families, which may not be sufficiently flexible to tackle complex modern-era random field data. This paper performs simultaneous nonparametric statistical inference for a general class of non-stationary and non-Gaussian random fields by modeling the fields as nonlinear systems with location-dependent transformations of an underlying `shift random field'. Asymptotic results, including concentration inequalities and Gaussian approximation theorems for high dimensional sparse linear forms of the random field, are derived. A computationally efficient locally weighted multiplier bootstrap algorithm is proposed and theoretically verified as a unified tool for the simultaneous inference of the aforementioned non-stationary non-Gaussian random field. Simulations and real-life data examples demonstrate good performances and broad applications of the proposed algorithm.
Citations: 0
Stein transport for Bayesian inference
Pub Date : 2024-09-02 DOI: arxiv-2409.01464
Nikolas Nüsken
We introduce $\textit{Stein transport}$, a novel methodology for Bayesian inference designed to efficiently push an ensemble of particles along a predefined curve of tempered probability distributions. The driving vector field is chosen from a reproducing kernel Hilbert space and can be derived either through a suitable kernel ridge regression formulation or as an infinitesimal optimal transport map in the Stein geometry. The update equations of Stein transport resemble those of Stein variational gradient descent (SVGD), but introduce a time-varying score function as well as specific weights attached to the particles. While SVGD relies on convergence in the long-time limit, Stein transport reaches its posterior approximation at finite time $t=1$. Studying the mean-field limit, we discuss the errors incurred by regularisation and finite-particle effects, and we connect Stein transport to birth-death dynamics and Fisher-Rao gradient flows. In a series of experiments, we show that in comparison to SVGD, Stein transport not only often reaches more accurate posterior approximations with a significantly reduced computational budget, but that it also effectively mitigates the variance collapse phenomenon commonly observed in SVGD.
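Since the abstract describes Stein transport by contrast with SVGD, a minimal sketch of one standard SVGD update may help fix ideas. This is the baseline method only, not Stein transport: there is no tempering, time-varying score, or particle weighting here. The one-dimensional setting, the RBF kernel with fixed bandwidth `h`, and the function names are assumptions for illustration.

```python
import math


def rbf_kernel(x, y, h):
    return math.exp(-((x - y) ** 2) / (2.0 * h * h))


def svgd_step(particles, score, step, h=1.0):
    """One SVGD update for 1-D particles.

    Each particle xi moves along
        phi(xi) = (1 / n) * sum_j [k(xj, xi) * score(xj) + d/dxj k(xj, xi)],
    where `score` is the gradient of the log target density.
    """
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = rbf_kernel(xj, xi, h)
            drift = k * score(xj)                 # pulls xi toward high density
            repulsion = -(xj - xi) / (h * h) * k  # d/dxj k(xj, xi): pushes xi away from xj
            phi += drift + repulsion
        updated.append(xi + step * phi / n)
    return updated


if __name__ == "__main__":
    particles = [-1.0, 1.0]
    for _ in range(3):
        particles = svgd_step(particles, lambda x: -x, 0.1)
    print(particles)
```

The repulsion term is what keeps particles from collapsing onto the mode. The variance collapse mentioned in the abstract concerns regimes where this mechanism is not strong enough.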
Citations: 0