
arXiv - STAT - Statistics Theory: Latest Papers

Asymptotics for Random Quadratic Transportation Costs
Pub Date : 2024-09-13 DOI: arxiv-2409.08612
Martin Huesmann, Michael Goldman, Dario Trevisan
We establish the validity of asymptotic limits for the general transportation problem between random i.i.d. points and their common distribution, with respect to the squared Euclidean distance cost, in any dimension larger than three. Previous results were essentially limited to the two (or one) dimensional case, or to distributions whose absolutely continuous part is uniform. The proof relies upon recent advances in the stability theory of optimal transportation, combined with functional analytic techniques and some ideas from quantitative stochastic homogenization. The key tool we develop is a quantitative upper bound for the usual quadratic optimal transportation problem in terms of its boundary variant, where points can be freely transported along the boundary. The methods we use are applicable to more general random measures, including occupation measure of Brownian paths, and may open the door to further progress on challenging problems at the interface of analysis, probability, and discrete mathematics.
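As a rough numerical illustration of the kind of quantity studied here, the sketch below computes the empirical quadratic transport cost between two i.i.d. samples in one dimension, where the sorted (monotone) matching is known to be optimal for the squared-distance cost. This is only a toy under stated assumptions: the paper treats dimension larger than three and transport to the common distribution itself, neither of which is reproduced here, and the function name `quadratic_matching_cost_1d` is hypothetical.

```python
import random

def quadratic_matching_cost_1d(xs, ys):
    """Average squared distance under the sorted (monotone) matching.

    In one dimension, sorting both samples gives an optimal matching
    for the squared-distance cost between equal-size samples.
    """
    if len(xs) != len(ys):
        raise ValueError("samples must have the same size")
    pairs = zip(sorted(xs), sorted(ys))
    return sum((x - y) ** 2 for x, y in pairs) / len(xs)

if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so runs are repeatable
    for n in (100, 1000, 10000):
        xs = [rng.random() for _ in range(n)]
        ys = [rng.random() for _ in range(n)]
        print(n, quadratic_matching_cost_1d(xs, ys))
```

Running the script prints the cost for growing sample sizes, which gives a feel for how the matching cost shrinks with n in the one-dimensional uniform case.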
Citations: 0
On spiked eigenvalues of a renormalized sample covariance matrix from multi-population
Pub Date : 2024-09-13 DOI: arxiv-2409.08715
Weiming Li, Zeng Li, Junpeng Zhu
Sample covariance matrices from multiple populations typically exhibit several large spiked eigenvalues, which stem from differences between population means and are crucial for inference on the underlying data structure. This paper investigates the asymptotic properties of spiked eigenvalues of a renormalized sample covariance matrix from multiple populations in the ultrahigh dimensional context where the dimension-to-sample size ratio $p/n$ goes to infinity. The first- and second-order convergence of these spikes is established based on asymptotic properties of three types of sesquilinear forms from multiple populations. These findings are further applied to two scenarios, including determination of the total number of subgroups and a new criterion for evaluating clustering results in the absence of true labels. Additionally, we provide a unified framework with $p/n \to c \in (0, \infty]$ that integrates the asymptotic results in both high and ultrahigh dimensional settings.
Citations: 0
Improved Finite-Particle Convergence Rates for Stein Variational Gradient Descent
Pub Date : 2024-09-13 DOI: arxiv-2409.08469
Krishnakumar Balasubramanian, Sayan Banerjee, Promit Ghosal
We provide finite-particle convergence rates for the Stein Variational Gradient Descent (SVGD) algorithm in the Kernel Stein Discrepancy ($\mathsf{KSD}$) and Wasserstein-2 metrics. Our key insight is the observation that the time derivative of the relative entropy between the joint density of $N$ particle locations and the $N$-fold product target measure, starting from a regular initial distribution, splits into a dominant 'negative part' proportional to $N$ times the expected $\mathsf{KSD}^2$ and a smaller 'positive part'. This observation leads to $\mathsf{KSD}$ rates of order $1/\sqrt{N}$, providing a near optimal double exponential improvement over the recent result by \cite{shi2024finite}. Under mild assumptions on the kernel and potential, these bounds also grow linearly in the dimension $d$. By adding a bilinear component to the kernel, the above approach is used to further obtain Wasserstein-2 convergence. For the case of 'bilinear + Matérn' kernels, we derive Wasserstein-2 rates that exhibit a curse-of-dimensionality similar to the i.i.d. setting. We also obtain marginal convergence and long-time propagation of chaos results for the time-averaged particle laws.
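For readers unfamiliar with SVGD, a minimal sketch of the standard particle update may help. It assumes scalar particles, a Gaussian (RBF) kernel with a fixed bandwidth, and a standard normal target; none of these choices come from the abstract, and the kernels analysed in the paper (including the bilinear component) are not reproduced. The names `svgd_step` and `run_svgd` are hypothetical.

```python
import math

def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One SVGD update for scalar particles with a Gaussian (RBF) kernel."""
    n = len(particles)
    h2 = bandwidth ** 2
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * h2))
            # Attraction: kernel-weighted score evaluated at x_j.
            # Repulsion: derivative of k(x_j, x_i) with respect to x_j.
            phi += k * grad_log_p(xj) + k * (xi - xj) / h2
        updated.append(xi + step * phi / n)
    return updated

def run_svgd(particles, grad_log_p, n_steps=1000, step=0.1, bandwidth=1.0):
    """Apply svgd_step repeatedly and return the final particle positions."""
    for _ in range(n_steps):
        particles = svgd_step(particles, grad_log_p, step, bandwidth)
    return particles

if __name__ == "__main__":
    # Assumed target: standard normal, so grad log p(x) = -x.
    start = [2.0 + 0.1 * i for i in range(20)]
    final = run_svgd(start, lambda x: -x)
    print(sum(final) / len(final), min(final), max(final))
```

The attraction term pulls particles toward high-density regions of the target, while the repulsion term keeps them from collapsing onto a single point; the finite-particle rates in the abstract quantify how well such a particle system approximates the target.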
Citations: 0
Optimal Classification-based Anomaly Detection with Neural Networks: Theory and Practice
Pub Date : 2024-09-13 DOI: arxiv-2409.08521
Tian-Yi Zhou, Matthew Lau, Jizhou Chen, Wenke Lee, Xiaoming Huo
Anomaly detection is an important problem in many application areas, such as network security. Many deep learning methods for unsupervised anomaly detection produce good empirical performance but lack theoretical guarantees. By casting anomaly detection into a binary classification problem, we establish non-asymptotic upper bounds and a convergence rate on the excess risk on rectified linear unit (ReLU) neural networks trained on synthetic anomalies. Our convergence rate on the excess risk matches the minimax optimal rate in the literature. Furthermore, we provide lower and upper bounds on the number of synthetic anomalies that can attain this optimality. For practical implementation, we relax some conditions to improve the search for the empirical risk minimizer, which leads to performance competitive with other classification-based methods for anomaly detection. Overall, our work provides the first theoretical guarantees of unsupervised neural network-based anomaly detectors and empirical insights on how to design them well.
Citations: 0
On the maximal correlation coefficient for the bivariate Marshall Olkin distribution
Pub Date : 2024-09-13 DOI: arxiv-2409.08661
Axel Bücher, Torben Staud
We prove a formula for the maximal correlation coefficient of the bivariate Marshall Olkin distribution that was conjectured in Lin, Lai, and Govindaraju (2016, Stat. Methodol., 29:1-9). The formula is applied to obtain a new proof for a variance inequality in extreme value statistics that links the disjoint and the sliding block maxima method.
Citations: 0
On Admissibility in Bipartite Incidence Graph Sampling
Pub Date : 2024-09-12 DOI: arxiv-2409.07970
Pedro García-Segador, Li-Chung Zhang
In bipartite incidence graph sampling, the target study units may be formed as connected population elements, which are distinct from the units of sampling, and there may generally exist more than one way by which a given study unit can be observed via sampling units. This generalizes finite-population element or multistage sampling, where each element can only be sampled directly or via a single primary sampling unit. We study the admissibility of estimators in bipartite incidence graph sampling and identify admissible estimators other than the classic Horvitz-Thompson estimator. Our admissibility results encompass those for finite-population sampling.
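The classic Horvitz-Thompson estimator mentioned above can be sketched for ordinary finite-population sampling as follows. The population values and the simple random sampling design are hypothetical choices for illustration; the bipartite incidence graph setting of the paper, where a study unit can be reached through several sampling units, is not modelled here.

```python
from itertools import combinations

def horvitz_thompson_total(sample, values, inclusion_prob):
    """Horvitz-Thompson estimate of the population total.

    Each sampled value is weighted by the inverse of its
    inclusion probability.
    """
    return sum(values[i] / inclusion_prob[i] for i in sample)

if __name__ == "__main__":
    # Hypothetical population of four elements.
    values = {"a": 3.0, "b": 5.0, "c": 8.0, "d": 4.0}
    # Simple random sampling without replacement, size 2 of 4:
    # every element has inclusion probability 2/4.
    inclusion_prob = {k: 0.5 for k in values}
    samples = list(combinations(values, 2))
    estimates = [horvitz_thompson_total(s, values, inclusion_prob) for s in samples]
    # Averaging over all equally likely samples recovers the true total,
    # which is the design-unbiasedness of the estimator.
    print(sum(estimates) / len(estimates), sum(values.values()))
```

Enumerating every possible sample is only feasible for tiny populations, but it makes the unbiasedness property visible without any simulation noise.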
Citations: 0
Generalized Independence Test for Modern Data
Pub Date : 2024-09-12 DOI: arxiv-2409.07745
Mingshuo Liu, Doudou Zhou, Hao Chen
The test of independence is a crucial component of modern data analysis. However, traditional methods often struggle with the complex dependency structures found in high-dimensional data. To overcome this challenge, we introduce a novel test statistic that captures intricate relationships using similarity and dissimilarity information derived from the data. The statistic exhibits strong power across a broad range of alternatives for high-dimensional data, as demonstrated in extensive simulation studies. Under mild conditions, we show that the new test statistic converges to the $\chi^2_4$ distribution under the permutation null distribution, ensuring straightforward type I error control. Furthermore, our research advances the moment method in proving the joint asymptotic normality of multiple double-indexed permutation statistics. We showcase the practical utility of this new test with an application to the Genotype-Tissue Expression dataset, where it effectively measures associations between human tissues.
Citations: 0
Quickest Change Detection Using Mismatched CUSUM
Pub Date : 2024-09-12 DOI: arxiv-2409.07948
Austin Cooper, Sean Meyn
The field of quickest change detection (QCD) concerns design and analysis of algorithms to estimate in real time the time at which an important event takes place and identify properties of the post-change behavior. The goal is to devise a stopping time adapted to the observations that minimizes an $L_1$ loss. Approximately optimal solutions are well known under a variety of assumptions. In the work surveyed here we consider the CUSUM statistic, which is defined as a one-dimensional reflected random walk driven by a functional of the observations. It is known that the optimal functional is a log likelihood ratio subject to special statistical assumptions. The paper concerns model free approaches to detection design, considering the following questions:
1. What is the performance for a given functional of the observations?
2. How do the conclusions change when there is dependency between pre- and post-change behavior?
3. How can techniques from statistics and machine learning be adapted to approximate the best functional in a given class?
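The CUSUM statistic described above, a reflected random walk driven by a functional of the observations, can be sketched as follows. The Gaussian mean-shift log likelihood ratio, the threshold, and the toy data are assumptions chosen for illustration; passing any other functional to `cusum_stopping_time` gives a "mismatched" detector in the sense of the title. Both function names are hypothetical.

```python
def cusum_stopping_time(observations, functional, threshold):
    """Return the first index at which the CUSUM statistic reaches the threshold.

    The statistic is the reflected random walk
        X_{n+1} = max(0, X_n + F(Y_{n+1})),  X_0 = 0,
    where F is `functional`. Returns None if no alarm is raised.
    """
    x = 0.0
    for n, y in enumerate(observations):
        x = max(0.0, x + functional(y))
        if x >= threshold:
            return n
    return None

def gaussian_llr(y, mu0=0.0, mu1=1.0, sigma=1.0):
    """Log likelihood ratio of N(mu1, sigma^2) against N(mu0, sigma^2)."""
    return ((mu1 - mu0) * y - 0.5 * (mu1 ** 2 - mu0 ** 2)) / sigma ** 2

if __name__ == "__main__":
    # Noise-free toy data: the mean shifts from 0 to 1 at index 50.
    data = [0.0] * 50 + [1.0] * 50
    print(cusum_stopping_time(data, gaussian_llr, threshold=3.0))
```

Raising the threshold trades a longer detection delay for fewer false alarms, which is the basic tension that the stopping-time design has to balance.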
Citations: 0
Foundation of Calculating Normalized Maximum Likelihood for Continuous Probability Models
Pub Date : 2024-09-12 DOI: arxiv-2409.08387
Atsushi Suzuki, Kota Fukuzawa, Kenji Yamanishi
The normalized maximum likelihood (NML) code length is widely used as a model selection criterion based on the minimum description length principle, where the model with the shortest NML code length is selected. A common method to calculate the NML code length is to use the sum (for a discrete model) or integral (for a continuous model) of a function defined by the distribution of the maximum likelihood estimator. While this method has been proven to correctly calculate the NML code length of discrete models, no proof has been provided for continuous cases. Consequently, it has remained unclear whether the method can accurately calculate the NML code length of continuous models. In this paper, we solve this problem affirmatively, proving that the method is also correct for continuous cases. Remarkably, completing the proof for continuous cases is non-trivial in that it cannot be achieved by merely replacing the sums in discrete cases with integrals, as the decomposition trick applied to sums in the discrete model case proof is not applicable to integrals in the continuous model case proof. To overcome this, we introduce a novel decomposition approach based on the coarea formula from geometric measure theory, which is essential to establishing our proof for continuous cases.
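The discrete case of the method described above can be illustrated with the Bernoulli model, where grouping sequences by their maximum likelihood estimate reduces the sum over all binary sequences to a sum over counts. This is a sketch of the well-known discrete computation only; it does not touch the continuous case that the paper addresses, and the function names are hypothetical.

```python
import math

def bernoulli_nml_normalizer(n):
    """Sum of maximized likelihoods over all binary sequences of length n >= 1.

    Sequences are grouped by their maximum likelihood estimate k/n,
    so the sum over 2**n sequences reduces to a sum over k = 0..n.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    total = 0.0
    for k in range(n + 1):
        p = k / n
        # Python evaluates 0.0 ** 0 as 1.0, which is the convention needed here.
        max_lik = (p ** k) * ((1.0 - p) ** (n - k))
        total += math.comb(n, k) * max_lik
    return total

def bernoulli_nml_code_length(sequence):
    """NML code length (in nats) of a binary sequence under the Bernoulli model."""
    n = len(sequence)
    k = sum(sequence)
    p = k / n
    max_lik = (p ** k) * ((1.0 - p) ** (n - k))
    return -math.log(max_lik) + math.log(bernoulli_nml_normalizer(n))

if __name__ == "__main__":
    print(bernoulli_nml_normalizer(10))
    print(bernoulli_nml_code_length([1, 0, 1, 1, 0, 1, 1, 1, 0, 1]))
```

The code length is the negative log of the maximized likelihood plus the log of the normalizer, so the normalizer acts as a complexity penalty that depends only on the model and the sample size.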
Citations: 0
Ratio Divergence Learning Using Target Energy in Restricted Boltzmann Machines: Beyond Kullback-Leibler Divergence Learning
Pub Date : 2024-09-12 DOI: arxiv-2409.07679
Yuichi Ishida, Yuma Ichikawa, Aki Dote, Toshiyuki Miyazawa, Koji Hukushima
We propose ratio divergence (RD) learning for discrete energy-based models, a method that utilizes both training data and a tractable target energy function. We apply RD learning to restricted Boltzmann machines (RBMs), which are a minimal model that satisfies the universal approximation theorem for discrete distributions. RD learning combines the strength of both forward and reverse Kullback-Leibler divergence (KLD) learning, effectively addressing the "notorious" issues of underfitting with the forward KLD and mode-collapse with the reverse KLD. Since the summation of forward and reverse KLD seems to be sufficient to combine the strength of both approaches, we include this learning method as a direct baseline in numerical experiments to evaluate its effectiveness. Numerical experiments demonstrate that RD learning significantly outperforms other learning methods in terms of energy function fitting, mode-covering, and learning stability across various discrete energy-based models. Moreover, the performance gaps between RD learning and the other learning methods become more pronounced as the dimensions of target models increase.
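The contrast between forward and reverse KLD that motivates this work can be seen on a small discrete example. The sketch below compares two hypothetical candidate models against a hypothetical bimodal target; it does not implement RD learning, whose objective is not given in the abstract, and all distributions are made up for illustration.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as equal-length lists.

    Terms with p_i == 0 contribute nothing; q_i must be positive
    wherever p_i is positive.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0.0)

if __name__ == "__main__":
    # Hypothetical bimodal target over four states.
    target = [0.49, 0.01, 0.01, 0.49]
    # One candidate spreads mass over all states; the other sits on one mode.
    spread = [0.25, 0.25, 0.25, 0.25]
    one_mode = [0.97, 0.01, 0.01, 0.01]
    for name, model in (("spread", spread), ("one_mode", one_mode)):
        forward = kl_divergence(target, model)  # KL(target || model)
        reverse = kl_divergence(model, target)  # KL(model || target)
        print(name, round(forward, 3), round(reverse, 3))
```

On this example the forward KLD is smaller for the spread-out candidate while the reverse KLD is smaller for the single-mode candidate, which is the mode-covering versus mode-seeking behaviour that the abstract refers to.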
Citations: 0