
Latest articles from Information and Inference-A Journal of the Ima

Generalization error bounds for iterative recovery algorithms unfolded as neural networks
Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad023
Ekkehard Schnoor, Arash Behboodi, Holger Rauhut
Abstract: Motivated by the learned iterative soft thresholding algorithm (LISTA), we introduce a general class of neural networks suitable for sparse reconstruction from few linear measurements. By allowing a wide range of degrees of weight-sharing between the layers, we enable a unified analysis for very different neural network types, ranging from recurrent ones to networks more similar to standard feedforward neural networks. Based on training samples, via empirical risk minimization, we aim at learning the optimal network parameters and thereby the optimal network that reconstructs signals from their low-dimensional linear measurements. We derive generalization bounds by analyzing the Rademacher complexity of hypothesis classes consisting of such deep networks, which also take into account the thresholding parameters. We obtain estimates of the sample complexity that essentially depend only linearly on the number of parameters and on the depth. We apply our main result to obtain specific generalization bounds for several practical examples, including different algorithms for (implicit) dictionary learning, and convolutional neural networks.
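As a quick illustration of the kind of network the abstract describes, the sketch below unfolds a few iterations of ISTA into layers with a tied weight pair and per-layer soft-thresholding parameters; in LISTA these weights and thresholds would be learned from training samples rather than fixed as here. The step size, thresholds and toy problem are illustrative assumptions, not values from the paper.

```python
# A minimal sketch (not the authors' code) of an unfolded iterative
# soft-thresholding network in the spirit of LISTA, using NumPy.
import numpy as np

def soft_threshold(z, theta):
    """Elementwise soft-thresholding operator S_theta(z)."""
    return np.sign(z) * np.maximum(np.abs(z) - theta, 0.0)

def unfolded_ista(y, A, thetas, step=0.1):
    """Run len(thetas) unfolded ISTA layers with tied weights.

    y      : (m,) measurement vector
    A      : (m, n) measurement (dictionary) matrix
    thetas : per-layer thresholds (the learnable scalars in LISTA)
    """
    n = A.shape[1]
    W1 = step * A.T                   # tied 'input' weight, shared across layers
    W2 = np.eye(n) - step * A.T @ A   # tied 'recurrent' weight
    x = np.zeros(n)
    for theta in thetas:              # each pass is one unfolded layer
        x = soft_threshold(W1 @ y + W2 @ x, theta)
    return x

# Toy usage: recover a 3-sparse vector from 40 random measurements.
rng = np.random.default_rng(0)
A = rng.normal(size=(40, 100)) / np.sqrt(40)
x_true = np.zeros(100); x_true[[5, 17, 60]] = [1.0, -2.0, 1.5]
y = A @ x_true
x_hat = unfolded_ista(y, A, thetas=[0.05] * 30)
```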
Citations: 0
Separation-free super-resolution from compressed measurements is possible: an orthonormal atomic norm minimization approach
Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad033
Jirong Yi, Soura Dasgupta, Jian-Feng Cai, Mathews Jacob, Jingchao Gao, Myung Cho, Weiyu Xu
Abstract: We consider the problem of recovering the superposition of $R$ distinct complex exponential functions from compressed non-uniform time-domain samples. Total variation (TV) minimization or atomic norm minimization was proposed in the literature to recover the $R$ frequencies or the missing data. However, it is known that in order for TV minimization and atomic norm minimization to recover the missing data or the frequencies, the underlying $R$ frequencies are required to be well separated, even when the measurements are noiseless. This paper shows that the Hankel matrix recovery approach can super-resolve the $R$ complex exponentials and their frequencies from compressed non-uniform measurements, regardless of how close their frequencies are to each other. We propose a new concept of orthonormal atomic norm minimization (OANM) and demonstrate that the success of Hankel matrix recovery in separation-free super-resolution comes from the fact that the nuclear norm of a Hankel matrix is an orthonormal atomic norm. More specifically, we show that, in traditional atomic norm minimization, the underlying parameter values must be well separated to achieve successful signal recovery if the atoms change continuously with respect to the continuously valued parameter. In contrast, OANM can succeed even when the original atoms are arbitrarily close. As a byproduct of this research, we provide a matrix-theoretic inequality for the nuclear norm and give its proof using the theory of compressed sensing.
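The sketch below illustrates the central object behind Hankel matrix recovery: time-domain samples of a superposition of $R$ complex exponentials fill a Hankel matrix whose rank equals $R$, even when the frequencies are nearly identical. It does not implement the paper's recovery algorithm or the OANM framework; the signal parameters are illustrative assumptions.

```python
# A minimal sketch: samples of a sum of R complex exponentials form a
# Hankel matrix of rank R, regardless of how close the frequencies are.
import numpy as np

def hankel_from_samples(x, pencil):
    """Build the (pencil x (len(x)-pencil+1)) Hankel matrix of a signal."""
    n = len(x)
    return np.array([x[i:i + n - pencil + 1] for i in range(pencil)])

n, R = 64, 2
freqs = np.array([0.200, 0.201])   # two nearly identical frequencies
amps = np.array([1.0, -0.8])
t = np.arange(n)
x = sum(a * np.exp(2j * np.pi * f * t) for a, f in zip(amps, freqs))

H = hankel_from_samples(x, pencil=32)
print(np.linalg.matrix_rank(H))    # numerical rank 2 = number of exponentials
```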
Citations: 0
Minimax detection of localized signals in statistical inverse problems
Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-04-27 | DOI: 10.1093/imaiai/iaad026
Markus Pohlmann, Frank Werner, Axel Munk
Abstract: We investigate minimax testing for detecting local signals or linear combinations of such signals when only indirect data are available. Naturally, in the presence of noise, signals that are too small cannot be reliably detected. In a Gaussian white noise model, we discuss upper and lower bounds for the minimal size of the signal such that testing with small error probabilities is possible. In certain situations we are able to characterize the asymptotic minimax detection boundary. Our results are applied to inverse problems such as numerical differentiation, deconvolution and the inversion of the Radon transform.
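For intuition about why signals that are too small cannot be reliably detected, the sketch below simulates a simple likelihood-ratio (matched-filter) test for a localized bump observed in discretized Gaussian white noise with direct, not indirect, observations; it only illustrates the detection-boundary phenomenon, not the paper's inverse-problem setting, and all parameters are assumptions.

```python
# A minimal sketch: the power of a level-0.05 matched-filter test collapses
# once the signal norm is small relative to the noise level.
import numpy as np

rng = np.random.default_rng(6)
n, sigma = 512, 1.0
shape = np.zeros(n); shape[200:220] = 1.0   # localized bump
shape /= np.linalg.norm(shape)
threshold = sigma * 1.645                   # one-sided level-0.05 threshold

def detection_power(amplitude, trials=5000):
    """Fraction of trials where T = <y, shape> exceeds the threshold."""
    hits = 0
    for _ in range(trials):
        y = amplitude * shape + sigma * rng.normal(size=n)
        hits += (y @ shape) > threshold
    return hits / trials

for a in [0.5, 2.0, 5.0]:
    print(a, detection_power(a))   # power grows with the signal norm
```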
Citations: 0
Sharp, strong and unique minimizers for low complexity robust recovery
Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-04-23 | DOI: 10.1093/imaiai/iaad005
Jalal Fadili, Tran T. A. Nghia, Trinh T. T. Tran
Abstract: In this paper, we show the important roles of sharp minima and strong minima for robust recovery. We also obtain several characterizations of sharp minima for convex regularized optimization problems. Our characterizations are quantitative and verifiable, especially for the case of decomposable norm regularized problems, including sparsity, group-sparsity and low-rank convex problems. For group-sparsity optimization problems, we show that a unique solution is a strong solution and obtain quantitative characterizations of solution uniqueness.
Citations: 0
Non-adaptive algorithms for threshold group testing with consecutive positives
IF 1.6 | Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-04-04 | DOI: 10.1093/imaiai/iaad009
Given up to $d$ positive items in a large population of $n$ items ($d \ll n$), the goal of threshold group testing is to efficiently identify the positives via tests, where a test on a subset of items is positive if the subset contains at least $u$ positive items, negative if it contains up to $\ell$ positive items and arbitrary (either positive or negative) otherwise. The parameter $g = u - \ell - 1$ is called the gap. In non-adaptive strategies, all tests are fixed in advance and can be represented as a measurement matrix, in which each row and column represent a test and an item, respectively. In this paper, we consider non-adaptive threshold group testing with consecutive positives, in which the items are linearly ordered and the positives are consecutive in that order. We show that by designing deterministic and strongly explicit measurement matrices, $\lceil \log_{2}{\lceil \frac{n}{d} \rceil} \rceil + 2d + 3$ (respectively, $\lceil \log_{2}{\lceil \frac{n}{d} \rceil} \rceil + 3d$) tests suffice to identify the positives in $O\left(\log_{2}{\frac{n}{d}} + d\right)$ time when $g = 0$ (respectively, $g > 0$). The results significantly improve the state-of-the-art scheme that needs $15 \lceil \log_{2}{\lceil \frac{n}{d} \rceil} \rceil + 4d + 71$ tests to identify the positives in $O\left(\frac{n}{d} \log_{2}{\frac{n}{d}} + ud^{2}\right)$ time, and whose associated measurement matrices are random and (non-strongly) explicit.
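The sketch below only simulates the threshold test outcome model defined in the abstract (positive with at least $u$ positives in the pool, negative with at most $\ell$, arbitrary in the gap); it does not construct the deterministic measurement matrices or the decoding algorithm from the paper, and the pool and parameter choices are illustrative assumptions.

```python
# A minimal sketch of one pooled test under the threshold group testing model
# with consecutive positives; the gap region returns an arbitrary outcome.
import numpy as np

def threshold_test(pool, positives, u, l, rng):
    """Return the outcome of one pooled test under the threshold model."""
    k = len(pool & positives)
    if k >= u:
        return 1
    if k <= l:
        return 0
    return int(rng.integers(0, 2))   # gap region: arbitrary outcome

rng = np.random.default_rng(1)
n, d, u, l = 100, 5, 3, 1
positives = set(range(40, 40 + d))   # d consecutive positive items
pool = set(range(38, 48))            # one pooled test over 10 consecutive items
print(threshold_test(pool, positives, u, l, rng))   # 1: pool holds all 5 positives
```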
Citations: 1
Theoretical analysis and computation of the sample Fréchet mean of sets of large graphs for various metrics
IF 1.6 | Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-03-28 | DOI: 10.1093/imaiai/iaad002
Daniel Ferguson, F. G. Meyer
To characterize the location (mean, median) of a set of graphs, one needs a notion of centrality that has been adapted to metric spaces. A standard approach is to consider the Fréchet mean. In practice, computing the Fréchet mean for sets of large graphs presents many computational issues. In this work, we suggest a metric-independent method that may be used to compute the Fréchet mean for sets of graphs. We show that the technique proposed can be used to determine the Fréchet mean when considering the Hamming distance or a distance defined by the difference between the spectra of the adjacency matrices of the graphs.
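To make the objective concrete, the sketch below evaluates the sample Fréchet function under the adjacency-spectra distance mentioned above and restricts the minimizer to the sample itself (a medoid). This is not the paper's method, only an illustration of the quantity being minimized; the graphs and distance implementation are assumptions.

```python
# A minimal sketch of the sample Frechet objective for a set of graphs under
# the adjacency spectral distance, approximated by a medoid over the sample.
import numpy as np

def spectral_distance(A, B):
    """l2 distance between sorted adjacency spectra of two graphs."""
    return np.linalg.norm(np.sort(np.linalg.eigvalsh(A)) -
                          np.sort(np.linalg.eigvalsh(B)))

def frechet_medoid(adjacencies):
    """Index of the sample graph minimizing the sum of squared spectral distances."""
    costs = [sum(spectral_distance(A, B) ** 2 for B in adjacencies)
             for A in adjacencies]
    return int(np.argmin(costs))

# Toy usage: three Erdos-Renyi graphs on 20 nodes.
rng = np.random.default_rng(2)
graphs = []
for _ in range(3):
    U = (rng.random((20, 20)) < 0.3).astype(float)
    A = np.triu(U, 1)
    graphs.append(A + A.T)
print(frechet_medoid(graphs))
```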
Citations: 0
Local Viterbi property in decoding
IF 1.6 | Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-03-20 | DOI: 10.1093/imaiai/iaad004
J. Lember
The article studies the decoding problem (also known as the classification or the segmentation problem) with pairwise Markov models (PMMs). A PMM is a process where the observation process and the underlying state sequence form a two-dimensional Markov chain, a natural generalization of the hidden Markov model. The standard solutions to the decoding problem are the so-called Viterbi path—a sequence with maximum state path probability given the observations—or the pointwise maximum a posteriori (PMAP) path that maximizes the expected number of correctly classified entries. When the goal is to simultaneously maximize both criteria—conditional probability (corresponding to the Viterbi path) and pointwise conditional probability (corresponding to the PMAP path)—then they are combined into one single criterion via the regularization parameter $C$. The main objective of the article is to study the behaviour of the solution—called the hybrid path—as $C$ grows. Increasing $C$ increases the conditional probability of the hybrid path, and when $C$ is big enough, every hybrid path is a Viterbi path. We show that hybrid paths also approach the Viterbi path locally: we define $m$-locally Viterbi paths and show that the hybrid path is $m$-locally Viterbi whenever $C$ is big enough. This all might lead to the impression that when $C$ is relatively big, any hybrid path that is not yet Viterbi differs from the Viterbi path by only a few single entries. We argue that this intuition is wrong, because when unique and $m$-locally Viterbi, different hybrid paths differ by at least $m$ entries. Thus, when $C$ increases, the different hybrid paths tend to differ from each other by larger and larger intervals. Hence the hybrid paths might offer a variety of rather different solutions to the decoding problem.
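For readers unfamiliar with the terminology, the sketch below is the standard Viterbi recursion for a plain hidden Markov model, included only to fix what "Viterbi path" means; the article works with the more general pairwise Markov models and with the hybrid Viterbi/PMAP criterion parameterized by $C$, neither of which this sketch implements. The toy parameters are assumptions.

```python
# A minimal sketch of the Viterbi algorithm for an HMM in the log domain.
import numpy as np

def viterbi(log_pi, log_A, log_B, obs):
    """Most probable state path argmax_x p(x, y) in an HMM."""
    T, K = len(obs), len(log_pi)
    delta = np.empty((T, K)); psi = np.zeros((T, K), dtype=int)
    delta[0] = log_pi + log_B[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + log_A   # scores[i, j]: from state i to j
        psi[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + log_B[:, obs[t]]
    path = [int(delta[-1].argmax())]
    for t in range(T - 1, 0, -1):                # backtrack along the argmax pointers
        path.append(int(psi[t][path[-1]]))
    return path[::-1]

# Toy usage with illustrative 2-state parameters.
log_pi = np.log([0.6, 0.4])
log_A = np.log([[0.7, 0.3], [0.2, 0.8]])
log_B = np.log([[0.9, 0.1], [0.3, 0.7]])
print(viterbi(log_pi, log_A, log_B, obs=[0, 0, 1, 1, 1]))
```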
Citations: 1
Minimum probability of error of list M-ary hypothesis testing
IF 1.6 | Mathematics (CAS Tier 4, Q1) | Pub Date: 2023-02-27 | DOI: 10.1093/imaiai/iaad001
Ehsan Asadi Kangarshahi, A. Guillén i Fàbregas
We study a variation of Bayesian $M$-ary hypothesis testing in which the test outputs a list of $L$ candidates out of the $M$ possible upon processing the observation. We study the minimum error probability of list hypothesis testing, where an error is defined as the event where the true hypothesis is not in the list output by the test. We derive two exact expressions of the minimum probability of error. The first is expressed as the error probability of a certain non-Bayesian binary hypothesis test and is reminiscent of the meta-converse bound by Polyanskiy, Poor and Verdú (2010). The second is expressed as the tail probability of the likelihood ratio between the two distributions involved in the aforementioned non-Bayesian binary hypothesis test. Keywords: hypothesis testing, error probability, information theory.
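The sketch below Monte-Carlo estimates the error probability of the Bayes-optimal list decoder, which outputs the $L$ hypotheses with the largest posteriors and errs when the true hypothesis is missing from the list; this matches the operational definition in the abstract, but the Gaussian observation model and all parameters are illustrative assumptions rather than anything from the paper.

```python
# A minimal sketch of list M-ary hypothesis testing under a toy Gaussian model.
import numpy as np

rng = np.random.default_rng(3)
M, L, sigma, trials = 8, 2, 1.0, 20000
means = np.linspace(-2.0, 2.0, M)        # hypothesis m: Y ~ N(means[m], sigma^2)
prior = np.full(M, 1.0 / M)

errors = 0
for _ in range(trials):
    m_true = rng.integers(M)
    y = rng.normal(means[m_true], sigma)
    log_post = np.log(prior) - 0.5 * ((y - means) / sigma) ** 2   # up to a constant
    top_L = np.argsort(log_post)[-L:]    # optimal list: L largest posteriors
    errors += int(m_true not in top_L)

print("estimated minimum list error probability:", errors / trials)
```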
Citations: 0
A unifying view of modal clustering
IF 1.6 | Mathematics (CAS Tier 4, Q1) | Pub Date: 2022-08-01 | DOI: 10.1093/imaiai/iaac030
Ery Arias-Castro, Wanli Qiao
Two important non-parametric approaches to clustering emerged in the 1970s: clustering by level sets or cluster tree as proposed by Hartigan, and clustering by gradient lines or gradient flow as proposed by Fukunaga and Hostetler. In a recent paper, we draw a connection between these two approaches, in particular, by showing that the gradient flow provides a way to move along the cluster tree. Here, we argue the case that these two approaches are fundamentally the same. We do so by proposing two ways of obtaining a partition from the cluster tree—each one of them very natural in its own right—and showing that both of them reduce to the partition given by the gradient flow under standard assumptions on the sampling density.
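As a concrete instance of clustering by gradient flow, the sketch below runs Gaussian-kernel mean shift, which moves every sample uphill along the gradient flow of a kernel density estimate and groups samples that reach the same mode; bandwidth, iteration count and data are illustrative assumptions, and the cluster-tree side of the story is not implemented.

```python
# A minimal sketch of modal clustering via the KDE gradient flow (mean shift).
import numpy as np

def mean_shift(X, h=0.5, steps=200):
    """Return the mode each sample converges to under the KDE gradient flow."""
    modes = X.copy()
    for _ in range(steps):
        # Gaussian-kernel weights of every data point for every current iterate
        d2 = ((modes[:, None, :] - X[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * h ** 2))
        modes = (w[:, :, None] * X[None, :, :]).sum(1) / w.sum(1, keepdims=True)
    return modes

rng = np.random.default_rng(4)
X = np.vstack([rng.normal(0, 0.3, (50, 2)), rng.normal(3, 0.3, (50, 2))])
modes = mean_shift(X)
labels = np.round(modes, 1)               # samples sharing a rounded mode = one cluster
print(len(np.unique(labels, axis=0)))     # typically 2 clusters for this toy data
```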
Citations: 0
On sharp stochastic zeroth-order Hessian estimators over Riemannian manifolds
IF 1.6 | Mathematics (CAS Tier 4, Q1) | Pub Date: 2022-08-01 | DOI: 10.1093/imaiai/iaac027
Tianyu Wang
We study Hessian estimators for functions defined over an $n$-dimensional complete analytic Riemannian manifold. We introduce new stochastic zeroth-order Hessian estimators using $O(1)$ function evaluations. We show that, for an analytic real-valued function $f$, our estimator achieves a bias bound of order $O(\gamma \delta^2)$, where $\gamma$ depends on both the Levi–Civita connection and function $f$, and $\delta$ is the finite difference step size. To the best of our knowledge, our results provide the first bias bound for Hessian estimators that explicitly depends on the geometry of the underlying Riemannian manifold. We also study downstream computations based on our Hessian estimators. The supremacy of our method is evidenced by empirical evaluations.
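The sketch below shows a Euclidean version of the finite-difference idea: with two random directions and four function evaluations it produces a matrix whose expectation approximates the Hessian. The paper's estimators are defined over Riemannian manifolds and come with the bias bound quoted above; this Euclidean sketch, its scaling and the test function are assumptions made only for illustration.

```python
# A minimal Euclidean sketch of a stochastic zeroth-order Hessian estimator
# using O(1) function evaluations per sample.
import numpy as np

rng = np.random.default_rng(5)

def zo_hessian_estimate(f, x, delta=1e-3):
    """One-sample estimator whose mean approximates the Hessian of f at x."""
    n = len(x)
    u = rng.normal(size=n); u /= np.linalg.norm(u)   # uniform direction on the sphere
    v = rng.normal(size=n); v /= np.linalg.norm(v)
    # Second-order finite difference approximating u^T (Hess f(x)) v
    quad = (f(x + delta * u + delta * v) - f(x + delta * u)
            - f(x + delta * v) + f(x)) / delta ** 2
    # E[u u^T] = I/n for spherical u, so the n^2 factor removes the direction averaging
    return n ** 2 * quad * np.outer(u, v)

# Toy usage: average many estimates for a quadratic with known Hessian 2*I.
f = lambda z: float(z @ z)
x0 = np.ones(4)
H = np.mean([zo_hessian_estimate(f, x0) for _ in range(20000)], axis=0)
print(np.round(H, 1))   # approximately 2 * I
```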
Citations: 0