
2019 57th Annual Allerton Conference on Communication, Control, and Computing (Allerton): Latest Articles

A Sequential Detection Theory for Statistically Periodic Random Processes
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919699
T. Banerjee, Prudhvi K. Gurram, Gene T. Whipps
Periodic statistical behavior of data is observed in many practical problems encountered in cyber-physical systems and biology. A new class of stochastic processes called independent and periodically identically distributed (i.p.i.d.) processes is defined to model such data. An optimal stopping theory is developed to solve sequential detection problems for i.p.i.d. processes. The developed theory is then applied to detect a change in the distribution of an i.p.i.d. process. It is shown that the optimal change detection algorithm is a stopping rule based on a periodic sequence of thresholds. Numerical results are provided to demonstrate that a single-threshold policy is not strictly optimal.
Citations: 2
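As an illustration of the kind of stopping rule the abstract above describes, here is a minimal sketch of change detection with a periodic sequence of thresholds. It assumes a geometric prior on the change point with parameter `rho`, a Shiryaev-type posterior recursion, and unit-variance Gaussian observations; none of these are stated in the abstract, and the names `gaussian_lr` and `periodic_threshold_stop` are hypothetical.

```python
import math


def gaussian_lr(mu0, mu1):
    """Likelihood ratio g(x)/f(x) for unit-variance Gaussians.

    mu0 is the pre-change mean, mu1 the post-change mean (illustrative choice).
    """
    return lambda x: math.exp((mu1 - mu0) * x - (mu1 ** 2 - mu0 ** 2) / 2.0)


def periodic_threshold_stop(observations, lr_by_phase, thresholds, rho):
    """Return the first time n at which the posterior change probability
    reaches the threshold assigned to the current phase, or None.

    lr_by_phase[k] is the likelihood ratio used at phase k of the period,
    thresholds[k] the matching threshold, and rho the parameter of the
    assumed geometric prior on the change point.
    """
    period = len(thresholds)
    p = 0.0  # posterior probability that the change has already occurred
    for n, x in enumerate(observations, start=1):
        phase = (n - 1) % period
        # Prior update: the change may occur at this step with probability rho.
        prior = p + (1.0 - p) * rho
        # Bayes update with the phase-dependent likelihood ratio.
        num = prior * lr_by_phase[phase](x)
        p = num / (num + (1.0 - prior))
        if p >= thresholds[phase]:
            return n
    return None


# Period-2 example: the two phases have different post-change means.
lrs = [gaussian_lr(0.0, 1.0), gaussian_lr(0.0, 2.0)]
stop_time = periodic_threshold_stop([0.1, 1.9, 1.2, 2.3, 0.8, 2.1], lrs, [0.95, 0.9], 0.05)
```

Setting all entries of `thresholds` to the same value gives a single-threshold policy, which the abstract reports is not strictly optimal.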
Acquisition Games with Partial-Asymmetric Information
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919935
V. Kavitha, M. Maheshwari, E. Altman
We consider an example of stochastic games with partial, asymmetric and non-classical information. We obtain relevant equilibrium policies using a new approach which allows managing the belief updates in a structured manner. Agents have access only to partial information updates, and our approach is to consider optimal open-loop control until the next information update. The agents continuously control the rates of their Poisson search clocks to acquire the locks; the agent that acquires all the locks before the others receives a reward of one. However, the agents have no information about the acquisition status of the others and incur a cost proportional to their rate process. We solved the problem for the case of two agents and two locks and conjectured the results for N agents. We showed that a pair of (partial) state-dependent time-threshold policies forms a Nash equilibrium.
Citations: 1
On Matrix Momentum Stochastic Approximation and Applications to Q-learning
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919828
Adithya M. Devraj, A. Bušić, Sean P. Meyn
Stochastic approximation (SA) algorithms are recursive techniques used to obtain the roots of functions that can be expressed as expectations of a noisy parameterized family of functions. In this paper two new SA algorithms are introduced: 1) PolSA, an extension of Polyak’s momentum technique with a specially designed matrix momentum, and 2) NeSA, which can be regarded either as a variant of Nesterov’s acceleration method or as a simplification of PolSA. The rates of convergence of SA algorithms are well understood. Under special conditions, the mean square error of the parameter estimates is bounded by $\sigma^{2}/n+o(1/n)$, where $\sigma^{2} \geq 0$ is an identifiable constant. If these conditions fail, the rate is typically sub-linear. There are two well-known SA algorithms that ensure a linear rate, with minimal value of variance, $\sigma^{2}$: the Ruppert-Polyak averaging technique, and the stochastic Newton-Raphson (SNR) algorithm. It is demonstrated here that under mild technical assumptions, the PolSA algorithm also achieves this optimality criterion. This result is established via novel coupling arguments: it is shown that the parameter estimates obtained from the PolSA algorithm couple with those of the optimal-variance (but computationally more expensive) SNR algorithm, at a rate $O(1/n^{2})$. The newly proposed algorithms are extended to a reinforcement learning setting to obtain new Q-learning algorithms, and numerical results confirm the coupling of PolSA and SNR.
Citations: 11
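To make the baseline in the abstract above concrete, the following is a minimal sketch of a scalar stochastic approximation recursion combined with Ruppert-Polyak averaging. It is not PolSA or NeSA; the step-size exponent `rho`, the toy root-finding problem, and the function name `sa_with_averaging` are all illustrative assumptions.

```python
import random


def sa_with_averaging(noisy_f, theta0, n_iter, rho=0.7):
    """Robbins-Monro recursion theta_{n} = theta_{n-1} + a_n * f(theta_{n-1}, W_n)
    with step size a_n = n^{-rho}, plus the running (Ruppert-Polyak) average
    of the iterates.

    noisy_f(theta) returns one noisy evaluation of the function whose root
    is sought. Returns (last iterate, averaged iterate).
    """
    theta = theta0
    avg = 0.0
    for n in range(1, n_iter + 1):
        step = 1.0 / n ** rho
        theta += step * noisy_f(theta)
        # Incremental mean of theta_1, ..., theta_n.
        avg += (theta - avg) / n
    return theta, avg


# Toy problem (an assumption for illustration): find the root of
# E[(target - theta) + W], with W standard Gaussian noise.
rng = random.Random(0)
target = 2.0


def noisy_f(theta):
    return (target - theta) + rng.gauss(0.0, 1.0)


theta, avg = sa_with_averaging(noisy_f, theta0=0.0, n_iter=20000)
```

The averaged iterate `avg` is typically much closer to the root than the last iterate `theta`, which is the variance reduction that averaging is meant to deliver.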
Iterative Collaborative Filtering for Sparse Noisy Tensor Estimation
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919933
D. Shah, C. Yu
We consider the task of tensor estimation, i.e. estimating a low-rank 3-order $n \times n \times n$ tensor from noisy observations of randomly chosen entries in the sparse regime. In the context of matrix (2-order tensor) estimation, a variety of algorithms have been proposed and analyzed in the literature, including the popular collaborative filtering algorithm that is extremely well utilized in practice. However, in the context of tensor estimation, there is limited progress. No natural extensions of collaborative filtering are known beyond “flattening” the tensor into a matrix and applying standard collaborative filtering. As the main contribution of this work, we introduce a generalization of the collaborative filtering algorithm for the setting of tensor estimation and argue that it achieves sample complexity that (nearly) matches the conjectured lower bound on the sample complexity. Interestingly, our generalization uses the matrix obtained from the “flattened” tensor to compute similarity as in classical collaborative filtering, but by defining a novel “graph” using it. The algorithm recovers the tensor with mean-squared-error (MSE) decaying to 0 as long as each entry is observed independently with probability $p = \Omega(n^{-3/2+\epsilon})$ for any arbitrarily small $\epsilon > 0$. It turns out that $p = \Omega(n^{-3/2})$ is the conjectured lower bound as well as the “connectivity threshold” of the graph considered to compute similarity in our algorithm.
Citations: 14
Slepian-Wolf Polar Coding with Unknown Correlation
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919653
Karthik Nagarjuna Tunuguntla, P. Siegel
We consider the source coding problem of a binary discrete memoryless source with correlated side information available only at the receiver whose conditional distribution given the source is unknown to the encoder. We propose two methods based on polar codes to attain the achievable rates under this setting. The first method incorporates a staircase scheme, which has been used for universal polar coding for a compound channel. The second method is based on the technique of universalization using bit-channel combining. We also give a list of pros and cons for the two proposed methods.
Citations: 0
Derandomized Asymmetrical Balanced Allocation
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919887
Dengwang Tang, V. Subramanian
The balls-in-bins model, in which n balls are sequentially placed into n bins according to some dispatching policy, is an important model with a wide range of applications despite its simplicity. The power-of-d choices (Pod) policy, in which each ball samples d independent uniform random bins and joins the one with the least load (where ties are broken arbitrarily), can yield a maximum load of $\frac{\log\log n}{\log d} + \Theta(1)$ with high probability whenever $d \geq 2$. Vöcking later proposed a variant of the power-of-d scheme in which bins are divided into d groups, and d bins are sampled, one from each group. One important feature of this scheme is that ties are broken asymmetrically based on groups. Compared with Pod, this scheme reduces the maximum load to $\frac{\log\log n}{d\log\phi_{d}}+\Theta(1)$, where $1 < \phi_{d} < 2$. Our recent work shows that one can replace independent uniform sampling with random-walk-based sampling while retaining the same performance as Pod in terms of the maximum load of all bins. In this work, we propose multiple derandomized variants of Vöcking’s asymmetrical schemes and we show that they can yield the same performance as the original scheme, i.e. the maximum load is bounded by $\frac{\log\log n}{d\log\phi_{d}}+\Theta(1)$.
Citations: 0
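The two randomized baselines named in the abstract above can be simulated in a few lines. This is a minimal sketch, not the derandomized schemes the paper proposes; it assumes the grouped scheme probes exactly one bin per group and breaks ties toward the lowest-indexed group, and the function names `power_of_d` and `always_go_left` are hypothetical.

```python
import random


def power_of_d(n, d, rng):
    """Place n balls into n bins; each ball probes d independent uniform
    bins and joins the least loaded one."""
    loads = [0] * n
    for _ in range(n):
        probes = [rng.randrange(n) for _ in range(d)]
        # min() keeps the first minimal probe, one arbitrary tie-breaking rule.
        target = min(probes, key=lambda b: loads[b])
        loads[target] += 1
    return loads


def always_go_left(n, d, rng):
    """Grouped variant: bins are split into d equal contiguous groups and
    each ball probes one uniform bin per group. Ties are broken
    asymmetrically, toward the leftmost (lowest-indexed) group."""
    if n % d != 0:
        raise ValueError("n must be divisible by d in this sketch")
    size = n // d
    loads = [0] * n
    for _ in range(n):
        # Probes are listed in group order, so min() favors the leftmost group.
        probes = [g * size + rng.randrange(size) for g in range(d)]
        target = min(probes, key=lambda b: loads[b])
        loads[target] += 1
    return loads


rng = random.Random(2019)
max_pod = max(power_of_d(10000, 2, rng))
max_left = max(always_go_left(10000, 2, rng))
```

Comparing `max_pod` and `max_left` over several seeds and values of n gives an empirical feel for the two maximum-load bounds quoted in the abstract.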
Cubic Regularized ADMM with Convergence to a Local Minimum in Non-convex Optimization
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919772
Zai Shi, A. Eryilmaz
How to escape saddle points is a critical issue in non-convex optimization. Previous methods on this issue mainly assume that the objective function is Hessian-Lipschitz, which leaves a gap for applications using non-Hessian-Lipschitz functions. In this paper, we propose the Cubic Regularized Alternating Direction Method of Multipliers (CR-ADMM) to escape saddle points of separable non-convex functions containing a non-Hessian-Lipschitz component. By carefully choosing a parameter, we prove that CR-ADMM converges to a local minimum of the original function with a rate of $O(1/T^{1/3})$ in time horizon T, which is faster than gradient-based methods. We also show that when one or more steps of CR-ADMM are not solved exactly, CR-ADMM can converge to a neighborhood of the local minimum. Through experiments on matrix factorization problems, CR-ADMM is shown to have a faster rate and a lower optimality gap compared with other gradient-based methods. Our approach can also find applications in other scenarios where regularized non-convex cost minimization is performed, such as parameter optimization of deep neural networks.
Citations: 0
Stability of Wireless Random Access Systems
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919898
Ahmad Alammouri, J. Andrews, F. Baccelli
We characterize the stability, metastability, and the stationary regime of traffic dynamics in a single-cell uplink wireless system. The traffic is represented in terms of spatial birth-death processes, in which users arrive as a Poisson point process in time and space, each with a file to transmit to the base station. The service rate of each user is based on its signal-to-interference-plus-noise ratio, where the interference is from other active users in the cell. Once the file is fully transmitted, the user leaves the cell. We derive the necessary and sufficient condition for network stability, which is independent of the specific path loss function as long as it satisfies mild boundedness conditions. A novel observation, shown through mean-field analysis and simulations, is that for a certain range of arrival rates, the network appears stable for possibly a long time, but can suddenly become unstable. This property is called metastability, which is widely known in statistical physics but rarely observed in wireless communication. Finally, using mean-field analysis, we propose a heuristic characterization of the network steady-state regime when it exists, and demonstrate that it is tight for the whole range of arrival rates.
Citations: 4
On the Asymptotic Sample Complexity of HGR Maximal Correlation Functions in Semi-supervised Learning
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919892
Xiangxiang Xu, Shao-Lun Huang
The Hirschfeld-Gebelein-Rényi (HGR) maximal correlation has been shown to be useful in many machine learning applications, where the alternating conditional expectation (ACE) algorithm is widely adopted to estimate the HGR maximal correlation functions from data samples. In this paper, we consider the asymptotic sample complexity of estimating the HGR maximal correlation functions in semi-supervised learning, where both labeled and unlabeled data samples are used for the estimation. First, we propose a generalized ACE algorithm to deal with the unlabeled data samples. Then, we develop a mathematical framework to characterize the learning errors between the maximal correlation functions computed from the true distribution and the functions estimated from the generalized ACE algorithm. We establish the analytical expressions for the error exponents of the learning errors, which indicate the number of training samples required for estimating the HGR maximal correlation functions by the generalized ACE algorithm. Moreover, with our theoretical results, we investigate the sampling strategy for different types of samples in semi-supervised learning with a total sampling budget constraint, and an optimal sampling strategy is developed to maximize the error exponent of the learning error. Finally, numerical simulations are presented to support our theoretical results.
Citations: 2
Spectral Analysis of the Adjacency Matrix of Random Geometric Graphs
Pub Date : 2019-09-01 DOI: 10.1109/ALLERTON.2019.8919798
Mounia Hamidouche, L. Cottatellucci, Konstantin Avrachenkov
In this article, we analyze the limiting eigenvalue distribution (LED) of random geometric graphs (RGGs). The RGG is constructed by uniformly distributing n nodes on the d-dimensional torus $\Gamma^{d} \equiv [0, 1]^{d}$ and connecting two nodes if their $\ell_{p}$-distance, $p \in [1, \infty]$, is at most $r_{n}$. In particular, we study the LED of the adjacency matrix of RGGs in the connectivity regime, in which the average vertex degree scales as $\log(n)$ or faster, i.e., $\Omega(\log(n))$. In the connectivity regime and under some conditions on the radius $r_{n}$, we show that the LED of the adjacency matrix of RGGs converges to the LED of the adjacency matrix of a deterministic geometric graph (DGG) with nodes in a grid as n goes to infinity. Then, for n finite, we use the structure of the DGG to approximate the eigenvalues of the adjacency matrix of the RGG and provide an upper bound for the approximation error. Index Terms--Random geometric graphs, adjacency matrix, limiting eigenvalue distribution, Lévy distance.
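The construction described in this abstract can be sketched as follows. This is only an illustration of the graph construction (random nodes versus grid nodes on the unit torus, connected when their wrap-around $\ell_{p}$-distance is at most a radius), not the paper's analysis. The function names, the seed, and the parameter values (`side = 12`, `radius = 0.2`, `p = 2`) are assumptions chosen for the example.

```python
import itertools
import math
import random


def torus_distance(x, y, p):
    """l_p distance between points x and y on the unit torus [0, 1]^d."""
    # Per-coordinate wrap-around distance on a circle of circumference 1.
    deltas = [min(abs(a - b), 1.0 - abs(a - b)) for a, b in zip(x, y)]
    if math.isinf(p):
        return max(deltas)
    return sum(delta ** p for delta in deltas) ** (1.0 / p)


def adjacency_matrix(points, radius, p):
    """0/1 adjacency matrix: i ~ j iff i != j and distance(i, j) <= radius."""
    n = len(points)
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if torus_distance(points[i], points[j], p) <= radius:
                adj[i][j] = adj[j][i] = 1
    return adj


def random_points(n, d, seed=0):
    """n points drawn independently and uniformly from [0, 1]^d (RGG nodes)."""
    rng = random.Random(seed)
    return [tuple(rng.random() for _ in range(d)) for _ in range(n)]


def grid_points(side, d):
    """side^d points on a regular grid in [0, 1)^d (DGG nodes)."""
    return [
        tuple(k / side for k in idx)
        for idx in itertools.product(range(side), repeat=d)
    ]


side, d, p = 12, 2, 2
n = side ** d
radius = 0.2
rgg = adjacency_matrix(random_points(n, d), radius, p)
dgg = adjacency_matrix(grid_points(side, d), radius, p)
rgg_degrees = [sum(row) for row in rgg]
dgg_degrees = [sum(row) for row in dgg]
print("RGG mean degree:", sum(rgg_degrees) / n)
print("DGG degree of node 0:", dgg_degrees[0])
```

Because the grid is translation invariant on the torus, every DGG node has the same degree, which is one reason its adjacency matrix is easier to study than that of the RGG. The spectra of `rgg` and `dgg` could then be compared with any symmetric eigenvalue routine.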
Citations: 3