
J. Mach. Learn. Res.: Latest Publications

Low-rank Tensor Learning with Nonconvex Overlapped Nuclear Norm Regularization
Pub Date : 2022-05-06 DOI: 10.48550/arXiv.2205.03059
Quanming Yao, Yaqing Wang, Bo Han, J. Kwok
Nonconvex regularization has been widely used in low-rank matrix learning. However, extending it to low-rank tensor learning is still computationally expensive. To address this problem, we develop an efficient solver for use with a nonconvex extension of the overlapped nuclear norm regularizer. Based on the proximal average algorithm, the proposed algorithm can avoid expensive tensor folding/unfolding operations. A special "sparse plus low-rank" structure is maintained throughout the iterations and allows fast computation of the individual proximal steps. Empirical convergence is further improved with the use of adaptive momentum. We provide convergence guarantees to critical points on smooth losses and also on objectives satisfying the Kurdyka-Łojasiewicz condition. While the optimization problem is nonconvex and nonsmooth, we show that its critical points still have good statistical performance on the tensor completion problem. Experiments on various synthetic and real-world data sets show that the proposed algorithm is efficient in both time and space and more accurate than the existing state of the art.
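The abstract's key ingredients, mode unfolding and a per-mode nuclear-norm proximal map averaged across modes, can be sketched in a few lines of numpy. This is a hedged illustration of the proximal average idea only: all function names are hypothetical, and for simplicity it uses the convex nuclear-norm prox (singular value thresholding), whereas the paper's regularizer is a nonconvex extension.

```python
import numpy as np

def unfold(T, mode):
    """Mode-m matricization of a tensor."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def fold(M, mode, shape):
    """Inverse of unfold."""
    full = [shape[mode]] + [s for i, s in enumerate(shape) if i != mode]
    return np.moveaxis(M.reshape(full), 0, mode)

def svt(M, tau):
    """Singular value thresholding: the proximal map of tau * (nuclear norm)."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt

def prox_average_step(T, tau):
    """Average the per-mode proximal maps (the proximal average idea)."""
    return sum(fold(svt(unfold(T, m), tau), m, T.shape)
               for m in range(T.ndim)) / T.ndim
```

Applying the exact prox of the sum of per-mode norms has no closed form; averaging the individual proximal maps is what makes each step cheap.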
Pages: 136:1-136:60
Citations: 0
Polynomial-Time Algorithms for Counting and Sampling Markov Equivalent DAGs with Applications
Pub Date : 2022-05-05 DOI: 10.48550/arXiv.2205.02654
Marcel Wienöbst, Max Bannach, M. Liskiewicz
Counting and sampling directed acyclic graphs from a Markov equivalence class are fundamental tasks in graphical causal analysis. In this paper we show that these tasks can be performed in polynomial time, solving a long-standing open problem in this area. Our algorithms are effective and easily implementable. As we show in experiments, these breakthroughs make previously infeasible strategies for active learning of causal structures and for causal effect identification with respect to a Markov equivalence class practically applicable.
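For intuition about what is being counted, here is a hedged brute-force baseline (exponential in the number of undirected edges, unlike the paper's polynomial-time algorithms): enumerate orientations of a CPDAG's undirected edges and keep those that are acyclic and create no new v-structures. The chain example X—Y—Z is hypothetical.

```python
from itertools import product

def is_acyclic(nodes, edges):
    """Kahn's algorithm: a digraph is acyclic iff every node can be peeled off."""
    indeg = {v: 0 for v in nodes}
    for _, b in edges:
        indeg[b] += 1
    stack = [v for v in nodes if indeg[v] == 0]
    seen = 0
    while stack:
        v = stack.pop()
        seen += 1
        for a, b in edges:
            if a == v:
                indeg[b] -= 1
                if indeg[b] == 0:
                    stack.append(b)
    return seen == len(nodes)

def v_structures(directed, adj):
    """Colliders a -> b <- c with a and c non-adjacent in the skeleton."""
    return {(a, b, c)
            for a, b in directed for c, d in directed
            if d == b and a < c and (a, c) not in adj}

def count_markov_equivalent_dags(nodes, directed, undirected):
    adj = {(x, y) for x, y in directed} | {(y, x) for x, y in directed} \
        | {(x, y) for x, y in undirected} | {(y, x) for x, y in undirected}
    target = v_structures(directed, adj)  # v-structures are fixed by the CPDAG
    count = 0
    for bits in product([0, 1], repeat=len(undirected)):
        edges = list(directed) + [(u, v) if b == 0 else (v, u)
                                  for (u, v), b in zip(undirected, bits)]
        if is_acyclic(nodes, edges) and v_structures(edges, adj) == target:
            count += 1
    return count
```

For the chain X—Y—Z, three of the four orientations avoid the new collider X→Y←Z, so the class has size 3.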
Pages: 213:1-213:45
Citations: 9
A Simple Approach to Improve Single-Model Deep Uncertainty via Distance-Awareness
Pub Date : 2022-05-01 DOI: 10.48550/arXiv.2205.00403
J. Liu, Shreyas Padhy, Jie Jessie Ren, Zi Lin, Yeming Wen, Ghassen Jerfel, Zachary Nado, Jasper Snoek, Dustin Tran, Balaji Lakshminarayanan
Accurate uncertainty quantification is a major challenge in deep learning, as neural networks can make overconfident errors and assign high confidence predictions to out-of-distribution (OOD) inputs. The most popular approaches to estimate predictive uncertainty in deep learning are methods that combine predictions from multiple neural networks, such as Bayesian neural networks (BNNs) and deep ensembles. However, their practicality in real-time, industrial-scale applications is limited due to the high memory and computational cost. Furthermore, ensembles and BNNs do not necessarily fix all the issues with the underlying member networks. In this work, we study principled approaches to improve the uncertainty properties of a single network, based on a single, deterministic representation. By formalizing the uncertainty quantification as a minimax learning problem, we first identify distance awareness, i.e., the model's ability to quantify the distance of a testing example from the training data, as a necessary condition for a DNN to achieve high-quality (i.e., minimax optimal) uncertainty estimation. We then propose Spectral-normalized Neural Gaussian Process (SNGP), a simple method that improves the distance-awareness ability of modern DNNs with two simple changes: (1) applying spectral normalization to hidden weights to enforce bi-Lipschitz smoothness in representations and (2) replacing the last output layer with a Gaussian process layer. On a suite of vision and language understanding benchmarks, SNGP outperforms other single-model approaches in prediction, calibration and out-of-domain detection. Furthermore, SNGP provides complementary benefits to popular techniques such as deep ensembles and data augmentation, making it a simple and scalable building block for probabilistic deep learning. Code is open-sourced at https://github.com/google/uncertainty-baselines
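The two changes the abstract names can be sketched in numpy: spectral normalization via power iteration, and a random Fourier feature map of the kind used to approximate the Gaussian process output layer. This is a hedged illustration, not the paper's implementation (SNGP additionally uses a Laplace approximation over such features); all names are hypothetical.

```python
import numpy as np

def spectral_normalize(W, n_iter=200, rng=None):
    """Rescale W so its largest singular value is ~1 (power iteration)."""
    rng = rng or np.random.default_rng(0)
    u = rng.normal(size=W.shape[0])
    v = None
    for _ in range(n_iter):
        v = W.T @ u
        v /= np.linalg.norm(v)
        u = W @ v
        u /= np.linalg.norm(u)
    sigma = u @ (W @ v)  # estimated top singular value
    return W / sigma

def random_fourier_features(X, D=256, lengthscale=1.0, rng=None):
    """RBF-kernel random features: Phi @ Phi.T approximates k(X, X)."""
    rng = rng or np.random.default_rng(1)
    d = X.shape[1]
    Omega = rng.normal(scale=1.0 / lengthscale, size=(d, D))
    b = rng.uniform(0, 2 * np.pi, size=D)
    return np.sqrt(2.0 / D) * np.cos(X @ Omega + b)
```

Normalizing each hidden weight matrix bounds the layer's Lipschitz constant, which is what makes hidden distances meaningful for the GP layer on top.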
Pages: 42:1-42:63
Citations: 19
Training and Evaluation of Deep Policies using Reinforcement Learning and Generative Models
Pub Date : 2022-04-18 DOI: 10.48550/arXiv.2204.08573
Ali Ghadirzadeh, Petra Poklukar, Karol Arndt, Chelsea Finn, V. Kyrki, D. Kragic, Marten Bjorkman
We present a data-efficient framework for solving sequential decision-making problems which exploits the combination of reinforcement learning (RL) and latent variable generative models. The framework, called GenRL, trains deep policies by introducing an action latent variable such that the feed-forward policy search can be divided into two parts: (i) training a sub-policy that outputs a distribution over the action latent variable given a state of the system, and (ii) unsupervised training of a generative model that outputs a sequence of motor actions conditioned on the latent action variable. GenRL enables safe exploration and alleviates the data-inefficiency problem as it exploits prior knowledge about valid sequences of motor actions. Moreover, we provide a set of measures for evaluation of generative models such that we are able to predict the performance of the RL policy training prior to the actual training on a physical robot. We experimentally determine the characteristics of generative models that have most influence on the performance of the final policy training on two robotics tasks: shooting a hockey puck and throwing a basketball. Furthermore, we empirically demonstrate that GenRL is the only method which can safely and efficiently solve the robotics tasks compared to two state-of-the-art RL methods.
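The two-part policy the abstract describes can be sketched at the shape level: a sub-policy maps the state to a Gaussian over an action latent variable, and a generative decoder maps a latent sample to a sequence of motor actions. This is a hedged stand-in with random weights and hypothetical dimensions, not the trained GenRL models.

```python
import numpy as np

rng = np.random.default_rng(0)
state_dim, latent_dim, horizon, action_dim = 8, 4, 10, 3

# sub-policy: state -> (mean, log_std) of the action latent variable
W_mu = rng.normal(size=(latent_dim, state_dim))
W_ls = rng.normal(size=(latent_dim, state_dim)) * 0.1

# generative decoder: latent -> sequence of motor actions (stand-in for a trained model)
W_dec = rng.normal(size=(horizon * action_dim, latent_dim))

def act(state, rng):
    mu, log_std = W_mu @ state, W_ls @ state
    z = mu + np.exp(log_std) * rng.normal(size=latent_dim)  # reparameterized sample
    actions = np.tanh(W_dec @ z).reshape(horizon, action_dim)
    return z, actions
```

Because exploration happens in the low-dimensional latent space and the decoder only emits action sequences it was trained on, invalid motor commands are avoided by construction.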
Pages: 174:1-174:37
Citations: 0
Generalization Error Bounds for Multiclass Sparse Linear Classifiers
Pub Date : 2022-04-13 DOI: 10.48550/arXiv.2204.06264
Tomer Levy, F. Abramovich
We consider high-dimensional multiclass classification by sparse multinomial logistic regression. Unlike binary classification, in the multiclass setup one can think about an entire spectrum of possible notions of sparsity associated with different structural assumptions on the regression coefficients matrix. We propose a computationally feasible feature selection procedure based on penalized maximum likelihood with convex penalties capturing a specific type of sparsity at hand. In particular, we consider global sparsity, double row-wise sparsity, and low-rank sparsity, and show that with the properly chosen tuning parameters the derived plug-in classifiers attain the minimax generalization error bounds (in terms of misclassification excess risk) within the corresponding classes of multiclass sparse linear classifiers. The developed approach is general and can be adapted to other types of sparsity as well.
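Of the sparsity notions listed, double row-wise sparsity has the most concrete computational handle: the proximal operator of the row-wise group penalty is block soft-thresholding, which zeroes out entire rows (features) of the coefficient matrix. A hedged numpy sketch of that prox alone, which a penalized-likelihood solver would call inside each proximal gradient step:

```python
import numpy as np

def prox_row_group_lasso(W, tau):
    """Block soft-thresholding: shrink each row's norm by tau, zeroing small rows."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(1.0 - tau / np.maximum(norms, 1e-12), 0.0)
    return W * scale
```

A row of the multinomial coefficient matrix corresponds to one feature across all classes, so zeroing a row performs feature selection shared across classes.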
Pages: 151:1-151:35
Citations: 1
Unbiased Multilevel Monte Carlo methods for intractable distributions: MLMC meets MCMC
Pub Date : 2022-04-11 DOI: 10.48550/arXiv.2204.04808
Guanyang Wang, T. Wang
Constructing unbiased estimators from Markov chain Monte Carlo (MCMC) outputs is a difficult problem that has recently received a lot of attention in the statistics and machine learning communities. However, the current unbiased MCMC framework only works when the quantity of interest is an expectation, which excludes many practical applications. In this paper, we propose a general method for constructing unbiased estimators for functions of expectations and extend it to construct unbiased estimators for nested expectations. Our approach combines and generalizes the unbiased MCMC and Multilevel Monte Carlo (MLMC) methods. In contrast to traditional sequential methods, our estimator can be implemented on parallel processors. We show that our estimator has a finite variance and computational complexity and can achieve $\varepsilon$-accuracy within the optimal $O(1/\varepsilon^2)$ computational cost under mild conditions. Our numerical experiments confirm our theoretical findings and demonstrate the benefits of unbiased estimators in the massively parallel regime.
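The randomized-level construction behind such estimators can be sketched for the simplest case, $f(\mathbb{E}[X])$ with i.i.d. samples: draw a random level $K$, form an antithetic fine-minus-coarse difference at that level, and reweight by its probability so the level differences telescope to $f(\mathbb{E}[X])$ in expectation. This is a hedged sketch of the general MLMC idea, not the paper's MCMC-coupled estimator; all names are hypothetical.

```python
import numpy as np

def unbiased_f_of_mean(f, sample, n0=4, q=0.5, rng=None):
    """One draw of an unbiased estimator of f(E[X]) via randomized MLMC."""
    rng = rng or np.random.default_rng()
    k = rng.geometric(q) - 1               # random level, P(K = k) = q (1-q)^k
    p_k = q * (1.0 - q) ** k
    x = sample(rng, n0 * 2 ** (k + 1))
    fine = f(x.mean())                     # fine term: all 2^{k+1} n0 samples
    coarse = 0.5 * (f(x[::2].mean()) + f(x[1::2].mean()))  # antithetic halves
    base = f(sample(rng, n0).mean())       # independent base term
    return base + (fine - coarse) / p_k

# E[X] = 1 for X ~ N(1, 1), so f(E[X]) = 1 for f(m) = m^2,
# whereas the naive average of f(x_i) targets E[f(X)] = 2.
rng = np.random.default_rng(0)
draws = [unbiased_f_of_mean(lambda m: m * m,
                            lambda r, n: r.normal(1.0, 1.0, n), rng=rng)
         for _ in range(20000)]
est = float(np.mean(draws))
```

Because each draw is unbiased and independent, the draws can be farmed out to parallel processors and simply averaged, which is the parallelism the abstract highlights.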
Pages: 249:1-249:40
Citations: 6
ReservoirComputing.jl: An Efficient and Modular Library for Reservoir Computing Models
Pub Date : 2022-04-08 DOI: 10.48550/arXiv.2204.05117
Francesco Martinuzzi, Chris Rackauckas, Anas Abdelrehim, M. Mahecha, Karin Mora
We introduce ReservoirComputing.jl, an open-source Julia library for reservoir computing models. The software offers a large number of algorithms presented in the literature and makes it simple to extend them with both internal and external tools. The implementation is highly modular, fast, and comes with comprehensive documentation, which includes reproduced experiments from the literature. The code and documentation are hosted on GitHub under an MIT license: https://github.com/SciML/ReservoirComputing.jl.
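ReservoirComputing.jl itself is a Julia package; to keep the examples here in one language, the core model it implements, an echo state network with a fixed random recurrent reservoir and a ridge-regression readout, can be sketched in numpy. All hyperparameters below are illustrative choices, not the library's defaults.

```python
import numpy as np

rng = np.random.default_rng(0)
n_res, washout, ridge = 100, 50, 1e-6

# task: one-step-ahead prediction of a sine wave
u = np.sin(0.2 * np.arange(600))
W_in = rng.uniform(-0.5, 0.5, size=n_res)
W = rng.normal(size=(n_res, n_res))
W *= 0.9 / np.max(np.abs(np.linalg.eigvals(W)))  # rescale to spectral radius 0.9

# drive the fixed reservoir and collect its states
x = np.zeros(n_res)
states = []
for t in range(len(u) - 1):
    x = np.tanh(W @ x + W_in * u[t])
    states.append(x.copy())
S = np.array(states[washout:])   # discard the initial transient
y = u[washout + 1:]              # targets: the next input value

# only the linear readout is trained (ridge regression)
w_out = np.linalg.solve(S.T @ S + ridge * np.eye(n_res), S.T @ y)
mse = np.mean((S @ w_out - y) ** 2)
```

The design point reservoir computing makes is visible here: the recurrent weights are never trained, so fitting reduces to one linear solve.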
Pages: 288:1-288:8
Citations: 5
First-Order Algorithms for Nonlinear Generalized Nash Equilibrium Problems
Pub Date : 2022-04-07 DOI: 10.48550/arXiv.2204.03132
Michael I. Jordan, Tianyi Lin, M. Zampetakis
We consider the problem of computing an equilibrium in a class of nonlinear generalized Nash equilibrium problems (NGNEPs) in which the strategy sets for each player are defined by equality and inequality constraints that may depend on the choices of rival players. While the asymptotic global convergence and local convergence rates of algorithms to solve this problem have been extensively investigated, the analysis of nonasymptotic iteration complexity is still in its infancy. This paper presents two first-order algorithms -- based on the quadratic penalty method (QPM) and augmented Lagrangian method (ALM), respectively -- with an accelerated mirror-prox algorithm as the solver in each inner loop. We establish a global convergence guarantee for solving monotone and strongly monotone NGNEPs and provide nonasymptotic complexity bounds expressed in terms of the number of gradient evaluations. Experimental results demonstrate the efficiency of our algorithms in practice.
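The quadratic penalty method mentioned above is easiest to see on a single-player toy problem: replace a constraint $g(x)=0$ by a penalty $\rho\, g(x)^2$, solve each unconstrained problem with a first-order method, and increase $\rho$. A hedged sketch on a hypothetical toy problem (the paper's setting has multiple players and an accelerated mirror-prox inner solver instead of plain gradient descent):

```python
import numpy as np

def grad(xy, rho):
    # penalized objective: x^2 + y^2 + rho * (x + y - 1)^2
    x, y = xy
    g = x + y - 1.0
    return np.array([2 * x + 2 * rho * g, 2 * y + 2 * rho * g])

xy = np.zeros(2)
for rho in [1.0, 10.0, 100.0, 1000.0]:
    lr = 1.0 / (2.0 + 4.0 * rho)   # 1/L: largest Hessian eigenvalue is 2 + 4*rho
    for _ in range(5000):
        xy = xy - lr * grad(xy, rho)
# the constrained optimum of min x^2 + y^2 s.t. x + y = 1 is (0.5, 0.5)
```

Each inner problem's minimizer is $x=y=\rho/(1+2\rho)$, so the iterates approach the constrained optimum as $\rho$ grows; warm-starting each stage from the previous one keeps the inner solves cheap.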
Pages: 38:1-38:46
Citations: 8
Smooth Robust Tensor Completion for Background/Foreground Separation with Missing Pixels: Novel Algorithm with Convergence Guarantee
Pub Date : 2022-03-29 DOI: 10.48550/arXiv.2203.16328
Bo Shen, Weijun Xie, Zhen Kong
The objective of this study is to address the problem of background/foreground separation with missing pixels by combining video acquisition, video recovery, and background/foreground separation into a single framework. To achieve this, a smooth robust tensor completion (SRTC) model is proposed to recover the data and decompose it into a static background and a smooth foreground, respectively. Specifically, the static background is modeled by a low-rank Tucker decomposition, and the smooth foreground (moving objects) is modeled by spatiotemporal continuity, which is enforced by total variation regularization. An efficient algorithm based on tensor proximal alternating minimization (tenPAM) is implemented to solve the proposed model with a global convergence guarantee under very mild conditions. Extensive experiments on real data demonstrate that the proposed method significantly outperforms the state-of-the-art approaches for background/foreground separation with missing pixels.
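Two building blocks of such decompositions can be sketched in numpy, with the caveat that this is illustrative only: truncated HOSVD as a simple non-iterative Tucker approximation for the static background, and elementwise soft-thresholding as the kind of proximal step used for the other term (the paper enforces total variation on the foreground, whose prox is more involved; all names here are hypothetical).

```python
import numpy as np

def unfold(T, mode):
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def hosvd_truncate(T, ranks):
    """Truncated higher-order SVD: project each mode onto its top singular vectors."""
    core = T
    factors = []
    for mode, r in enumerate(ranks):
        U = np.linalg.svd(unfold(T, mode), full_matrices=False)[0][:, :r]
        factors.append(U)
        core = np.tensordot(core, U.T, axes=([0], [1]))  # contract current mode
    approx = core
    for U in factors:                                    # reconstruct in the same order
        approx = np.tensordot(approx, U, axes=([0], [1]))
    return approx

def soft_threshold(X, tau):
    """Elementwise proximal map of tau * |.|_1."""
    return np.sign(X) * np.maximum(np.abs(X) - tau, 0.0)
```

When the input tensor has multilinear rank at most `ranks`, the truncated HOSVD reconstructs it exactly, which makes it a convenient background-update step inside an alternating scheme.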
Pages: 217:1-217:40
Citations: 2
A Framework and Benchmark for Deep Batch Active Learning for Regression
Pub Date : 2022-03-17 DOI: 10.48550/arXiv.2203.09410
David Holzmüller, V. Zaverkin, Johannes Kastner, Ingo Steinwart
The acquisition of labels for supervised learning can be expensive. To improve the sample efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations, and selection methods. Our framework encompasses many existing Bayesian methods based on Gaussian process approximations of neural networks as well as non-Bayesian methods. Additionally, we propose to replace the commonly used last-layer features with sketched finite-width neural tangent kernels and to combine them with a novel clustering method. To evaluate different methods, we introduce an open-source benchmark consisting of 15 large tabular regression data sets. Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code. We provide open-source code that includes efficient implementations of all kernels, kernel transformations, and selection methods, and can be used for reproducing our results.
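One of the simplest selection methods that fits the kernel/selection framework described above is greedy max-min ("k-center") batch selection in a feature space: repeatedly pick the pool point farthest from everything already labeled or selected. A hedged sketch of that single component, not the paper's proposed combination:

```python
import numpy as np

def kcenter_batch(X_pool, X_labeled, batch_size):
    """Greedy k-center selection: maximize distance to all points chosen so far."""
    # distance from each pool point to its nearest labeled point
    d = np.linalg.norm(X_pool[:, None, :] - X_labeled[None, :, :], axis=2).min(axis=1)
    picked = []
    for _ in range(batch_size):
        i = int(np.argmax(d))
        picked.append(i)
        # newly picked point now also counts as "covered"
        d = np.minimum(d, np.linalg.norm(X_pool - X_pool[i], axis=1))
    return picked
```

Updating the distance vector after each pick is what makes the batch diverse rather than a cluster of near-duplicates around the single most distant point; in the paper's framework the Euclidean distance would be replaced by a network-dependent kernel distance.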
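The framework above composes base kernels, kernel transformations, and selection methods. One common clustering-style selection method for batch active learning is greedy farthest-point (k-center) selection on feature embeddings, sketched below; the function name `select_batch` is an assumption, the features here are plain activations, and the paper's sketched finite-width neural tangent kernel features and its novel clustering method are not implemented.

```python
# Hedged sketch: greedy farthest-first (k-center) batch selection on feature
# embeddings -- one representative selection method, not the paper's method.
import numpy as np

def select_batch(pool_feats, train_feats, batch_size):
    """Greedily pick pool points farthest from the labeled set and from
    previously picked points (farthest-first traversal)."""
    # Distance from each pool point to its nearest labeled point.
    d = np.min(np.linalg.norm(pool_feats[:, None] - train_feats[None], axis=2), axis=1)
    picked = []
    for _ in range(batch_size):
        i = int(np.argmax(d))  # farthest point from all current centers
        picked.append(i)
        # Shrink distances using the newly picked point as a center.
        d = np.minimum(d, np.linalg.norm(pool_feats - pool_feats[i], axis=1))
    return picked
```

Each pick maximizes the distance to the nearest already-covered point, so a batch spreads out over the pool rather than clustering in one uncertain region.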
{"title":"A Framework and Benchmark for Deep Batch Active Learning for Regression","authors":"David Holzmüller, V. Zaverkin, Johannes Kastner, Ingo Steinwart","doi":"10.48550/arXiv.2203.09410","DOIUrl":"https://doi.org/10.48550/arXiv.2203.09410","url":null,"abstract":"The acquisition of labels for supervised learning can be expensive. To improve the sample efficiency of neural network regression, we study active learning methods that adaptively select batches of unlabeled data for labeling. We present a framework for constructing such methods out of (network-dependent) base kernels, kernel transformations, and selection methods. Our framework encompasses many existing Bayesian methods based on Gaussian process approximations of neural networks as well as non-Bayesian methods. Additionally, we propose to replace the commonly used last-layer features with sketched finite-width neural tangent kernels and to combine them with a novel clustering method. To evaluate different methods, we introduce an open-source benchmark consisting of 15 large tabular regression data sets. Our proposed method outperforms the state-of-the-art on our benchmark, scales to large data sets, and works out-of-the-box without adjusting the network architecture or training code. We provide open-source code that includes efficient implementations of all kernels, kernel transformations, and selection methods, and can be used for reproducing our results.","PeriodicalId":14794,"journal":{"name":"J. Mach. Learn. 
Res.","volume":"112 1","pages":"164:1-164:81"},"PeriodicalIF":0.0,"publicationDate":"2022-03-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"79638527","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13