Learning without loss

Veit Elser
{"title":"无损失的学习","authors":"Veit Elser","doi":"10.1186/s13663-021-00697-1","DOIUrl":null,"url":null,"abstract":"We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed–reflect–reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation $x\\cdot w=y$ . These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently—across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.","PeriodicalId":12293,"journal":{"name":"Fixed Point Theory and Applications","volume":"58 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Learning without loss\",\"authors\":\"Veit Elser\",\"doi\":\"10.1186/s13663-021-00697-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed–reflect–reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation $x\\\\cdot w=y$ . These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently—across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. 
Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.\",\"PeriodicalId\":12293,\"journal\":{\"name\":\"Fixed Point Theory and Applications\",\"volume\":\"58 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Fixed Point Theory and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1186/s13663-021-00697-1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Fixed Point Theory and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/s13663-021-00697-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10

Abstract

We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed–reflect–reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation $x\cdot w=y$ . These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently—across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.
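
To make the constraint-based update concrete, the sketch below runs a generic relaxed–reflect–reflect (RRR) iteration on a toy two-set feasibility problem: find a point that satisfies a linear system Ax = b and is componentwise nonnegative. The update x ← x + β(P_B(2P_A(x) − x) − P_A(x)) is the standard RRR step from the phase-retrieval literature; the particular constraint sets, the value of β, the stopping test, and the function names here are illustrative assumptions, not the neural-network projections (the x·w = y constraint, activation constraints) developed in the paper.

```python
import numpy as np

def project_affine(x, A, b, A_pinv):
    """Euclidean projection of x onto the affine set {x : A x = b}."""
    return x - A_pinv @ (A @ x - b)

def project_nonneg(x):
    """Euclidean projection of x onto the nonnegative orthant {x : x >= 0}."""
    return np.maximum(x, 0.0)

def rrr_solve(A, b, beta=0.5, iters=5000, tol=1e-10, seed=0):
    """Seek a point in {A x = b} intersected with {x >= 0} using the RRR step
        x <- x + beta * (P_B(2 P_A(x) - x) - P_A(x)),
    reading the candidate solution off as P_B(2 P_A(x) - x)."""
    rng = np.random.default_rng(seed)
    A_pinv = np.linalg.pinv(A)            # reused by the affine projection
    x = rng.standard_normal(A.shape[1])   # arbitrary starting point
    pb = project_nonneg(x)
    for _ in range(iters):
        pa = project_affine(x, A, b, A_pinv)
        pb = project_nonneg(2.0 * pa - x)
        step = pb - pa
        x = x + beta * step
        if np.linalg.norm(step) < tol:    # fixed point: both constraints met
            break
    return pb

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 8))
    x_true = np.abs(rng.standard_normal(8))  # a feasible nonnegative point
    b = A @ x_true                           # guarantees the instance is solvable
    x_hat = rrr_solve(A, b)
    print("||A x - b|| =", np.linalg.norm(A @ x_hat - b))
    print("min(x)      =", x_hat.min())
```

With β = 1 this update coincides with the Douglas–Rachford (difference-map) step; smaller values of β damp the motion, which can help on nonconvex constraint sets such as those that arise when the variables are network weights and activations.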
Source journal
Fixed Point Theory and Applications
About the journal: In a wide range of mathematical, computational, economical, modeling and engineering problems, the existence of a solution to a theoretical or real world problem is equivalent to the existence of a fixed point for a suitable map or operator. Fixed points are therefore of paramount importance in many areas of mathematics, sciences and engineering. The theory itself is a beautiful mixture of analysis (pure and applied), topology and geometry. Over the last 60 years or so, the theory of fixed points has been revealed as a very powerful and important tool in the study of nonlinear phenomena. In particular, fixed point techniques have been applied in such diverse fields as biology, chemistry, physics, engineering, game theory and economics. In numerous cases finding the exact solution is not possible; hence it is necessary to develop appropriate algorithms to approximate the requested result. This is strongly related to control and optimization problems arising in the different sciences and in engineering problems. Many situations in the study of nonlinear equations, calculus of variations, partial differential equations, optimal control and inverse problems can be formulated in terms of fixed point problems or optimization.
Latest articles in this journal
Weak and strong convergence theorems for a new class of enriched strictly pseudononspreading mappings in Hilbert spaces
Ϝ-Contraction of Hardy–Rogers type in supermetric spaces with applications
Solution of a nonlinear fractional-order initial value problem via a \(\mathscr{C}^{*}\)-algebra-valued \(\mathcal{R}\)-metric space
On a new generalization of a Perov-type F-contraction with application to a semilinear operator system
Fixed point theorem and iterated function system in φ-metric modular space