Learning without loss

Veit Elser
{"title":"无损失的学习","authors":"Veit Elser","doi":"10.1186/s13663-021-00697-1","DOIUrl":null,"url":null,"abstract":"We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed–reflect–reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation $x\\cdot w=y$ . These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently—across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.","PeriodicalId":12293,"journal":{"name":"Fixed Point Theory and Applications","volume":"58 1","pages":""},"PeriodicalIF":0.0000,"publicationDate":"2021-07-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"10","resultStr":"{\"title\":\"Learning without loss\",\"authors\":\"Veit Elser\",\"doi\":\"10.1186/s13663-021-00697-1\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed–reflect–reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation $x\\\\cdot w=y$ . These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently—across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. 
Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.\",\"PeriodicalId\":12293,\"journal\":{\"name\":\"Fixed Point Theory and Applications\",\"volume\":\"58 1\",\"pages\":\"\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-07-26\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"10\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Fixed Point Theory and Applications\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1186/s13663-021-00697-1\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Fixed Point Theory and Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1186/s13663-021-00697-1","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 10

Abstract

We explore a new approach for training neural networks where all loss functions are replaced by hard constraints. The same approach is very successful in phase retrieval, where signals are reconstructed from magnitude constraints and general characteristics (sparsity, support, etc.). Instead of taking gradient steps, the optimizer in the constraint based approach, called relaxed–reflect–reflect (RRR), derives its steps from projections to local constraints. In neural networks one such projection makes the minimal modification to the inputs x, the associated weights w, and the pre-activation value y at each neuron, to satisfy the equation $x\cdot w=y$ . These projections, along with a host of other local projections (constraining pre- and post-activations, etc.) can be partitioned into two sets such that all the projections in each set can be applied concurrently—across the network and across all data in the training batch. This partitioning into two sets is analogous to the situation in phase retrieval and the setting for which the general purpose RRR optimizer was designed. Owing to the novelty of the method, this paper also serves as a self-contained tutorial. Starting with a single-layer network that performs nonnegative matrix factorization, and concluding with a generative model comprising an autoencoder and classifier, all applications and their implementations by projections are described in complete detail. Although the new approach has the potential to extend the scope of neural networks (e.g. by defining activation not through functions but constraint sets), most of the featured models are standard to allow comparison with stochastic gradient descent.
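
To make the constraint-based update concrete, the sketch below runs a generic relaxed–reflect–reflect (RRR) iteration on a toy two-set feasibility problem: find a point that satisfies a linear system Ax = b and is componentwise nonnegative. The update x ← x + β(P_B(2P_A(x) − x) − P_A(x)) is the standard RRR step from the phase-retrieval literature; the particular constraint sets, the value of β, the stopping test, and the function names here are illustrative assumptions, not the neural-network projections (the x·w = y constraint, activation constraints) developed in the paper.

```python
import numpy as np

def project_affine(x, A, b, A_pinv):
    """Euclidean projection of x onto the affine set {x : A x = b}."""
    return x - A_pinv @ (A @ x - b)

def project_nonneg(x):
    """Euclidean projection of x onto the nonnegative orthant {x : x >= 0}."""
    return np.maximum(x, 0.0)

def rrr_solve(A, b, beta=0.5, iters=5000, tol=1e-10, seed=0):
    """Seek a point in {A x = b} intersected with {x >= 0} using the RRR step
        x <- x + beta * (P_B(2 P_A(x) - x) - P_A(x)),
    reading the candidate solution off as P_B(2 P_A(x) - x)."""
    rng = np.random.default_rng(seed)
    A_pinv = np.linalg.pinv(A)            # reused by the affine projection
    x = rng.standard_normal(A.shape[1])   # arbitrary starting point
    pb = project_nonneg(x)
    for _ in range(iters):
        pa = project_affine(x, A, b, A_pinv)
        pb = project_nonneg(2.0 * pa - x)
        step = pb - pa
        x = x + beta * step
        if np.linalg.norm(step) < tol:    # fixed point: both constraints met
            break
    return pb

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A = rng.standard_normal((3, 8))
    x_true = np.abs(rng.standard_normal(8))  # a feasible nonnegative point
    b = A @ x_true                           # guarantees the instance is solvable
    x_hat = rrr_solve(A, b)
    print("||A x - b|| =", np.linalg.norm(A @ x_hat - b))
    print("min(x)      =", x_hat.min())
```

With β = 1 this update coincides with the Douglas–Rachford (difference-map) step; smaller values of β damp the motion, which can help on nonconvex constraint sets such as those that arise when the variables are network weights and activations.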
Source journal
Fixed Point Theory and Applications
About the journal: In a wide range of mathematical, computational, economical, modeling and engineering problems, the existence of a solution to a theoretical or real world problem is equivalent to the existence of a fixed point for a suitable map or operator. Fixed points are therefore of paramount importance in many areas of mathematics, sciences and engineering. The theory itself is a beautiful mixture of analysis (pure and applied), topology and geometry. Over the last 60 years or so, the theory of fixed points has been revealed as a very powerful and important tool in the study of nonlinear phenomena. In particular, fixed point techniques have been applied in such diverse fields as biology, chemistry, physics, engineering, game theory and economics. In numerous cases finding the exact solution is not possible; hence it is necessary to develop appropriate algorithms to approximate the requested result. This is strongly related to control and optimization problems arising in the different sciences and in engineering problems. Many situations in the study of nonlinear equations, calculus of variations, partial differential equations, optimal control and inverse problems can be formulated in terms of fixed point problems or optimization.
Latest articles in this journal
Weak and strong convergence theorems for a new class of enriched strictly pseudononspreading mappings in Hilbert spaces
Ϝ-Contraction of Hardy–Rogers type in supermetric spaces with applications
Solution of a nonlinear fractional-order initial value problem via a \(\mathscr{C}^{*}\)-algebra-valued \(\mathcal{R}\)-metric space
On a new generalization of a Perov-type F-contraction with application to a semilinear operator system
Fixed point theorem and iterated function system in φ-metric modular space