
Latest Publications in arXiv: Learning

Compositional Transfer in Hierarchical Reinforcement Learning
Pub Date : 2019-06-26 DOI: 10.15607/rss.2020.xvi.054
Markus Wulfmeier, A. Abdolmaleki, Roland Hafner, J. T. Springenberg, Michael Neunert, Tim Hertweck, T. Lampe, Noah Siegel, N. Heess, Martin A. Riedmiller
The successful application of general reinforcement learning algorithms to real-world robotics applications is often limited by their high data requirements. We introduce Regularized Hierarchical Policy Optimization (RHPO) to improve data-efficiency for domains with multiple dominant tasks and ultimately reduce required platform time. To this end, we employ compositional inductive biases on multiple levels and corresponding mechanisms for sharing off-policy transition data across low-level controllers and tasks as well as scheduling of tasks. The presented algorithm enables stable and fast learning for complex, real-world domains in the parallel multitask and sequential transfer case. We show that the investigated types of hierarchy enable positive transfer while partially mitigating negative interference and evaluate the benefits of additional incentives for efficient, compositional task solutions in single task domains. Finally, we demonstrate substantial data-efficiency and final performance gains over competitive baselines in a week-long, physical robot stacking experiment.
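To make the compositional structure concrete, the sketch below shows a toy hierarchical policy in which a task-conditioned high-level controller selects among low-level Gaussian controllers that are shared across tasks. It is an illustrative reading of the abstract, not the RHPO implementation; all sizes and the linear-Gaussian parameterization are assumptions.

```python
import numpy as np

class MixturePolicy:
    """Toy compositional policy: a task-conditioned categorical high-level
    controller selects among low-level Gaussian controllers that are shared
    across all tasks (illustrative sketch, not the RHPO implementation)."""

    def __init__(self, n_tasks, n_components, state_dim, action_dim, rng=None):
        self.rng = rng or np.random.default_rng(0)
        # Shared low-level controllers: linear-Gaussian in the state.
        self.W = self.rng.normal(0, 0.1, (n_components, action_dim, state_dim))
        self.log_std = np.zeros((n_components, action_dim))
        # Per-task high-level logits over the shared components.
        self.logits = np.zeros((n_tasks, n_components))

    def act(self, state, task_id):
        # High level: sample which shared controller to use for this task.
        p = np.exp(self.logits[task_id] - self.logits[task_id].max())
        p /= p.sum()
        k = self.rng.choice(len(p), p=p)
        # Low level: Gaussian action from the selected shared controller.
        mean = self.W[k] @ state
        return mean + np.exp(self.log_std[k]) * self.rng.normal(size=mean.shape)

policy = MixturePolicy(n_tasks=3, n_components=4, state_dim=8, action_dim=2)
action = policy.act(np.zeros(8), task_id=1)
```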
Citations: 30
Developing an ANFIS-PSO Based Model to Estimate Mercury Emission in Combustion Flue Gases
Pub Date : 2019-05-10 DOI: 10.20944/PREPRINTS201905.0124.V1
S. Shamshirband, A. Baghban, Masoud Hadipoor, A. Mosavi
Accurate prediction of the mercury content emitted from fossil-fueled power stations is of utmost importance for environmental pollution assessment and hazard mitigation. In this paper, the mercury content in boiler flue gas was predicted using an Adaptive Neuro-Fuzzy Inference System (ANFIS) integrated with particle swarm optimization (PSO). Input parameters were selected from coal characteristics and the operational configuration of the boilers. The proposed ANFIS-PSO model develops a nonlinear mapping from the coal specifications and boiler type to the flue gas mercury content. Operational information from 82 power plants was gathered and used to train and test the proposed model. Performance was evaluated with the MARE% statistic, which was 0.003266 for training and 0.013272 for testing. Furthermore, relative errors between measured and predicted values were between -0.25% and 0.1%, confirming the accuracy of the ANFIS-PSO model.
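The PSO half of such a pipeline can be sketched independently of the fuzzy-inference part. The toy code below tunes the parameters of a generic parametric predictor by minimizing the mean absolute relative error (MARE); the `predict` callable, the swarm size, and the inertia/acceleration constants are assumptions and stand in for the actual ANFIS model.

```python
import numpy as np

def pso_fit(predict, x, y, dim, n_particles=30, iters=200, seed=0):
    """Minimal particle swarm optimization of model parameters by minimizing
    MARE, illustrating the PSO half of an ANFIS-PSO pipeline; the ANFIS part
    is replaced by a generic parametric predict(params, x)."""
    rng = np.random.default_rng(seed)
    pos = rng.uniform(-1, 1, (n_particles, dim))
    vel = np.zeros_like(pos)

    def mare(params):
        return np.mean(np.abs((predict(params, x) - y) / y))

    pbest, pbest_cost = pos.copy(), np.array([mare(p) for p in pos])
    gbest = pbest[pbest_cost.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        # Standard PSO velocity update: inertia + cognitive + social terms.
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos += vel
        cost = np.array([mare(p) for p in pos])
        improved = cost < pbest_cost
        pbest[improved], pbest_cost[improved] = pos[improved], cost[improved]
        gbest = pbest[pbest_cost.argmin()].copy()
    return gbest, pbest_cost.min()

# Toy usage: fit a linear surrogate of mercury content from two coal features.
rng = np.random.default_rng(1)
x = rng.uniform(1, 2, (100, 2))
y = x @ np.array([0.3, 0.5]) + 0.1
params, err = pso_fit(lambda p, x: x @ p[:2] + p[2], x, y, dim=3)
```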
Citations: 1
Actor-Expert: A Framework for using Q-learning in Continuous Action Spaces
Pub Date : 2018-10-22 DOI: 10.7939/R3-QGDP-3872
Sungsu Lim
Q-learning can be difficult to use in continuous action spaces, because an optimization has to be solved to find the maximal action for the action-values. A common strategy has been to restrict the functional form of the action-values to be concave in the actions, to simplify the optimization. Such restrictions, however, can prevent learning accurate action-values. In this work, we propose a new policy search objective that facilitates using Q-learning and a framework to optimize this objective, called Actor-Expert. The Expert uses Q-learning to update the action-values towards optimal action-values. The Actor learns the maximal actions over time for these changing action-values. We develop a Cross Entropy Method (CEM) for the Actor, where such a global optimization approach facilitates use of generically parameterized action-values. This method - which we call Conditional CEM - iteratively concentrates density around maximal actions, conditioned on state. We prove that this algorithm tracks the expected CEM update, over states with changing action-values. We demonstrate in a toy environment that previous methods that restrict the action-value parameterization fail whereas Actor-Expert with a more general action-value parameterization succeeds. Finally, we demonstrate that Actor-Expert performs as well as or better than competitors on four benchmark continuous-action environments.
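The core idea of concentrating density around maximal actions can be illustrated with a per-state cross-entropy search against a given Q-function, as sketched below. This is a simplification: the paper's Conditional CEM trains a conditional distribution over states rather than re-running CEM per state, and the toy Q-function, population size, and elite fraction are assumptions.

```python
import numpy as np

def cem_max_action(q, state, action_dim, iters=10, pop=64, elite_frac=0.2,
                   seed=0):
    """Cross-entropy search for an (approximately) maximal action of q(state, a).
    A Gaussian over actions is repeatedly refit to the elite (highest-value)
    samples, concentrating density around maximal actions."""
    rng = np.random.default_rng(seed)
    mean, std = np.zeros(action_dim), np.ones(action_dim)
    n_elite = max(1, int(elite_frac * pop))
    for _ in range(iters):
        actions = mean + std * rng.normal(size=(pop, action_dim))
        values = np.array([q(state, a) for a in actions])
        elite = actions[np.argsort(values)[-n_elite:]]
        mean, std = elite.mean(axis=0), elite.std(axis=0) + 1e-6
    return mean

# Toy Q-function with a known maximizer at a = (0.5, -0.2) for any state.
q = lambda s, a: -np.sum((a - np.array([0.5, -0.2])) ** 2)
best = cem_max_action(q, state=np.zeros(4), action_dim=2)
```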
Citations: 11
Using link and content over time for embedding generation in Dynamic Attributed Networks
Pub Date : 2018-07-17 DOI: 10.1007/978-3-030-10928-8_1
A. P. Appel, R. L. F. Cunha, C. Aggarwal, Marcela Megumi Terakado
{"title":"Using link and content over time for embedding generation in Dynamic Attributed Networks","authors":"A. P. Appel, R. L. F. Cunha, C. Aggarwal, Marcela Megumi Terakado","doi":"10.1007/978-3-030-10928-8_1","DOIUrl":"https://doi.org/10.1007/978-3-030-10928-8_1","url":null,"abstract":"","PeriodicalId":8468,"journal":{"name":"arXiv: Learning","volume":"33 1","pages":""},"PeriodicalIF":0.0,"publicationDate":"2018-07-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"78346588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 20
Deep Learning on Low-Resource Datasets
Pub Date : 2018-07-10 DOI: 10.20944/PREPRINTS201807.0185.V1
Veronica Morfi, D. Stowell
In training a deep learning system to perform audio transcription, two practical problems may arise. Firstly, most datasets are weakly labelled, having only a list of events present in each recording without any temporal information for training. Secondly, deep neural networks need a very large amount of labelled training data to achieve good quality performance, yet in practice it is difficult to collect enough samples for most classes of interest. In this paper, we propose factorising the final task of audio transcription into multiple intermediate tasks in order to improve the training performance when dealing with this kind of low-resource datasets. We evaluate three data-efficient approaches of training a stacked convolutional and recurrent neural network for the intermediate tasks. Our results show that different methods of training have different advantages and disadvantages.
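A minimal stacked convolutional-recurrent tagger for weakly labelled audio might look like the PyTorch sketch below, where frame-level predictions are pooled over time so that only clip-level labels are required. The layer sizes and max-pooling aggregation are assumptions, not the architecture evaluated in the paper.

```python
import torch
import torch.nn as nn

class CRNNTagger(nn.Module):
    """Minimal stacked convolutional-recurrent network for weakly labelled
    audio tagging: frame-level predictions are pooled over time so that only
    clip-level (weak) labels are needed for training."""

    def __init__(self, n_mels=64, n_classes=10):
        super().__init__()
        self.conv = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d((2, 1)),
        )
        self.rnn = nn.GRU(32 * (n_mels // 4), 64, batch_first=True,
                          bidirectional=True)
        self.head = nn.Linear(128, n_classes)

    def forward(self, spec):                       # spec: (batch, 1, mels, frames)
        h = self.conv(spec)                        # (batch, 32, mels // 4, frames)
        b, c, m, t = h.shape
        h = h.permute(0, 3, 1, 2).reshape(b, t, c * m)
        h, _ = self.rnn(h)                         # frame-level features
        frame_probs = torch.sigmoid(self.head(h))  # per-frame event probabilities
        return frame_probs.max(dim=1).values       # weak (clip-level) prediction

model = CRNNTagger()
clip_probs = model(torch.randn(2, 1, 64, 200))     # two clips, 200 frames each
```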
Citations: 0
Multi-view Ensemble Classification for Clinically Actionable Genetic Mutations
Pub Date : 2018-06-26 DOI: 10.1007/978-3-319-94042-7_5
Xi Sheryl Zhang, Dandi Chen, Yongjun Zhu, Chao Che, Chang Su, Sendong Zhao, X. Min, Fei Wang
{"title":"Multi-view Ensemble Classification for Clinically Actionable Genetic Mutations","authors":"Xi Sheryl Zhang, Dandi Chen, Yongjun Zhu, Chao Che, Chang Su, Sendong Zhao, X. Min, Fei Wang","doi":"10.1007/978-3-319-94042-7_5","DOIUrl":"https://doi.org/10.1007/978-3-319-94042-7_5","url":null,"abstract":"","PeriodicalId":8468,"journal":{"name":"arXiv: Learning","volume":"36 1","pages":"79-99"},"PeriodicalIF":0.0,"publicationDate":"2018-06-26","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"90385013","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Reliable clustering of Bernoulli mixture models
Pub Date : 2017-10-05 DOI: 10.3150/19-bej1173
Amir Najafi, A. Motahari, H. Rabiee
A Bernoulli Mixture Model (BMM) is a finite mixture of random binary vectors with independent dimensions. The problem of clustering BMM data arises in a variety of real-world applications, ranging from population genetics to activity analysis in social networks. In this paper, we analyze the clusterability of BMMs from a theoretical perspective, when the number of clusters is unknown. In particular, we stipulate a set of conditions on the sample complexity and dimension of the model in order to guarantee the Probably Approximately Correct (PAC)-clusterability of a dataset. To the best of our knowledge, these findings are the first non-asymptotic bounds on the sample complexity of learning or clustering BMMs.
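For reference, the model being analyzed can be clustered with plain EM, as in the sketch below. The paper's contribution is the PAC-clusterability analysis rather than this algorithm, and the initialization and toy data here are assumptions.

```python
import numpy as np

def bmm_em(X, k, iters=100, seed=0):
    """Plain EM for a Bernoulli Mixture Model: each cluster is a product of
    independent Bernoulli dimensions over binary data X of shape (n, d)."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    pi = np.full(k, 1.0 / k)                        # mixing weights
    theta = rng.uniform(0.25, 0.75, (k, d))         # per-cluster Bernoulli means
    for _ in range(iters):
        # E-step: responsibilities, computed in the log domain for stability.
        log_p = (X @ np.log(theta).T + (1 - X) @ np.log(1 - theta).T
                 + np.log(pi))
        log_p -= log_p.max(axis=1, keepdims=True)
        r = np.exp(log_p)
        r /= r.sum(axis=1, keepdims=True)
        # M-step: update weights and Bernoulli parameters.
        nk = r.sum(axis=0)
        pi = nk / n
        theta = np.clip((r.T @ X) / nk[:, None], 1e-6, 1 - 1e-6)
    return pi, theta, r.argmax(axis=1)

# Toy data: two clusters of 20-dimensional binary vectors.
rng = np.random.default_rng(1)
means = np.array([[0.9] * 10 + [0.1] * 10, [0.1] * 10 + [0.9] * 10])
X = (rng.random((200, 20)) < means[rng.integers(0, 2, 200)]).astype(float)
pi, theta, labels = bmm_em(X, k=2)
```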
Citations: 8
Transitions, Losses, and Re-parameterizations: Elements of Prediction Games.
Pub Date : 2017-01-01 DOI: 10.25911/5D723BC67A01E
Kamalaruban Parameswaran
This thesis presents some geometric insights into three different types of two-player prediction games, namely the general learning task, prediction with expert advice, and online convex optimization. These games differ in the nature of the opponent (stochastic, adversarial, or intermediate), the order of the players' moves, and the utility function. The insights shed light on the intrinsic barriers of these prediction problems and on the design of computationally efficient learning algorithms with strong theoretical guarantees (such as generalizability, statistical consistency, and constant regret).
Citations: 0
The ZipML Framework for Training Models with End-to-End Low Precision: The Cans, the Cannots, and a Little Bit of Deep Learning
Pub Date : 2016-11-16 DOI: 10.3929/ethz-a-010890124
Hantian Zhang, Jerry Li, Kaan Kara, Dan Alistarh, Ji Liu, Ce Zhang
Recently there has been significant interest in training machine-learning models at low precision: by reducing precision, one can reduce computation and communication by one order of magnitude. We examine training at reduced precision, both from a theoretical and practical perspective, and ask: is it possible to train models at end-to-end low precision with provable guarantees? Can this lead to consistent order-of-magnitude speedups? We present a framework called ZipML to answer these questions. For linear models, the answer is yes. We develop a simple framework based on one simple but novel strategy called double sampling. Our framework is able to execute training at low precision with no bias, guaranteeing convergence, whereas naive quantization would introduce significant bias. We validate our framework across a range of applications, and show that it enables an FPGA prototype that is up to 6.5x faster than an implementation using full 32-bit precision. We further develop a variance-optimal stochastic quantization strategy and show that it can make a significant difference in a variety of settings. When applied to linear models together with double sampling, we save up to another 1.7x in data movement compared with uniform quantization. When training deep networks with quantized models, we achieve higher accuracy than the state-of-the-art XNOR-Net. Finally, we extend our framework through approximation to non-linear models, such as SVM. We show that, although using low-precision data induces bias, we can appropriately bound and control the bias. We find in practice 8-bit precision is often sufficient to converge to the correct solution. Interestingly, however, in practice we notice that our framework does not always outperform the naive rounding approach. We discuss this negative result in detail.
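The basic primitive behind such low-precision schemes is unbiased stochastic rounding onto a low-precision grid, sketched below for a uniform grid. ZipML's double sampling and variance-optimal (non-uniform) grids are not shown, and the bit width and grid construction here are assumptions.

```python
import numpy as np

def stochastic_quantize(x, bits=8, rng=None):
    """Unbiased stochastic rounding onto a uniform low-precision grid: each
    value is rounded up or down with probabilities chosen so that the
    expectation of the quantized value equals x."""
    rng = rng or np.random.default_rng(0)
    levels = 2 ** bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / levels
    t = (x - lo) / scale                                      # grid position
    floor = np.floor(t)
    quantized = floor + (rng.random(x.shape) < (t - floor))   # round up w.p. frac
    return lo + quantized * scale

x = np.random.default_rng(1).normal(size=100_000)
xq = stochastic_quantize(x, bits=4)
print(abs(xq.mean() - x.mean()))   # small: the rounding introduces no bias
```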
Citations: 18
Learning an Optimization Algorithm through Human Design Iterations
Pub Date : 2016-08-24 DOI: 10.1115/1.4037344
Thurston Sexton, Max Yi Ren
Solving optimal design problems through crowdsourcing faces a dilemma: On one hand, human beings have been shown to be more effective than algorithms at searching for good solutions of certain real-world problems with high-dimensional or discrete solution spaces; on the other hand, the cost of setting up crowdsourcing environments, the uncertainty in the crowd's domain-specific competence, and the lack of commitment of the crowd, all contribute to the lack of real-world application of design crowdsourcing. We are thus motivated to investigate a solution-searching mechanism where an optimization algorithm is tuned based on human demonstrations on solution searching, so that the search can be continued after human participants abandon the problem. To do so, we model the iterative search process as a Bayesian Optimization (BO) algorithm, and propose an inverse BO (IBO) algorithm to find the maximum likelihood estimators of the BO parameters based on human solutions. We show through a vehicle design and control problem that the search performance of BO can be improved by recovering its parameters based on an effective human search. Thus, IBO has the potential to improve the success rate of design crowdsourcing activities, by requiring only good search strategies instead of good solutions from the crowd.
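The forward model that inverse BO fits to human design iterations is an ordinary Bayesian-optimization step: a Gaussian-process surrogate plus an acquisition function. The sketch below shows one such step with expected improvement using scikit-learn; the kernel, acquisition function, and toy objective are assumptions, and the IBO maximum-likelihood estimator itself is not implemented.

```python
import numpy as np
from scipy.stats import norm
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

def bo_suggest(X_obs, y_obs, candidates, xi=0.01):
    """One Bayesian-optimization step: fit a GP surrogate to the observed
    designs and pick the candidate with the highest expected improvement
    (maximization of the objective is assumed)."""
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0),
                                  normalize_y=True).fit(X_obs, y_obs)
    mu, sigma = gp.predict(candidates, return_std=True)
    best = y_obs.max()
    z = (mu - best - xi) / np.maximum(sigma, 1e-9)
    ei = (mu - best - xi) * norm.cdf(z) + sigma * norm.pdf(z)  # expected improvement
    return candidates[ei.argmax()]

# Toy usage: pick the next design for a 1-D objective on [0, 1].
f = lambda x: -(x - 0.3) ** 2
X_obs = np.array([[0.1], [0.9]])
y_obs = f(X_obs).ravel()
candidates = np.linspace(0, 1, 200).reshape(-1, 1)
x_next = bo_suggest(X_obs, y_obs, candidates)
```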
Citations: 19