Anytime Exploitation of Stragglers in Synchronous Stochastic Gradient Descent

Nuwan S. Ferdinand, Benjamin Gharachorloo, S. Draper
{"title":"Anytime Exploitation of Stragglers in Synchronous Stochastic Gradient Descent","authors":"Nuwan S. Ferdinand, Benjamin Gharachorloo, S. Draper","doi":"10.1109/ICMLA.2017.0-166","DOIUrl":null,"url":null,"abstract":"In this paper we propose an approach to parallelizing synchronous stochastic gradient descent (SGD) that we term “Anytime-Gradients”. The Anytime-Gradients is designed to exploit the work completed by slow compute nodes or “stragglers”. In many approaches work completed by these nodes, while only partial, is discarded completely. To maintain synchronization in our approach, each computational epoch is of fixed duration, and at the end of each epoch, workers send updated parameter vectors to a master mode for combination. The master weights each update by the amount of work done. The Anytime-Gradients scheme is robust to both persistent and non-persistent stragglers and requires no prior knowledge about processor abilities. We show that the scheme effectively exploits stragglers and outperforms existing methods.","PeriodicalId":6636,"journal":{"name":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","volume":"42 1","pages":"141-146"},"PeriodicalIF":0.0000,"publicationDate":"2017-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"21","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2017 16th IEEE International Conference on Machine Learning and Applications (ICMLA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICMLA.2017.0-166","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 21

Abstract

In this paper we propose an approach to parallelizing synchronous stochastic gradient descent (SGD) that we term "Anytime-Gradients". Anytime-Gradients is designed to exploit the work completed by slow compute nodes, or "stragglers". In many approaches, the work completed by these nodes, while only partial, is discarded entirely. To maintain synchronization in our approach, each computational epoch is of fixed duration, and at the end of each epoch, workers send updated parameter vectors to a master node for combination. The master weights each update by the amount of work done. The Anytime-Gradients scheme is robust to both persistent and non-persistent stragglers and requires no prior knowledge about processor abilities. We show that the scheme effectively exploits stragglers and outperforms existing methods.
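The abstract describes the master's combination rule only at a high level: at the end of each fixed-duration epoch, every worker's parameter vector is weighted by how much work that worker completed. The sketch below illustrates one plausible reading of that rule (weights proportional to completed gradient steps). It is not the authors' implementation; the function and variable names are hypothetical.

```python
import numpy as np

def combine_updates(worker_params, work_counts):
    """Weighted combination at the master: each worker's parameter
    vector is weighted in proportion to the amount of work (here,
    gradient steps) it completed within the fixed-duration epoch.
    This is an illustrative sketch, not the paper's exact rule."""
    work_counts = np.asarray(work_counts, dtype=float)
    weights = work_counts / work_counts.sum()
    # Weighted average of the workers' parameter vectors.
    return sum(w * p for w, p in zip(weights, worker_params))

# Hypothetical usage: three workers; the third is a straggler that
# finished only 2 gradient steps in the epoch, so its partial work
# still contributes, just with a smaller weight.
params = [np.array([1.0, 2.0]), np.array([1.1, 1.9]), np.array([0.8, 2.4])]
steps = [10, 8, 2]
new_master_params = combine_updates(params, steps)
```

Under this reading, a straggler's partial progress is never discarded; it simply contributes less to the combined parameter vector, which is consistent with the robustness to persistent and non-persistent stragglers claimed above.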