{"title":"Distributed Deep Learning on Wimpy Smartphone Nodes","authors":"Tzoof Hemed, Nitai Lavie, R. Kaplan","doi":"10.1109/ICSEE.2018.8646195","DOIUrl":null,"url":null,"abstract":"Deep Neural Networks (DNN), contain multiple convolutional and several fully connected layers, require considerable hardware resources to train in a reasonable time. Multiple CPUs, GPUs or FPGAs are usually combined to reduce the training time of a DNN. However, many individuals or small organizations do not possess the resources to obtain multiple hardware units.The contribution of this work is two-fold. First, we present an implementation of a distributed DNN training system that uses multiple small (wimpy) nodes to accelerate the training process. The nodes are mobile smartphone devices, with variable hardware specifications. All DNN training tasks are performed on the small nodes, coordinated by a centralized server. Second, we propose a novel method to mitigate issues arising from the variability in hardware resources. We demonstrate that the method allows training a DNN to high accuracy on known image recognition datasets with multiple small different nodes. The proposed method factors in the contribution from each node according to its run time on a specific training task, relative to the other nodes. In addition, we discuss practical challenges that arise from small node system and suggest several solutions.","PeriodicalId":254455,"journal":{"name":"2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE)","volume":"409 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2018-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2018 IEEE International Conference on the Science of Electrical Engineering in Israel (ICSEE)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICSEE.2018.8646195","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 1
Abstract
Deep Neural Networks (DNNs), which contain multiple convolutional and several fully connected layers, require considerable hardware resources to train in a reasonable time. Multiple CPUs, GPUs, or FPGAs are usually combined to reduce the training time of a DNN. However, many individuals and small organizations do not possess the resources to obtain multiple hardware units. The contribution of this work is two-fold. First, we present an implementation of a distributed DNN training system that uses multiple small (wimpy) nodes to accelerate the training process. The nodes are mobile smartphone devices with variable hardware specifications. All DNN training tasks are performed on the small nodes, coordinated by a centralized server. Second, we propose a novel method to mitigate issues arising from the variability in hardware resources. We demonstrate that the method allows training a DNN to high accuracy on known image recognition datasets with multiple small, heterogeneous nodes. The proposed method weights the contribution from each node according to its run time on a specific training task, relative to the other nodes. In addition, we discuss practical challenges that arise in a system of small nodes and suggest several solutions.
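
The abstract does not give the exact weighting formula, so the following is only a minimal sketch of the general idea: a central server combines gradients from heterogeneous smartphone nodes, scaling each node's contribution by its run time relative to the other nodes. The function name aggregate_gradients and the inverse-runtime normalization are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def aggregate_gradients(grads, runtimes):
    """Combine per-node gradients, weighting each node by its relative
    run time. Hypothetical weighting; the paper's formula may differ.

    grads    : list of gradient arrays, one per node
    runtimes : list of wall-clock times (seconds) each node took
    """
    # Assumed scheme: faster nodes return fresher gradients, so each
    # node is weighted by the inverse of its run time, normalized so
    # the weights across all nodes sum to one.
    inv = np.array([1.0 / t for t in runtimes])
    weights = inv / inv.sum()
    return sum(w * g for w, g in zip(weights, grads))

# Example: three smartphone nodes with different speeds return
# gradients for the same parameter tensor; the slow node (3.5 s)
# is down-weighted relative to the others.
grads = [np.array([0.20, -0.10]),
         np.array([0.25, -0.05]),
         np.array([0.10, -0.20])]
runtimes = [1.2, 0.8, 3.5]
step = aggregate_gradients(grads, runtimes)
```

Under this assumed scheme, normalizing the weights keeps the effective step size independent of how many nodes report back, which matters when stragglers with widely varying hardware return results at different times.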