DistIR: An Intermediate Representation for Optimizing Distributed Neural Networks

Keshav Santhanam, Siddharth Krishna, Ryota Tomioka, A. Fitzgibbon, Tim Harris
DOI: 10.1145/3437984.3458829
Proceedings of the 1st Workshop on Machine Learning and Systems, published 2021-04-26
Citations: 6

Abstract

The rapidly growing size of deep neural network (DNN) models and datasets has given rise to a variety of distribution strategies such as data, horizontal, and pipeline parallelism. However, selecting the best set of strategies for a given model and hardware configuration is challenging because debugging and testing on clusters is expensive. In this work we propose DistIR, an IR for explicitly representing distributed DNN computation that can capture many popular distribution strategies. We build an analysis framework for DistIR programs, including a simulator and reference executor that can be used to automatically search for an optimal distribution strategy. Our unified global representation also eases development of new distribution strategies, as one can reuse the lowering to per-rank backend programs. Preliminary results using a grid search over a hybrid data/horizontal/pipeline-parallel space suggest DistIR and its simulator can aid automatic DNN distribution.
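For intuition, the sketch below illustrates the kind of search the abstract describes: enumerating hybrid data/horizontal/pipeline-parallel configurations that fill a cluster and ranking them by a simulator's predicted runtime. This is a hypothetical illustration, not DistIR's actual API; the `simulated_runtime` cost model, the degree grid, and the `WORLD_SIZE` constant are invented for illustration, standing in for the costs DistIR would obtain by simulating the lowered per-rank programs.

```python
import itertools

WORLD_SIZE = 8  # assumed total number of devices in the cluster


def simulated_runtime(d: int, h: int, p: int) -> float:
    """Toy cost model standing in for the DistIR simulator.

    Assumes compute time shrinks with total parallelism while each axis
    adds its own overhead: gradient all-reduce for data parallelism (d),
    activation collection for horizontal/tensor parallelism (h), and
    fill/drain bubbles for pipeline parallelism (p). All constants are
    made up for illustration.
    """
    compute = 100.0 / (d * h * p)   # perfectly divided compute time
    allreduce = 5.0 * (d - 1)       # gradient sync cost grows with d
    allgather = 3.0 * (h - 1)       # activation gather cost grows with h
    bubble = 2.0 * (p - 1)          # pipeline bubble cost grows with p
    return compute + allreduce + allgather + bubble


def grid_search(world_size: int) -> tuple[int, int, int]:
    """Return the (data, horizontal, pipeline) degrees with the lowest
    simulated runtime among configurations that exactly fill the cluster."""
    degrees = [1, 2, 4, 8]
    candidates = [
        (d, h, p)
        for d, h, p in itertools.product(degrees, repeat=3)
        if d * h * p == world_size
    ]
    return min(candidates, key=lambda c: simulated_runtime(*c))


if __name__ == "__main__":
    d, h, p = grid_search(WORLD_SIZE)
    print(f"best strategy: data={d}, horizontal={h}, pipeline={p}, "
          f"predicted runtime={simulated_runtime(d, h, p):.1f}")
```

A real search of this space would also prune configurations that exceed per-device memory and would obtain costs by simulating the actual DistIR program rather than from a closed-form model.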