{"title":"高分辨率图像的分布式训练:一种域和空间分解方法","authors":"A. Tsaris, Jacob D. Hinkle, D. Lunga, P. Dias","doi":"10.2172/1827010","DOIUrl":null,"url":null,"abstract":"In this work we developed two Pytorch libraries using the PyTorch RPC interface for distributed deep learning approaches on high resolution images. The spatial decomposition library allows for distributed training on very large images, which otherwise wouldn’t be possible on a single GPU. The domain parallelism library allows for distributed training across multiple domain unlabeled data, by leveraging the domain separation architecture. Both of those libraries where tested on the Summit supercomputer at Oak Ridge National Laboratory at a moderate scale.","PeriodicalId":119942,"journal":{"name":"2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA)","volume":null,"pages":null},"PeriodicalIF":0.0000,"publicationDate":"2021-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Distributed Training for High Resolution Images: A Domain and Spatial Decomposition Approach\",\"authors\":\"A. Tsaris, Jacob D. Hinkle, D. Lunga, P. Dias\",\"doi\":\"10.2172/1827010\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"In this work we developed two Pytorch libraries using the PyTorch RPC interface for distributed deep learning approaches on high resolution images. The spatial decomposition library allows for distributed training on very large images, which otherwise wouldn’t be possible on a single GPU. The domain parallelism library allows for distributed training across multiple domain unlabeled data, by leveraging the domain separation architecture. Both of those libraries where tested on the Summit supercomputer at Oak Ridge National Laboratory at a moderate scale.\",\"PeriodicalId\":119942,\"journal\":{\"name\":\"2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA)\",\"volume\":null,\"pages\":null},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2021-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.2172/1827010\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2021 IEEE/ACM Redefining Scalability for Diversely Heterogeneous Architectures Workshop (RSDHA)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.2172/1827010","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Distributed Training for High Resolution Images: A Domain and Spatial Decomposition Approach
In this work we developed two PyTorch libraries, built on the PyTorch RPC interface, for distributed deep learning on high-resolution images. The spatial decomposition library enables distributed training on very large images that would otherwise not fit on a single GPU. The domain parallelism library enables distributed training across unlabeled data from multiple domains by leveraging the domain separation architecture. Both libraries were tested at moderate scale on the Summit supercomputer at Oak Ridge National Laboratory.
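The abstract mentions that both libraries are built on the PyTorch RPC interface. The paper's actual code is not reproduced here, but the sketch below illustrates, under stated assumptions, how spatial decomposition over RPC might look: a large image is split into tiles on the master process, each tile is pushed to a remote worker for a forward pass, and the resulting features are stitched back together. The module and function names (TileEncoder, encode_tile) are hypothetical, and only the forward pass is shown; real training would additionally require distributed autograd and a distributed optimizer.

```python
import os
import torch
import torch.nn as nn
import torch.distributed.rpc as rpc


class TileEncoder(nn.Module):
    """Small CNN applied to one spatial tile on a remote worker (illustrative only)."""

    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU())

    def forward(self, tile):
        return self.net(tile)


# Module-level instance so remote workers can run it via plain-function RPC.
_encoder = TileEncoder()


def encode_tile(tile):
    # Executes on the callee worker; the result is shipped back to the caller.
    return _encoder(tile)


def run_master(world_size):
    image = torch.randn(1, 3, 512, 512)                # stand-in for a high-resolution image
    tiles = torch.chunk(image, world_size - 1, dim=3)  # split width-wise, one tile per worker
    futures = [
        rpc.rpc_async(f"worker{i + 1}", encode_tile, args=(t,))
        for i, t in enumerate(tiles)
    ]
    parts = [f.wait() for f in futures]
    features = torch.cat(parts, dim=3)                 # stitch tile features back together
    print(features.shape)


def main():
    # Assumes RANK, WORLD_SIZE, MASTER_ADDR, and MASTER_PORT are set,
    # e.g. by the job launcher on each process.
    rank = int(os.environ["RANK"])
    world_size = int(os.environ["WORLD_SIZE"])
    rpc.init_rpc("master" if rank == 0 else f"worker{rank}",
                 rank=rank, world_size=world_size)
    if rank == 0:
        run_master(world_size)
    rpc.shutdown()  # blocks until all outstanding RPCs drain


if __name__ == "__main__":
    main()
```

Launching one process per GPU (for example via a batch scheduler on a system like Summit) with consecutive ranks would let the master farm tiles out to the remaining ranks; this is only a conceptual sketch of the RPC pattern, not the libraries' API.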