Small is Beautiful: Distributed Orchestration of Spatial Deep Learning Workloads
Daniel Rammer, Kevin Bruhwiler, Paahuni Khandelwal, Samuel Armstrong, S. Pallickara, S. Pallickara
2020 IEEE/ACM 13th International Conference on Utility and Cloud Computing (UCC), December 2020
DOI: 10.1109/UCC48980.2020.00029
Citations: 2
Abstract
Several domains, such as agriculture, urban sustainability, and meteorology, entail processing satellite imagery for modeling and decision-making. In this study, we describe our novel methodology for training deep learning models over collections of satellite imagery. Deep learning models are computationally expensive and resource-intensive. As dataset sizes increase, there is a corresponding increase in the CPU, GPU, disk, and network I/O requirements to train models. Our methodology exploits spatial characteristics inherent in satellite data to partition, disperse, and orchestrate model training workloads. Rather than train a single, all-encompassing model, we produce an ensemble of models, each tuned to a particular spatial extent. We support query-based retrieval of targeted portions of satellite imagery, including portions that satisfy properties relating to cloud occlusion. We validate the suitability of our methodology by supporting deep learning models for multiple spatial analyses. Our approach is agnostic to the underlying deep learning library. Our extensive empirical benchmark demonstrates that our methodology not only preserves accuracy but also reduces completion times by 13.9x, reduces data movement costs by 4 orders of magnitude, and ensures frugal resource utilization.
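The sketch below illustrates the core idea described in the abstract: partition satellite image tiles by spatial extent, filter them with a query (here a cloud-occlusion threshold), and train one small model per extent so the result is an ensemble of spatially tuned models rather than a single global model. This is a minimal, hypothetical illustration, not the paper's implementation; the Tile fields, the geohash-prefix partitioning, and the train_local_model placeholder are assumptions introduced for clarity, and any deep learning library could be substituted inside the training step since the approach is library-agnostic.

```python
from collections import defaultdict
from concurrent.futures import ProcessPoolExecutor
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Tile:
    geohash: str        # spatial key for the tile, e.g. "9xj5b" (assumed field)
    cloud_cover: float  # fraction of pixels occluded by cloud, 0.0-1.0 (assumed field)
    pixels: list        # raster data (placeholder; any array type works)


def partition_by_extent(tiles: Iterable[Tile], precision: int = 4) -> Dict[str, List[Tile]]:
    """Group tiles by a geohash prefix so each group covers one spatial extent."""
    groups: Dict[str, List[Tile]] = defaultdict(list)
    for tile in tiles:
        groups[tile.geohash[:precision]].append(tile)
    return groups


def query_tiles(tiles: Iterable[Tile], max_cloud: float = 0.2) -> List[Tile]:
    """Query-based retrieval: keep only tiles under a cloud-occlusion threshold."""
    return [t for t in tiles if t.cloud_cover <= max_cloud]


def train_local_model(extent: str, tiles: List[Tile]) -> dict:
    """Placeholder for per-extent training with any deep learning library;
    here we only record how many tiles the model would have seen."""
    return {"extent": extent, "num_tiles": len(tiles)}


def train_ensemble(tiles: Iterable[Tile], max_cloud: float = 0.2, workers: int = 4) -> dict:
    """Orchestrate one training job per spatial extent and collect the ensemble."""
    partitions = partition_by_extent(query_tiles(tiles, max_cloud))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        futures = {extent: pool.submit(train_local_model, extent, group)
                   for extent, group in partitions.items()}
        return {extent: fut.result() for extent, fut in futures.items()}


if __name__ == "__main__":
    sample = [Tile("9xj5b", 0.05, []), Tile("9xj5c", 0.50, []), Tile("9xk21", 0.10, [])]
    print(train_ensemble(sample, workers=2))
```

In this toy run the tile with 50% cloud cover is filtered out by the query, and the remaining tiles fall into two spatial partitions, each of which gets its own (placeholder) model, mirroring the ensemble-of-models structure the paper describes.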