{"title":"Enhanced Partitioning of DNN Layers for Uploading from Mobile Devices to Edge Servers","authors":"K. Shin, H. Jeong, Soo-Mook Moon","doi":"10.1145/3325413.3329788","DOIUrl":null,"url":null,"abstract":"Offloading computations to servers is a promising method for resource constrained devices to run deep neural network (DNN). It often requires pre-installing DNN models at the server, which is not a valid assumption in an edge server environment where a client can offload to any nearby server, especially when it is on the move. So, the client needs to upload the DNN model on demand, but uploading the entire layers at once can seriously delay the offloading of the DNN queries due to its high overhead. IONN is a technique to partition the layers and upload them incrementally for fast start of offloading [1]. It partitions the DNN layers using the shortest path on a DNN execution graph between the client and the server based on a penalty factor for the uploading overhead. This paper proposes a new partition algorithm based on efficiency, which generates a more fine-grained uploading plan. Experimental results show that the proposed algorithm tangibly improves the query performance during uploading by as much as 55%, with faster execution of initially-raised queries.","PeriodicalId":164793,"journal":{"name":"The 3rd International Workshop on Deep Learning for Mobile Systems and Applications - EMDL '19","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-06-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"15","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"The 3rd International Workshop on Deep Learning for Mobile Systems and Applications - EMDL '19","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3325413.3329788","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
Citations: 15
Abstract
Offloading computations to servers is a promising way for resource-constrained devices to run deep neural networks (DNNs). It often requires pre-installing the DNN model at the server, an assumption that does not hold in an edge-server environment, where a client may offload to any nearby server, especially while on the move. The client therefore needs to upload the DNN model on demand, but uploading all layers at once can seriously delay the offloading of DNN queries due to the high transmission overhead. IONN is a technique that partitions the layers and uploads them incrementally so that offloading can start quickly [1]. It partitions the DNN layers by computing the shortest path on a DNN execution graph spanning the client and the server, with a penalty factor accounting for the uploading overhead. This paper proposes a new partitioning algorithm based on efficiency, which generates a more fine-grained uploading plan. Experimental results show that the proposed algorithm improves query performance during uploading by as much as 55%, executing initially-raised queries faster.
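To make the shortest-path formulation concrete, the sketch below models a linear DNN as a two-location dynamic program, which is equivalent to a shortest path through a small client/server execution graph with an upload-penalty term on server-side edges. This is a minimal illustration in the spirit of IONN, not the authors' implementation: all layer timings, the penalty value, and the `best_partition` helper are hypothetical.

```python
# Hedged sketch of shortest-path DNN partitioning with an upload penalty,
# in the spirit of IONN. All numbers and names below are illustrative
# assumptions, not the paper's measured values or actual code.

# Per-layer costs (milliseconds): execution time on the client, execution
# time on the server, time to upload the layer's weights to the server,
# and time to move the layer's input activations between devices.
LAYERS = [
    # (client_ms, server_ms, upload_ms, transfer_ms)  -- hypothetical
    (12.0, 2.0, 400.0, 1.5),
    (30.0, 4.0, 900.0, 1.0),
    (8.0,  1.0, 150.0, 0.5),
]

def best_partition(layers, penalty=0.5):
    """Return (estimated cost, placement), where placement[i] is 'client'
    or 'server'. A server-placed layer pays penalty * upload_ms, modeling
    how the uploading overhead is folded into the graph's edge weights.
    (The sketch omits transferring the final output back to the client.)"""
    INF = float("inf")
    # The query input starts on the client, so a 'server' start is invalid.
    cost = {"client": 0.0, "server": INF}
    choice = {"client": [], "server": []}
    for client_ms, server_ms, upload_ms, transfer_ms in layers:
        new_cost, new_choice = {}, {}
        for loc, run_ms, extra in (
            ("client", client_ms, 0.0),
            ("server", server_ms, penalty * upload_ms),
        ):
            best, best_prev = INF, "client"
            for prev in ("client", "server"):
                hop = transfer_ms if prev != loc else 0.0  # move activations
                c = cost[prev] + hop + run_ms + extra
                if c < best:
                    best, best_prev = c, prev
            new_cost[loc] = best
            new_choice[loc] = choice[best_prev] + [loc]
        cost, choice = new_cost, new_choice
    end = min(cost, key=cost.get)
    return cost[end], choice[end]

if __name__ == "__main__":
    total, plan = best_partition(LAYERS, penalty=0.5)
    print(f"estimated cost: {total:.1f} ms, placement: {plan}")
```

Raising the penalty makes server placement (and hence uploading) more expensive, pushing early queries toward client-side execution; lowering it favors the steady-state partition once the model is uploaded. The paper's efficiency-based algorithm refines this idea into a more fine-grained incremental uploading plan.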