Characterizing the Cost-Accuracy Performance of Cloud Applications

Workshop Proceedings of the 49th International Conference on Parallel Processing. Pub Date: 2020-08-17. DOI: 10.1145/3409390.3409409

Sunimal Rathnayake, Lavanya Ramapantulu, Y. M. Teo


Citations: 1

Abstract

The emergence of applications that produce results at different levels of accuracy lets cloud consumers exploit elastic cloud resources and the pay-per-use pricing model. However, the trade-off between the cost, accuracy and execution time of cloud applications has not been well studied due to multiple challenges. A key challenge for a cloud consumer is tuning the application and determining, within the configuration space, a cloud resource configuration that achieves the desired application accuracy. This paper proposes an approach to improve the cost-accuracy performance of cloud applications for a given cost and accuracy. To illustrate our approach, we use inference with two popular convolutional neural networks (CNNs) as examples, with pruning as the tuning tool for changing accuracy, and derive several insights. First, we show that multiple degrees of pruning act as “sweet-spots” where inference time and cost can be reduced without losing accuracy. Combining such sweet-spots can halve inference cost and time with a one-tenth reduction in accuracy for the Caffenet CNN. Second, we show that in the large resource configuration space, these “sweet-spots” form the cost-accuracy and time-accuracy Pareto frontiers, whereby a Pareto-optimal configuration can reduce cost and execution time by 55% and 50%, respectively, while achieving the highest possible inference accuracy. Lastly, to quantify the accuracy performance of cloud applications, we introduce the Time Accuracy Ratio (TAR) and Cost Accuracy Ratio (CAR) metrics. We show that using TAR and CAR reduces the time complexity of determining cloud resource configurations from exponential to polynomial time.
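The cost-accuracy and time-accuracy Pareto frontiers mentioned in the abstract can be illustrated with a minimal Python sketch. Everything concrete here is an assumption made for illustration: the `Config` record, the configuration names, and all numbers are made up, and the abstract names TAR and CAR without defining them, so the ratio forms `time / accuracy` and `cost / accuracy` below are guessed from the metric names and may differ from the paper's definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    """Hypothetical record: one pruning degree on one resource configuration."""
    name: str
    cost: float      # made-up cost units
    time: float      # made-up execution time in seconds
    accuracy: float  # made-up inference accuracy in (0, 1]


def pareto_front(configs, metric):
    """Return configs not dominated on (lower `metric`, higher accuracy).

    Sort by the metric ascending (ties: higher accuracy first), then keep a
    config only if it is strictly more accurate than every cheaper one.
    """
    front = []
    ordered = sorted(configs, key=lambda c: (getattr(c, metric), -c.accuracy))
    for c in ordered:
        if not front or c.accuracy > front[-1].accuracy:
            front.append(c)
    return front


def car(c):
    """Assumed form of Cost Accuracy Ratio: cost per unit of accuracy."""
    return c.cost / c.accuracy


def tar(c):
    """Assumed form of Time Accuracy Ratio: time per unit of accuracy."""
    return c.time / c.accuracy


# Made-up sample points; not measurements from the paper.
CONFIGS = [
    Config("unpruned-small", cost=1.00, time=10.0, accuracy=0.57),
    Config("pruned20-small", cost=0.80, time=8.0, accuracy=0.57),
    Config("pruned50-small", cost=0.45, time=5.0, accuracy=0.51),
    Config("pruned50-large", cost=0.60, time=3.0, accuracy=0.51),
    Config("pruned80-small", cost=0.40, time=4.5, accuracy=0.30),
]

cost_front = [c.name for c in pareto_front(CONFIGS, "cost")]
time_front = [c.name for c in pareto_front(CONFIGS, "time")]
best_car = min(CONFIGS, key=car).name
best_tar = min(CONFIGS, key=tar).name

if __name__ == "__main__":
    print("cost-accuracy frontier:", cost_front)
    print("time-accuracy frontier:", time_front)
    print("lowest CAR:", best_car, "| lowest TAR:", best_tar)
```

In this toy data, `pruned20-small` plays the role of a sweet-spot: it matches the accuracy of the unpruned model at lower cost and time, so the unpruned configuration drops off both frontiers. A single pass over per-configuration ratios, as in the `min` calls, is one way a scalar metric could avoid enumerating combinations, though the paper's actual search procedure is not described in the abstract.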


