A. Reddy, D. Moghe, Manik Taneja, Roger Liao, Subin Francis
Title: Novel Abstraction and Offload Mechanisms for High Performance Cloud-native Distributed Object Stores
Published in: 2022 IEEE International Conference on Cloud Engineering (IC2E), 2022-09-01
DOI: 10.1109/IC2E55432.2022.00024 (https://doi.org/10.1109/IC2E55432.2022.00024)
Citations: 0
Abstract
Object storage solutions are typically optimised for capacity and cost, while performance has traditionally been an afterthought. We make the case for a highly performant distributed object store by a) building an abstraction layer that passes immutability and data-affinity hints to the underlying storage, and b) making the Objects layer aware of the hardware configuration, enabling the object storage controllers to maximise throughput. With these optimisations, we show that object storage performance can approach 95% of the maximum achievable with the underlying raw storage, while the abstractions remain generic enough to run on any general-purpose, off-the-shelf storage system. We have observed these performance gains in more than 1000 customer environments on diverse hardware. We also extend these optimisations with generic mechanisms for offloading compute closer to storage, which significantly benefit a broad class of workloads. Specifically, we evaluate the performance gains from a) well-known constructs such as S3 Select for analytics workloads and b) generic compute offload such as Objects Lambda. This ability to offload compute is critical for modern distributed workloads, such as AI/ML and analytics processing, that operate on very large distributed data sets.
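
The compute-offload idea behind constructs like S3 Select can be illustrated with a minimal, self-contained sketch. This is not the authors' implementation and does not call any real S3 API: the in-memory `STORE`, the functions `get_object` and `select_object`, and the sample data are all hypothetical stand-ins. The sketch only shows why filtering next to the data reduces the bytes a client has to pull across the network.

```python
import csv
import io

# Hypothetical in-memory "object store": key -> CSV bytes (illustrative data only).
STORE = {
    "sales.csv": b"region,amount\nwest,10\neast,25\nwest,40\neast,5\n",
}


def get_object(key):
    """Plain GET: the whole object is returned to the client."""
    return STORE[key]


def select_object(key, column, value):
    """Offloaded filter (hypothetical): runs next to the data and
    returns only the header plus the rows where row[column] == value."""
    reader = csv.DictReader(io.StringIO(STORE[key].decode("utf-8")))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        if row[column] == value:
            writer.writerow(row)
    return out.getvalue().encode("utf-8")


def parse_rows(payload):
    """Decode a CSV payload into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(payload.decode("utf-8"))))


# Client-side filtering: transfer everything, then filter locally.
full = get_object("sales.csv")
client_rows = [r for r in parse_rows(full) if r["region"] == "west"]

# Storage-side filtering: only matching rows are transferred.
filtered = select_object("sales.csv", "region", "west")
offload_rows = parse_rows(filtered)

print(f"client-side transfer: {len(full)} bytes")
print(f"offloaded transfer:   {len(filtered)} bytes")
```

Both paths yield the same rows; the offloaded path simply moves fewer bytes. In a real deployment the saving depends on the selectivity of the filter and on the object size.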