CIDR: A Cost-Effective In-Line Data Reduction System for Terabit-Per-Second Scale SSD Arrays

2019 IEEE International Symposium on High Performance Computer Architecture (HPCA) Pub Date : 2019-02-01 DOI:10.1109/HPCA.2019.00025

M. Ajdari, Pyeongsu Park, Joonsung Kim, Dongup Kwon, Jang-Hyun Kim


Citations: 23

Abstract

An SSD array, a storage system consisting of multiple SSDs per node, has become a common design choice for implementing a fast primary storage system, and modern storage architects now aim to achieve terabit-per-second scale performance with the next-generation SSD array. To reduce the storage cost and improve device endurance, such an SSD array must employ data reduction schemes (i.e., deduplication and compression) that provide high data reduction capability at minimum cost. However, existing data reduction schemes do not scale with the fast-increasing performance of an SSD array, because they require a prohibitive amount of CPU resources (e.g., software-based schemes), achieve a low data reduction ratio (e.g., SSD device-wide deduplication), or are not cost-effective in addressing workload changes in datacenters (e.g., ASIC-based acceleration). In this paper, we propose CIDR, a novel FPGA-based, cost-effective data reduction system that enables an SSD array to achieve terabit-per-second scale storage performance. Our key ideas are as follows. First, we decouple data reduction-related computing tasks from the unscalable host CPUs by offloading them to a scalable array of FPGA boards. Second, we employ a centralized, node-wide metadata management scheme to achieve high, SSD array-wide data reduction. Third, our FPGA-based reconfiguration adapts to different workload patterns by dynamically balancing the amount of software and hardware tasks running on CPUs and FPGAs, respectively. For evaluation, we built an example CIDR prototype that achieves up to 12.8 GB/s (0.1 Tbps) on one FPGA. CIDR outperforms the baseline by up to 2.47x for a write-only workload and by an expected 3.2x for a mixed read-write workload. We show CIDR's scalability to Tbps-scale performance by measuring a two-FPGA CIDR and projecting the performance impact of adding more FPGAs.

Keywords: deduplication; compression; FPGA; SSD array
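To make the second key idea concrete, the following is a minimal software sketch of in-line data reduction (fingerprint-based deduplication followed by compression). It is not CIDR's implementation: the 4 KiB chunk size, the use of SHA-256 and zlib, and the names `ReducingStore` and `stored_bytes` are all assumptions chosen for illustration. It only shows why one shared metadata index can remove duplicates that separate per-device indexes cannot see.

```python
import hashlib
import zlib

CHUNK_SIZE = 4096  # assumed chunk size; the abstract does not state one


class ReducingStore:
    """Toy in-line data reduction: deduplicate by fingerprint, then compress."""

    def __init__(self):
        self.index = {}          # fingerprint -> compressed chunk (metadata table)
        self.logical_bytes = 0   # bytes the client asked to write
        self.physical_bytes = 0  # bytes actually stored after reduction

    def write(self, data: bytes) -> list:
        """Store data and return its list of chunk fingerprints."""
        recipe = []
        for off in range(0, len(data), CHUNK_SIZE):
            chunk = data[off:off + CHUNK_SIZE]
            fp = hashlib.sha256(chunk).digest()
            self.logical_bytes += len(chunk)
            if fp not in self.index:  # unique chunk: compress and store it
                packed = zlib.compress(chunk)
                self.index[fp] = packed
                self.physical_bytes += len(packed)
            recipe.append(fp)
        return recipe

    def read(self, recipe) -> bytes:
        """Rebuild the original data from a list of fingerprints."""
        return b"".join(zlib.decompress(self.index[fp]) for fp in recipe)


def stored_bytes(num_indexes: int, writes) -> int:
    """Route writes round-robin to independent stores; return total physical bytes."""
    stores = [ReducingStore() for _ in range(num_indexes)]
    for i, data in enumerate(writes):
        stores[i % num_indexes].write(data)
    return sum(s.physical_bytes for s in stores)


if __name__ == "__main__":
    block = bytes(range(256)) * 16  # one 4 KiB block
    writes = [block] * 4            # the same block written four times
    print("one shared index:     ", stored_bytes(1, writes), "bytes stored")
    print("four separate indexes:", stored_bytes(4, writes), "bytes stored")
```

In this toy setup, four identical writes routed to four independent indexes are each stored once per index, while a single shared index stores the block only once. A real system would also need collision handling, persistence, and garbage collection, none of which are modeled here.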




Recent papers from this venue

Machine Learning at Facebook: Understanding Inference at the Edge

Understanding the Future of Energy Efficiency in Multi-Module GPUs

POWERT Channels: A Novel Class of Covert Communication Exploiting Power Management Vulnerabilities

The Accelerator Wall: Limits of Chip Specialization

Featherlight Reuse-Distance Measurement