2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA): Latest Articles

Improving SSD Performance Using Adaptive Restricted-Copyback Operations

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-20 DOI: 10.1109/NVMSA.2019.8863524

Duwon Hong, Myungsuk Kim, Jisung Park, Myoungsoo Jung, Jihong Kim

Copyback operations can improve the performance of data migrations in SSDs, but they are rarely used because they propagate errors. In this paper, we propose an integrated approach that maximizes the efficiency of copyback operations without compromising data reliability. First, we propose a novel per-block error propagation model under consecutive copyback operations. Our model significantly increases the number of successive copybacks by exploiting the aging characteristics of NAND blocks. Second, we devise a resource-efficient error management scheme that can handle successive copybacks in which pages move among multiple blocks with different reliability. Experimental results show that the proposed technique can improve I/O throughput by up to 25% over the existing technique.

Citations: 7

NVMSA 2019 Message from the General Co-Chairs

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/nvmsa.2019.8863512

Citations: 0

Host-Level Workload-Aware Budget Compensation I/O Scheduling for Open-Channel SSDs

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/NVMSA.2019.8863515

Sooyun Lee, Kyuhwa Han, Dongkun Shin

In datacenters and cloud computing, Quality of Service (QoS) is an essential concept because access to shared resources, including solid-state drives (SSDs), must be ensured. The previously proposed workload-aware budget compensation (WA-BC) scheduling algorithm is a device I/O scheduler that guarantees performance isolation among multiple virtual machines sharing an SSD. This paper aims to resolve three shortcomings of WA-BC: (1) it is applicable only to SSDs that support SR-IOV, (2) it is unfit for various types of workloads, and (3) it manages flash memory blocks separately in an inappropriate manner. We propose the host-level WA-BC (hWA-BC) scheduler, which aims to achieve performance isolation between multiple processes sharing an open-channel SSD.

Citations: 1

Performance Evaluation on NVMM Emulator Employing Fine-Grain Delay Injection

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/NVMSA.2019.8863522

Yusuke Omori, K. Kimura

The emerging technology of byte-addressable nonvolatile memory chips is expected to enable larger main memory and lower power consumption than traditional DRAM. It also enables durable data structures without ordinary file systems. However, alongside the advantages of nonvolatile main memory (NVMM), its higher write latency and higher energy consumption in comparison with DRAM must be considered. These special characteristics of NVMM require new compiler techniques and OS support as well as new memory architectures. Several NVMM emulators built on real machines have been proposed to facilitate such software and hardware research. Their designs were originally based on a simple coarse-grain delay model that injected additional clock cycles into the read and write requests sent to the memory controller. However, they could not exploit bank-level parallelism and row-buffer access locality, which today's memory modules rely on for performance. Therefore, a fine-grain delay model was recently proposed in which the delay is injected for the primitive memory operations issued by the memory controller. In this paper, we implement both the coarse-grain and the fine-grain delay models on an SoC-FPGA board, together with Linux kernel modifications and several runtime functions. Then, the differences in program behavior between the two models are evaluated with SPEC CPU programs. The fine-grain model reveals that program execution time is influenced by the frequency of NVMM memory requests rather than the cache hit ratio. Bank-level parallelism and row-buffer access locality also affect the memory access delay, and the fine-grain model shows lower execution time than the coarse-grain model for four of fourteen programs, even when the former has longer total write latency.
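
The difference between the two delay models can be illustrated with a toy calculation. The sketch below is not the authors' emulator: the latency constants, the `Request` type, and the choice of "row activate" and "dirty-row write-back" as the primitive operations are all assumptions made for illustration, and delays are summed serially, so bank-level parallelism is not modeled.

```python
from dataclasses import dataclass

# Hypothetical latencies in nanoseconds, chosen only for illustration.
EXTRA_READ_NS = 50    # coarse-grain: added to every read request
EXTRA_WRITE_NS = 300  # coarse-grain: added to every write request
ACTIVATE_NS = 40      # fine-grain: added when a row is opened in a bank
WRITEBACK_NS = 290    # fine-grain: added when a dirty row is closed


@dataclass(frozen=True)
class Request:
    bank: int
    row: int
    is_write: bool


def coarse_grain_delay(requests):
    """Inject a fixed extra delay per request, ignoring row-buffer state."""
    return sum(EXTRA_WRITE_NS if r.is_write else EXTRA_READ_NS for r in requests)


def fine_grain_delay(requests):
    """Inject delay only on the assumed primitive operations.

    A row-buffer hit costs nothing extra; a miss pays for opening the new
    row and, if the old row was written, for writing it back first.
    """
    open_rows = {}  # bank -> (row, dirty)
    total = 0
    for r in requests:
        row, dirty = open_rows.get(r.bank, (None, False))
        if row != r.row:
            if dirty:
                total += WRITEBACK_NS
            total += ACTIVATE_NS
            dirty = False
        open_rows[r.bank] = (r.row, dirty or r.is_write)
    # Rows that are still dirty at the end must eventually be written back.
    total += sum(WRITEBACK_NS for _, dirty in open_rows.values() if dirty)
    return total
```

With these assumed constants, four writes to the same row cost 1200 ns under the coarse-grain model but only 330 ns under the fine-grain one, while four writes alternating between two rows of one bank cost 1320 ns under the fine-grain model. This is the kind of locality effect a per-request delay cannot capture.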

Citations: 3

Optimizing Cauchy Reed-Solomon Coding via ReRAM Crossbars in SSD-based RAID Systems

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/NVMSA.2019.8863519

Lei Han, Shangzhen Tan, Bin Xiao, Chenlin Ma, Z. Shao

Erasure codes such as Cauchy Reed-Solomon codes have been gaining ever-increasing importance for fault tolerance in SSD-based RAID arrays. However, erasure coding on a processor-based RAID controller relies on Galois field arithmetic to perform matrix-vector multiplication, which increases the computational complexity and leads to a huge number of memory accesses. In this paper, we investigate utilizing ReRAM to improve erasure coding performance. We propose Re-RAID, which uses ReRAM as main memory in both RAID and SSD controllers, so that erasure coding can be processed on ReRAM. We also propose a confluent Cauchy-Vandermonde matrix as the generator matrix for encoding. By doing this, Re-RAID can distribute the reconstruction tasks for a single failure to SSDs, and the SSDs can then recover the data with ReRAM memory. Experimental results show that we can improve the encoding and decoding performance by up to 598× and 251×, respectively.
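
To make the processor-side cost concrete, the following is a minimal sketch of parity computation as a matrix-vector product over a Galois field. It assumes GF(2^8) with the common reduction polynomial 0x11D and a plain Cauchy matrix with x_i = i and y_j = m + j; it does not reproduce the confluent Cauchy-Vandermonde matrix or the ReRAM mapping proposed in the paper, and all function names are hypothetical.

```python
# Assumed field: GF(2^8) with reduction polynomial x^8 + x^4 + x^3 + x^2 + 1.
PRIM_POLY = 0x11D


def gf_mul(a: int, b: int) -> int:
    """Multiply two field elements (shift-and-add with modular reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result


def gf_inv(a: int) -> int:
    """Multiplicative inverse as a^254, since a^255 == 1 for nonzero a."""
    if a == 0:
        raise ZeroDivisionError("0 has no inverse in GF(2^8)")
    result = 1
    for _ in range(254):
        result = gf_mul(result, a)
    return result


def cauchy_matrix(m: int, k: int) -> list[list[int]]:
    """m x k Cauchy matrix with entries 1 / (x_i + y_j); needs m + k <= 256."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]


def encode(data: list[int], m: int) -> list[int]:
    """Compute m parity symbols from the data symbols (one byte each)."""
    parity = []
    for row in cauchy_matrix(m, len(data)):
        acc = 0
        for coeff, symbol in zip(row, data):
            acc ^= gf_mul(coeff, symbol)  # addition in GF(2^8) is XOR
        parity.append(acc)
    return parity
```

Every parity symbol requires one field multiplication and one XOR per data symbol, and each multiplication is itself a loop of shifts and XORs, which is where the computation and memory traffic described in the abstract come from.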

Citations: 5

NVMSA 2019 Copyright Page

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/nvmsa.2019.8863509

Citations: 0

NVMSA 2019 TOC

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/nvmsa.2019.8863529

Citations: 0

Fair Down to the Device: A GC-Aware Fair Scheduler for SSD

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/NVMSA.2019.8863523

Cheng Ji, Lun Wang, Qiao Li, Congming Gao, Liang Shi, Chia-Lin Yang, C. Xue

Solid-state drives (SSDs) are the mainstream solution for massive data storage today. For modern computer systems, fair resource assignment is a critical design consideration and has drawn great interest in recent years. Although several I/O fairness schedulers have been proposed on the host side for SSDs, process fairness can still be dramatically degraded if garbage collection (GC) is triggered on the device side. A GC operation can block I/O requests, which causes unpredictable read/write latency variation and further impacts fairness between processes. This paper proposes Fair-GC, a novel coordinated host and device I/O scheduling strategy that achieves true fairness in the presence of GC interference. The key idea is to orchestrate GC operations inside SSDs carefully so that the performance of a process is penalized by GC to the same (or a comparable) degree as when it runs alone. In this way, the I/O fairness maintained by the host-side scheduler is preserved in the presence of GC. Furthermore, our scheduler ensures that the timeslice of a process maintained at the host-side scheduler is updated in a timely manner, avoiding unnecessary slowdown for the sake of fairness. Experimental results with a wide range of workloads verify that the proposed technique can achieve fairness as well as improve throughput significantly. Compared to a conventional fairness-based I/O scheduler, Fair-GC can reduce the slowdown of real applications by up to 99% and improve throughput by as much as 225%.

Citations: 2

A Memristor-based Scan Hold Flip-Flop

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/NVMSA.2019.8863517

Aijiao Cui, Zhenxing Chang, Ziming Wang, G. Qu, Huawei Li

Scan-based design-for-testability (DfT) has been widely adopted in modern integrated circuit (IC) design to facilitate manufacturing testing. However, transitions in scan cells consume considerable power during testing. The scan hold flip-flop (SHFF) can isolate transitions in the scan chain from the circuit under test to reduce test power, but it incurs considerable area overhead. We propose to solve this problem by adopting a memristor-based D flip-flop (DFF) in the SHFF. The new design departs from the structure of conventional CMOS scan cells and uses memristors in the SHFF to reduce the number of transistors and hence the chip area. The functionality of the proposed design is verified by HSPICE simulation. Compared with conventional SHFF cells, the area overhead is reduced by 26.5%.

Citations: 2

fsync-aware Multi-Buffer FTL for Improving the fsync Latency with Open-Channel SSDs

2019 IEEE Non-Volatile Memory Systems and Applications Symposium (NVMSA)

Pub Date : 2019-08-01 DOI: 10.1109/NVMSA.2019.8863514

Somm Kim, Yunji Kang, Dongkun Shin

Open-channel SSDs are widely studied because of advantages such as predictable latency, efficient data placement, and I/O scheduling. Currently, the Linux kernel includes pblk (the Physical Block Device), a host FTL that supports open-channel SSDs. In addition, recent studies, MT-FTL and QBLK, extend the single-threaded architecture of pblk to a multi-threaded architecture. However, both pblk and these studies were designed without considering fsync latency. Because the fsync system call is performed synchronously, it has a great effect on system performance. In this paper, we propose FA-FTL, a host FTL that considers fsync latency. Experiments show that FA-FTL is 141% higher than pblk and 119% higher than MT-FTL.

Citations: 1
