Barrier Counting in Mixed Wireless Sensor Networks
Shambhavi Srinivasa, C. Williamson, Zongpeng Li. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.48

Barrier coverage problems in sensor networks involve detecting intruders that attempt to cross a region of interest. In this paper, we formulate the k-connect barrier count problem for Mixed Sensor Networks (MSNs): finding the maximum number of barriers in an arbitrary MSN when at most k distinct mobile sensors can be used to construct any given virtual edge of a barrier. We solve the k-connect barrier count problem for k ∈ {0, 1, 2} via Integer Linear Programming. Simulation results show that as k increases, the sensor density required to achieve barrier coverage decreases, quantitatively demonstrating the benefits of mobile sensors.
{"title":"Barrier Counting in Mixed Wireless Sensor Networks","authors":"Shambhavi Srinivasa, C. Williamson, Zongpeng Li","doi":"10.1109/MASCOTS.2012.48","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.48","url":null,"abstract":"Barrier coverage problems in sensor networks involve detecting intruders that attempt to cross a region of interest. In this paper, we formulate the k-connect barrier count problem for Mixed Sensor Networks (MSNs). The k-connect barrier count problem is to find the maximum number of barriers in an arbitrary MSN where at most k distinct mobile sensors can be used to construct any given virtual edge used in a barrier. We present the solution for the k-connect barrier count problem for k ∈ {0, 1, 2} via Integer Linear Programming. Using simulation results, we show that as k increases, the density of sensors required to achieve barrier coverage decreases. The results quantitatively demonstrate the benefits of mobile sensors.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130073277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analyzing Parallelization and Program Performance in Heterogeneous MPSoCs
Chao Wang, Xi Li, Junneng Zhang, Gangyong Jia, Peng Chen, Xuehai Zhou. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.61

In this paper we extend Amdahl's law to the era of general heterogeneous MPSoCs, to determine how speedup is affected by key parameters: the number and speedup of microprocessors and accelerators, as well as the task-partitioning characteristics. We also analyze, theoretically, how the extended Amdahl's law can be applied to balance load on a heterogeneous MPSoC without the abstraction of base core equivalents (BCEs). A prototype is constructed on an FPGA with MicroBlaze processors and JPEG hardware accelerators. The experimental results demonstrate that our extended model reinforces state-of-the-art performance evaluation methods for hybrid MPSoC architectures and provides credible new insights for the heterogeneous-computing research community, in particular for scalable FPGA-based reconfigurable MPSoCs.
{"title":"Analyzing Parallelization and Program Performance in Heterogeneous MPSoCs","authors":"Chao Wang, Xi Li, Junneng Zhang, Gangyong Jia, Peng Chen, Xuehai Zhou","doi":"10.1109/MASCOTS.2012.61","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.61","url":null,"abstract":"In this paper we extend and analyze Amdahl's law to general heterogeneous MPSoC era, to find out how the speedup is affected by the parameters, including amount and speedup for microprocessors and accelerators, as well as the task partition characteristics. We also analyze the theoretical results about how the extended Amdahl's Law is applied to leverage load balancing of a heterogeneous MPSoC without the abstract limitation of base core equivalents (BCEs). A prototype on FPGA is constructed with Microblaze processors and JPEG hardware accelerators. The experimental results demonstrate that our extended model reinforces state-of-the-art performance evaluation methods for hybrid MPSoC architectures and also provide creditable new insights on the heterogeneous research communities, in particular for scalable FPGA based reconfigurable MPSoCs.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114832880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing the ns-3 Propagation Models
Mirko Stoffers, G. Riley. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.17

An important aspect of any network simulation that models wireless networks is the design and implementation of the propagation loss model. The propagation loss model determines the wireless signal strength at the set of receivers for any packet transmitted by a single transmitter. There are a number of different ways to model this phenomenon, and they vary both in computational complexity and in the measured performance of the wireless network being modeled. In fact, the ns-3 simulator presently includes 11 different loss models in its library. We performed a detailed study of these models, comparing them both in terms of the computational complexity of the algorithms and the measured performance of the simulated wireless network. The results of these simulation experiments are reported and discussed. Not surprisingly, we observed considerable variation in both metrics.
{"title":"Comparing the ns-3 Propagation Models","authors":"Mirko Stoffers, G. Riley","doi":"10.1109/MASCOTS.2012.17","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.17","url":null,"abstract":"An important aspect of any network simulation that models wireless networks is the design and implementation of the Propagation Loss Model. The propagation loss model is used to determine the wireless signal strength at the set of receivers for any packet being transmitted by a single transmitter. There are a number of different ways to model this phenomenon, and these vary both in terms of computational complexity and in the measured performance of the wireless network being modeled. In fact, the ns -- 3 simulator presently has 11 different loss models included in the simulator library. We performed a detailed study of these models, comparing their overall performance both in terms of the computational complexity of the algorithms, as well as the measured performance of the wireless network being simulated. The results of these simulation experiments are reported and discussed. Not surprisingly, we observed considerable variation in both metrics.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129637859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Scalable Algorithm for Placement of Virtual Clusters in Large Data Centers
A. Tantawi. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.11

We consider the problem of placing virtual clusters, each consisting of a set of heterogeneous virtual machines (VMs) with interrelationships arising from communication needs and other dependability-induced constraints, onto physical machines (PMs) in a large data center. The placement of such constrained, networked virtual clusters, spanning compute, storage, and networking resources, is challenging. The size of the problem forces one to resort to approximate, heuristics-based optimization techniques. We introduce a statistical approach based on importance sampling (also known as cross-entropy) to solve this placement problem. A straightforward implementation of such a technique proves inefficient. We considerably enhance the method by biasing the sampling process to incorporate the communication needs and other constraints of requests, yielding an efficient algorithm that is linear in the size of the data center. We investigate the quality of the results of our algorithm on a simulated system, where we study the effects of various parameters on the solution and on the performance of the algorithm.
{"title":"A Scalable Algorithm for Placement of Virtual Clusters in Large Data Centers","authors":"A. Tantawi","doi":"10.1109/MASCOTS.2012.11","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.11","url":null,"abstract":"We consider the problem of placing virtual clusters, each consisting of a set of heterogeneous virtual machines (VM) with some interrelationships due to communication needs and other dependability-induced constraints, onto physical machines (PM) in a large data center. The placement of such constrained, networked virtual clusters, including compute, storage, and networking resources is challenging. The size of the problem forces one to resort to approximate and heuristics-based optimization techniques. We introduce a statistical approach based on importance sampling (also known as cross-entropy) to solve this placement problem. A straightforward implementation of such a technique proves inefficient. We considerably enhance the method by biasing the sampling process to incorporate communication needs and other constraints of requests to yield an efficient algorithm that is linear in the size of the data center. We investigate the quality of the results of using our algorithm on a simulated system, where we study the effects of various parameters on the solution and performance of the algorithm.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125533011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scheduling in Flash-Based Solid-State Drives - Performance Modeling and Optimization
W. Bux, Xiao-Yu Hu, I. Iliadis, R. Haas. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.58

In this paper, we study the performance of solid-state drives that employ flash technology as the storage medium. Our prime objective is to understand how the scheduling of user-generated read and write commands, together with the read, write, and erase operations induced by the garbage-collection process, affects the basic performance measures of throughput and latency. We demonstrate that the most straightforward scheduling, which prioritizes garbage-collection-related commands over user-related commands, suffers from severe latency deficiencies. These problems can be overcome by a more sophisticated priority scheme that minimizes user-perceived latency without throughput penalty or deadlock exposure. Using both analysis and simulation, we investigate how these schemes perform under a variety of system design parameters and workloads. Our results can be directly applied to the engineering of a performance-optimized solid-state-drive system.
{"title":"Scheduling in Flash-Based Solid-State Drives - Performance Modeling and Optimization","authors":"W. Bux, Xiao-Yu Hu, I. Iliadis, R. Haas","doi":"10.1109/MASCOTS.2012.58","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.58","url":null,"abstract":"In this paper, we study the performance of solid-state drives that employ flash technology as storage medium. Our prime objective is to understand how the scheduling of the user-generated read and write commands and the read, write, and erase operations induced by the garbage-collection process affect the basic performance measures throughput and latency. We demonstrate that the most straightforward scheduling that prioritizes the processing of garbage-collection-related commands over user-related commands suffers from severe latency deficiencies. These problems can be overcome by using a more sophisticated priority scheme that minimizes the user-perceived latency without throughput penalty or deadlock exposure. Using both analysis and simulation, we investigate how these schemes perform under a variety of system design parameters and workloads. Our results can be directly applied to the engineering of a performance-optimized solid-state-drive system.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133852309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning-Based Self-Adjusting Concurrency in Software Transactional Memory Systems
Diego Rughetti, P. D. Sanzo, B. Ciciani, F. Quaglia. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.40
One of the problems of Software Transactional Memory (STM) systems is the performance degradation experienced when applications run at a non-optimal concurrency level, namely the number of concurrent threads. When this level is too high, performance may suffer due to excessive data contention and the resulting transaction aborts. Conversely, if concurrency is too low, performance may be penalized by limited parallelism and under-exploitation of the available resources. In this paper we propose a machine-learning-based approach that enables STM systems to predict their performance as a function of the number of concurrent threads, in order to dynamically select the optimal concurrency level throughout the lifetime of the application. In our approach, the STM is coupled with a neural network and an online control algorithm that activates or deactivates application threads to maximize performance, selecting the most adequate concurrency level as a function of the current data-access profile. We also present a real implementation of our proposal within the open-source TinySTM package and an experimental study based on the STAMP benchmark suite. The experimental data confirm that our self-adjusting concurrency scheme consistently provides optimal performance, avoiding the performance-loss phases caused by poorly chosen thread counts and the phenomena described above.
{"title":"Machine Learning-Based Self-Adjusting Concurrency in Software Transactional Memory Systems","authors":"Diego Rughetti, P. D. Sanzo, B. Ciciani, F. Quaglia","doi":"10.1109/MASCOTS.2012.40","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.40","url":null,"abstract":"One of the problems of Software-Transactional-Memory (STM) systems is the performance degradation that can be experienced when applications run with a non-optimal concurrency level, namely number of concurrent threads. When this level is too high a loss of performance may occur due to excessive data contention and consequent transaction aborts. Conversely, if concurrency is too low, the performance may be penalized due to limitation of both parallelism and exploitation of available resources. In this paper we propose a machine-learning based approach which enables STM systems to predict their performance as a function of the number of concurrent threads in order to dynamically select the optimal concurrency level during the whole lifetime of the application. In our approach, the STM is coupled with a neural network and an on-line control algorithm that activates or deactivates application threads in order to maximize performance via the selection of the most adequate concurrency level, as a function of the current data access profile. A real implementation of our proposal within the TinySTM open-source package and an experimental study relying on the STAMP benchmark suite are also presented. The experimental data confirm how our self-adjusting concurrency scheme constantly provides optimal performance, thus avoiding performance loss phases caused by non-suited selection of the amount of concurrent threads and associated with the above depicted phenomena.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129504140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Solving the TCP-Incast Problem with Application-Level Scheduling
Maxim Podlesny, C. Williamson. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.21

Data center networks are characterized by high link speeds, low propagation delays, small switch buffers, and temporally clustered arrivals of many concurrent TCP flows fulfilling data transfer requests. However, the combination of these features can lead to transient buffer overflow and bursty packet losses, which in turn cause TCP retransmission timeouts that degrade the performance of short-lived flows. This so-called TCP-incast problem can cause TCP throughput collapse. In this paper, we explore an application-level approach to solving this problem. The key idea of our solution is to coordinate the scheduling of short-lived TCP flows so that no data loss occurs. We develop a mathematical model of lossless data transmission, and estimate the maximum goodput achievable in data center networks. The results indicate non-monotonic goodput that is highly sensitive to specific parameter configurations in the data center network. We validate our model using ns-2 network simulations, which show good correspondence with the theoretical results.
{"title":"Solving the TCP-Incast Problem with Application-Level Scheduling","authors":"Maxim Podlesny, C. Williamson","doi":"10.1109/MASCOTS.2012.21","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.21","url":null,"abstract":"Data center networks are characterized by high link speeds, low propagation delays, small switch buffers, and temporally clustered arrivals of many concurrent TCP flows fulfilling data transfer requests. However, the combination of these features can lead to transient buffer overflow and bursty packet losses, which in turn lead to TCP retransmission timeouts that degrade the performance of short-lived flows. This so-called TCP-incast problem can cause TCP throughput collapse. In this paper, we explore an application-level approach for solving this problem. The key idea of our solution is to coordinate the scheduling of short-lived TCP flows so that no data loss happens. We develop a mathematical model of lossless data transmission, and estimate the maximum good put achievable in data center networks. The results indicate non-monotonic good put that is highly sensitive to specific parameter configurations in the data center network. We validate our model using ns-2 network simulations, which show good correspondence with the theoretical results.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131377384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hop Distance Analysis in Partially Connected Wireless Sensor Networks
Yun Wang, Brendan M. Kelly, Aimin Zhou. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.28

Network connectivity, as a fundamental issue in wireless sensor networks (WSNs), has received considerable attention during the past decade. Most prior work focuses on maintaining full connectivity while conserving network resources. However, full connectivity is actually a sufficient but not necessary condition for many WSNs to communicate and function successfully. In addition, full connectivity incurs high network cost, since more sensors are needed. Further, it is subject to high energy consumption and communication interference, as higher communication power may be needed to reach the most isolated sensors. In view of this, this work investigates hop distance in a randomly deployed WSN with partial network connectivity from modeling, analysis, and simulation perspectives. The results help in selecting critical network parameters for practical designs across diverse WSN applications.
{"title":"Hop Distance Analysis in Partially Connected Wireless Sensor Networks","authors":"Yun Wang, Brendan M. Kelly, Aimin Zhou","doi":"10.1109/MASCOTS.2012.28","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.28","url":null,"abstract":"Network connectivity, as a fundamental issue in a wireless sensor network(WSN), has been receiving considerable attention during the past decade. Most works focused on how to maintain full connectivity while conserving network resources. However, full connectivity is actually a sufficient but not necessary condition for many WSNs to communicate and function successfully. In addition, full connectivity requires high-demand in network cost as more sensors will be needed. Further, it is subject to high energy consumption and communication interference as higher communication power might be needed to connect the most isolated sensors. In view of this, this work investigates the hop distance in a randomly deployed WSN with partial network connectivity through modeling, analysis, and simulation perspectives. The results help in selecting critical network parameters for practical WSN designs of diverse WSN applications.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114326666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H-SWD: Incorporating Hot Data Identification into Shingled Write Disks
Chung-I Lin, Dongchul Park, Weiping He, D. Du. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.44

A shingled write disk (SWD) is a magnetic hard disk drive that adopts shingled magnetic recording (SMR) technology to overcome the areal density limit faced by conventional hard disk drives (HDDs). The SMR design enables SWDs to achieve two to three times the areal density that HDDs can reach, but it also prevents SWDs from supporting random writes and in-place updates without a performance penalty. In particular, an SWD must contend with random write/update interference, whereby writing to one track overwrites data previously stored on subsequent tracks. Prior research has proposed serving random writes/updates out-of-place to alleviate the performance degradation, at the cost of introducing garbage collection. However, none of these studies evaluate SWDs in terms of garbage collection performance. In this paper, we propose an SWD design called Hot data identification-based Shingled Write Disk (H-SWD). H-SWD adopts window-based hot data identification to effectively manage data in the hot bands and the cold bands, significantly reducing garbage collection overhead while preventing random write/update interference. Experimental results with various realistic workloads demonstrate that H-SWD outperforms the Indirection System. Specifically, incorporating simple hot data identification empowers the H-SWD design to remarkably improve garbage collection performance.
{"title":"H-SWD: Incorporating Hot Data Identification into Shingled Write Disks","authors":"Chung-I Lin, Dongchul Park, Weiping He, D. Du","doi":"10.1109/MASCOTS.2012.44","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.44","url":null,"abstract":"Shingled write disk (SWD) is a magnetic hard disk drive that adopts the shingled magnetic recording (SMR) technology to overcome the areal density limit faced in conventional hard disk drives (HDDs). The SMR design enables SWDs to achieve two to three times higher areal density than the HDDs can reach, but it also makes SWDs unable to support random writes/in-place updates with no performance penalty. In particular, a SWD needs to concern about the random write/update interference, which indicates writing to one track overwrites the data previously stored on the subsequent tracks. Some research has been proposed to serve random write/update out-of-place to alleviate the performance degradation at the cost of bringing in the concept of garbage collection. However, none of these studies investigate SWDs based on the garbage collection performance. In this paper, we propose a SWD design called Hot data identification-based Shingled Write Disk (H-SWD). The H-SWD adopts a window-based hot data identification to effectively manage data in the hot bands and the cold bands such that it can significantly reduce the garbage collection overhead while preventing the random write/update interference. The experimental results with various realistic workloads demonstrates that H-SWD outperforms the Indirection System. Specifically, incorporating a simple hot data identification empowers the H-SWD design to remarkably improve garbage collection performance.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114500022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extent Mapping Scheme for Flash Memory Devices
Young-Kyoon Suh, Bongki Moon, A. Efrat, Jin-Soo Kim, Sang-Won Lee. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.45
Flash memory devices commonly rely on traditional address mapping schemes such as page mapping, block mapping, or a hybrid of the two. Page mapping is more flexible than block mapping or hybrid mapping because it is not restricted by block boundaries; however, its mapping table tends to grow quickly as the capacity of flash memory devices grows. To overcome this limitation, we propose a novel mapping scheme that is fundamentally different from existing mapping strategies. We call this new scheme the Virtual Extent Trie (VET), as it manages mapping information by treating each I/O request as an extent and using extents as the basic mapping units, rather than pages or blocks. By storing extents instead of individual addresses, VET consumes much less memory for mapping information while remaining as flexible as page mapping. In our experiments, VET reduced memory consumption by up to an order of magnitude compared with the traditional mapping schemes on several real-world workloads, and it scaled well with increasing address spaces under synthetic workloads. With a binary search mechanism, VET limits the mapping time to O(log log |U|), where U denotes the set of all possible logical addresses. Though this asymptotic cost is higher than the O(1) time of a page mapping scheme, the added overhead was almost negligible, or low enough to be hidden by the accompanying I/O operation.
{"title":"Extent Mapping Scheme for Flash Memory Devices","authors":"Young-Kyoon Suh, Bongki Moon, A. Efrat, Jin-Soo Kim, Sang-Won Lee","doi":"10.1109/MASCOTS.2012.45","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.45","url":null,"abstract":"Flash memory devices commonly rely on traditional address mapping schemes such as page mapping, block mapping or a hybrid of the two. Page mapping is more flexible than block mapping or hybrid mapping without being restricted by block boundaries. However, its mapping table tends to grow large quickly as the capacity of flash memory devices does. To overcome this limitation, we propose a novel mapping scheme that is fundamentally different from the existing mapping strategies. We call this new scheme Virtual Extent Trie (VET), as it manages mapping information by treating each I/O request as an extent and by using extents as basic mapping units rather than pages or blocks. By storing extents instead of individual addresses, VET consumes much less memory to store mapping information and still remains as flexible as page mapping. We observed in our experiments that VET reduced memory consumption by up to an order of magnitude in comparison with the traditional mapping schemes for several real world workloads. The VET scheme also scaled well with increasing address spaces by synthetic workloads. With a binary search mechanism, VET limits the mapping time to O(log log|U |), where U denotes the set of all possible logical addresses. Though the asymptotic mapping cost of VET is higher than the O(1) time of a page mapping scheme, the amount of increased overhead was almost negligible or low enough to be hidden by an accompanying I/O operation.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"302 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122487051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}