Barrier Counting in Mixed Wireless Sensor Networks
Shambhavi Srinivasa, C. Williamson, Zongpeng Li. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.48

Barrier coverage problems in sensor networks involve detecting intruders that attempt to cross a region of interest. In this paper, we formulate the k-connect barrier count problem for Mixed Sensor Networks (MSNs): finding the maximum number of barriers in an arbitrary MSN when at most k distinct mobile sensors can be used to construct any given virtual edge of a barrier. We solve the k-connect barrier count problem for k ∈ {0, 1, 2} via Integer Linear Programming. Simulation results show that as k increases, the sensor density required to achieve barrier coverage decreases, quantitatively demonstrating the benefits of mobile sensors.
{"title":"Barrier Counting in Mixed Wireless Sensor Networks","authors":"Shambhavi Srinivasa, C. Williamson, Zongpeng Li","doi":"10.1109/MASCOTS.2012.48","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.48","url":null,"abstract":"Barrier coverage problems in sensor networks involve detecting intruders that attempt to cross a region of interest. In this paper, we formulate the k-connect barrier count problem for Mixed Sensor Networks (MSNs). The k-connect barrier count problem is to find the maximum number of barriers in an arbitrary MSN where at most k distinct mobile sensors can be used to construct any given virtual edge used in a barrier. We present the solution for the k-connect barrier count problem for k ∈ {0, 1, 2} via Integer Linear Programming. Using simulation results, we show that as k increases, the density of sensors required to achieve barrier coverage decreases. The results quantitatively demonstrate the benefits of mobile sensors.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130073277","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Analyzing Parallelization and Program Performance in Heterogeneous MPSoCs
Chao Wang, Xi Li, Junneng Zhang, Gangyong Jia, Peng Chen, Xuehai Zhou. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.61

In this paper we extend Amdahl's law to the era of general heterogeneous MPSoCs, to determine how speedup is affected by key parameters: the number and speedup of microprocessors and accelerators, as well as the task-partitioning characteristics. We also analyze, theoretically, how the extended Amdahl's law can be applied to balance load on a heterogeneous MPSoC without the abstraction of base core equivalents (BCEs). A prototype is constructed on an FPGA with MicroBlaze processors and JPEG hardware accelerators. The experimental results demonstrate that our extended model reinforces state-of-the-art performance evaluation methods for hybrid MPSoC architectures and provides credible new insights for the heterogeneous-computing research community, in particular for scalable FPGA-based reconfigurable MPSoCs.
{"title":"Analyzing Parallelization and Program Performance in Heterogeneous MPSoCs","authors":"Chao Wang, Xi Li, Junneng Zhang, Gangyong Jia, Peng Chen, Xuehai Zhou","doi":"10.1109/MASCOTS.2012.61","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.61","url":null,"abstract":"In this paper we extend and analyze Amdahl's law to general heterogeneous MPSoC era, to find out how the speedup is affected by the parameters, including amount and speedup for microprocessors and accelerators, as well as the task partition characteristics. We also analyze the theoretical results about how the extended Amdahl's Law is applied to leverage load balancing of a heterogeneous MPSoC without the abstract limitation of base core equivalents (BCEs). A prototype on FPGA is constructed with Microblaze processors and JPEG hardware accelerators. The experimental results demonstrate that our extended model reinforces state-of-the-art performance evaluation methods for hybrid MPSoC architectures and also provide creditable new insights on the heterogeneous research communities, in particular for scalable FPGA based reconfigurable MPSoCs.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"115 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114832880","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Comparing the ns-3 Propagation Models
Mirko Stoffers, G. Riley. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.17

An important aspect of any network simulation that models wireless networks is the design and implementation of the propagation loss model. The propagation loss model determines the wireless signal strength at the set of receivers for any packet transmitted by a single transmitter. There are a number of different ways to model this phenomenon, and they vary both in computational complexity and in the measured performance of the wireless network being modeled. In fact, the ns-3 simulator presently includes 11 different loss models in its library. We performed a detailed study of these models, comparing them both in terms of the computational complexity of the algorithms and the measured performance of the simulated wireless network. The results of these simulation experiments are reported and discussed. Not surprisingly, we observed considerable variation in both metrics.
{"title":"Comparing the ns-3 Propagation Models","authors":"Mirko Stoffers, G. Riley","doi":"10.1109/MASCOTS.2012.17","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.17","url":null,"abstract":"An important aspect of any network simulation that models wireless networks is the design and implementation of the Propagation Loss Model. The propagation loss model is used to determine the wireless signal strength at the set of receivers for any packet being transmitted by a single transmitter. There are a number of different ways to model this phenomenon, and these vary both in terms of computational complexity and in the measured performance of the wireless network being modeled. In fact, the ns -- 3 simulator presently has 11 different loss models included in the simulator library. We performed a detailed study of these models, comparing their overall performance both in terms of the computational complexity of the algorithms, as well as the measured performance of the wireless network being simulated. The results of these simulation experiments are reported and discussed. Not surprisingly, we observed considerable variation in both metrics.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"12 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129637859","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A Scalable Algorithm for Placement of Virtual Clusters in Large Data Centers
A. Tantawi. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.11

We consider the problem of placing virtual clusters, each consisting of a set of heterogeneous virtual machines (VMs) with interrelationships arising from communication needs and other dependability-induced constraints, onto physical machines (PMs) in a large data center. The placement of such constrained, networked virtual clusters, spanning compute, storage, and networking resources, is challenging. The size of the problem forces one to resort to approximate, heuristics-based optimization techniques. We introduce a statistical approach based on importance sampling (also known as cross-entropy) to solve this placement problem. A straightforward implementation of such a technique proves inefficient. We considerably enhance the method by biasing the sampling process to incorporate the communication needs and other constraints of requests, yielding an efficient algorithm that is linear in the size of the data center. We investigate the quality of the results of our algorithm on a simulated system, where we study the effects of various parameters on the solution and on the performance of the algorithm.
{"title":"A Scalable Algorithm for Placement of Virtual Clusters in Large Data Centers","authors":"A. Tantawi","doi":"10.1109/MASCOTS.2012.11","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.11","url":null,"abstract":"We consider the problem of placing virtual clusters, each consisting of a set of heterogeneous virtual machines (VM) with some interrelationships due to communication needs and other dependability-induced constraints, onto physical machines (PM) in a large data center. The placement of such constrained, networked virtual clusters, including compute, storage, and networking resources is challenging. The size of the problem forces one to resort to approximate and heuristics-based optimization techniques. We introduce a statistical approach based on importance sampling (also known as cross-entropy) to solve this placement problem. A straightforward implementation of such a technique proves inefficient. We considerably enhance the method by biasing the sampling process to incorporate communication needs and other constraints of requests to yield an efficient algorithm that is linear in the size of the data center. We investigate the quality of the results of using our algorithm on a simulated system, where we study the effects of various parameters on the solution and performance of the algorithm.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"53 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125533011","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Scheduling in Flash-Based Solid-State Drives - Performance Modeling and Optimization
W. Bux, Xiao-Yu Hu, I. Iliadis, R. Haas. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.58

In this paper, we study the performance of solid-state drives that employ flash technology as the storage medium. Our prime objective is to understand how the scheduling of user-generated read and write commands, together with the read, write, and erase operations induced by the garbage-collection process, affects the basic performance measures of throughput and latency. We demonstrate that the most straightforward scheduling, which prioritizes garbage-collection-related commands over user-related commands, suffers from severe latency deficiencies. These problems can be overcome by a more sophisticated priority scheme that minimizes user-perceived latency without throughput penalty or deadlock exposure. Using both analysis and simulation, we investigate how these schemes perform under a variety of system design parameters and workloads. Our results can be directly applied to the engineering of a performance-optimized solid-state-drive system.
{"title":"Scheduling in Flash-Based Solid-State Drives - Performance Modeling and Optimization","authors":"W. Bux, Xiao-Yu Hu, I. Iliadis, R. Haas","doi":"10.1109/MASCOTS.2012.58","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.58","url":null,"abstract":"In this paper, we study the performance of solid-state drives that employ flash technology as storage medium. Our prime objective is to understand how the scheduling of the user-generated read and write commands and the read, write, and erase operations induced by the garbage-collection process affect the basic performance measures throughput and latency. We demonstrate that the most straightforward scheduling that prioritizes the processing of garbage-collection-related commands over user-related commands suffers from severe latency deficiencies. These problems can be overcome by using a more sophisticated priority scheme that minimizes the user-perceived latency without throughput penalty or deadlock exposure. Using both analysis and simulation, we investigate how these schemes perform under a variety of system design parameters and workloads. Our results can be directly applied to the engineering of a performance-optimized solid-state-drive system.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133852309","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Machine Learning-Based Self-Adjusting Concurrency in Software Transactional Memory Systems
Diego Rughetti, P. D. Sanzo, B. Ciciani, F. Quaglia. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.40
One of the problems of Software Transactional Memory (STM) systems is the performance degradation experienced when applications run at a non-optimal concurrency level, namely the number of concurrent threads. When this level is too high, performance may suffer due to excessive data contention and the resulting transaction aborts. Conversely, if concurrency is too low, performance may be penalized by limited parallelism and under-exploitation of the available resources. In this paper we propose a machine-learning-based approach that enables STM systems to predict their performance as a function of the number of concurrent threads, in order to dynamically select the optimal concurrency level throughout the lifetime of the application. In our approach, the STM is coupled with a neural network and an online control algorithm that activates or deactivates application threads to maximize performance, selecting the most adequate concurrency level as a function of the current data-access profile. We also present a real implementation of our proposal within the open-source TinySTM package and an experimental study based on the STAMP benchmark suite. The experimental data confirm that our self-adjusting concurrency scheme consistently provides optimal performance, avoiding the performance-loss phases caused by poorly chosen thread counts and the phenomena described above.
{"title":"Machine Learning-Based Self-Adjusting Concurrency in Software Transactional Memory Systems","authors":"Diego Rughetti, P. D. Sanzo, B. Ciciani, F. Quaglia","doi":"10.1109/MASCOTS.2012.40","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.40","url":null,"abstract":"One of the problems of Software-Transactional-Memory (STM) systems is the performance degradation that can be experienced when applications run with a non-optimal concurrency level, namely number of concurrent threads. When this level is too high a loss of performance may occur due to excessive data contention and consequent transaction aborts. Conversely, if concurrency is too low, the performance may be penalized due to limitation of both parallelism and exploitation of available resources. In this paper we propose a machine-learning based approach which enables STM systems to predict their performance as a function of the number of concurrent threads in order to dynamically select the optimal concurrency level during the whole lifetime of the application. In our approach, the STM is coupled with a neural network and an on-line control algorithm that activates or deactivates application threads in order to maximize performance via the selection of the most adequate concurrency level, as a function of the current data access profile. A real implementation of our proposal within the TinySTM open-source package and an experimental study relying on the STAMP benchmark suite are also presented. The experimental data confirm how our self-adjusting concurrency scheme constantly provides optimal performance, thus avoiding performance loss phases caused by non-suited selection of the amount of concurrent threads and associated with the above depicted phenomena.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129504140","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Solving the TCP-Incast Problem with Application-Level Scheduling
Maxim Podlesny, C. Williamson. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.21

Data center networks are characterized by high link speeds, low propagation delays, small switch buffers, and temporally clustered arrivals of many concurrent TCP flows fulfilling data transfer requests. However, the combination of these features can lead to transient buffer overflow and bursty packet losses, which in turn cause TCP retransmission timeouts that degrade the performance of short-lived flows. This so-called TCP-incast problem can cause TCP throughput collapse. In this paper, we explore an application-level approach to solving this problem. The key idea of our solution is to coordinate the scheduling of short-lived TCP flows so that no data loss occurs. We develop a mathematical model of lossless data transmission, and estimate the maximum goodput achievable in data center networks. The results indicate non-monotonic goodput that is highly sensitive to specific parameter configurations in the data center network. We validate our model using ns-2 network simulations, which show good correspondence with the theoretical results.
{"title":"Solving the TCP-Incast Problem with Application-Level Scheduling","authors":"Maxim Podlesny, C. Williamson","doi":"10.1109/MASCOTS.2012.21","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.21","url":null,"abstract":"Data center networks are characterized by high link speeds, low propagation delays, small switch buffers, and temporally clustered arrivals of many concurrent TCP flows fulfilling data transfer requests. However, the combination of these features can lead to transient buffer overflow and bursty packet losses, which in turn lead to TCP retransmission timeouts that degrade the performance of short-lived flows. This so-called TCP-incast problem can cause TCP throughput collapse. In this paper, we explore an application-level approach for solving this problem. The key idea of our solution is to coordinate the scheduling of short-lived TCP flows so that no data loss happens. We develop a mathematical model of lossless data transmission, and estimate the maximum good put achievable in data center networks. The results indicate non-monotonic good put that is highly sensitive to specific parameter configurations in the data center network. We validate our model using ns-2 network simulations, which show good correspondence with the theoretical results.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"136 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131377384","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Hop Distance Analysis in Partially Connected Wireless Sensor Networks
Yun Wang, Brendan M. Kelly, Aimin Zhou. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.28

Network connectivity, as a fundamental issue in wireless sensor networks (WSNs), has received considerable attention during the past decade. Most prior work focuses on maintaining full connectivity while conserving network resources. However, full connectivity is actually a sufficient but not necessary condition for many WSNs to communicate and function successfully. In addition, full connectivity incurs high network cost, since more sensors are needed. Further, it is subject to high energy consumption and communication interference, as higher communication power may be needed to reach the most isolated sensors. In view of this, this work investigates hop distance in a randomly deployed WSN with partial network connectivity from modeling, analysis, and simulation perspectives. The results help in selecting critical network parameters for practical designs across diverse WSN applications.
{"title":"Hop Distance Analysis in Partially Connected Wireless Sensor Networks","authors":"Yun Wang, Brendan M. Kelly, Aimin Zhou","doi":"10.1109/MASCOTS.2012.28","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.28","url":null,"abstract":"Network connectivity, as a fundamental issue in a wireless sensor network(WSN), has been receiving considerable attention during the past decade. Most works focused on how to maintain full connectivity while conserving network resources. However, full connectivity is actually a sufficient but not necessary condition for many WSNs to communicate and function successfully. In addition, full connectivity requires high-demand in network cost as more sensors will be needed. Further, it is subject to high energy consumption and communication interference as higher communication power might be needed to connect the most isolated sensors. In view of this, this work investigates the hop distance in a randomly deployed WSN with partial network connectivity through modeling, analysis, and simulation perspectives. The results help in selecting critical network parameters for practical WSN designs of diverse WSN applications.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"23 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114326666","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
H-SWD: Incorporating Hot Data Identification into Shingled Write Disks
Chung-I Lin, Dongchul Park, Weiping He, D. Du. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.44

A shingled write disk (SWD) is a magnetic hard disk drive that adopts shingled magnetic recording (SMR) technology to overcome the areal density limit faced by conventional hard disk drives (HDDs). The SMR design enables SWDs to achieve two to three times the areal density that HDDs can reach, but it also prevents SWDs from supporting random writes and in-place updates without a performance penalty. In particular, an SWD must contend with random write/update interference, whereby writing to one track overwrites data previously stored on subsequent tracks. Prior research has proposed serving random writes/updates out-of-place to alleviate the performance degradation, at the cost of introducing garbage collection. However, none of these studies evaluate SWDs in terms of garbage collection performance. In this paper, we propose an SWD design called Hot data identification-based Shingled Write Disk (H-SWD). H-SWD adopts window-based hot data identification to effectively manage data in the hot bands and the cold bands, significantly reducing garbage collection overhead while preventing random write/update interference. Experimental results with various realistic workloads demonstrate that H-SWD outperforms the Indirection System. Specifically, incorporating simple hot data identification empowers the H-SWD design to remarkably improve garbage collection performance.
{"title":"H-SWD: Incorporating Hot Data Identification into Shingled Write Disks","authors":"Chung-I Lin, Dongchul Park, Weiping He, D. Du","doi":"10.1109/MASCOTS.2012.44","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.44","url":null,"abstract":"Shingled write disk (SWD) is a magnetic hard disk drive that adopts the shingled magnetic recording (SMR) technology to overcome the areal density limit faced in conventional hard disk drives (HDDs). The SMR design enables SWDs to achieve two to three times higher areal density than the HDDs can reach, but it also makes SWDs unable to support random writes/in-place updates with no performance penalty. In particular, a SWD needs to concern about the random write/update interference, which indicates writing to one track overwrites the data previously stored on the subsequent tracks. Some research has been proposed to serve random write/update out-of-place to alleviate the performance degradation at the cost of bringing in the concept of garbage collection. However, none of these studies investigate SWDs based on the garbage collection performance. In this paper, we propose a SWD design called Hot data identification-based Shingled Write Disk (H-SWD). The H-SWD adopts a window-based hot data identification to effectively manage data in the hot bands and the cold bands such that it can significantly reduce the garbage collection overhead while preventing the random write/update interference. The experimental results with various realistic workloads demonstrates that H-SWD outperforms the Indirection System. Specifically, incorporating a simple hot data identification empowers the H-SWD design to remarkably improve garbage collection performance.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114500022","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Extent Mapping Scheme for Flash Memory Devices
Young-Kyoon Suh, Bongki Moon, A. Efrat, Jin-Soo Kim, Sang-Won Lee. MASCOTS 2012. doi: 10.1109/MASCOTS.2012.45
Flash memory devices commonly rely on traditional address mapping schemes such as page mapping, block mapping, or a hybrid of the two. Page mapping is more flexible than block mapping or hybrid mapping because it is not restricted by block boundaries; however, its mapping table tends to grow quickly as the capacity of flash memory devices grows. To overcome this limitation, we propose a novel mapping scheme that is fundamentally different from existing mapping strategies. We call this new scheme the Virtual Extent Trie (VET), as it manages mapping information by treating each I/O request as an extent and using extents as the basic mapping units, rather than pages or blocks. By storing extents instead of individual addresses, VET consumes much less memory for mapping information while remaining as flexible as page mapping. In our experiments, VET reduced memory consumption by up to an order of magnitude compared with the traditional mapping schemes on several real-world workloads, and it scaled well with increasing address spaces under synthetic workloads. With a binary search mechanism, VET limits the mapping time to O(log log |U|), where U denotes the set of all possible logical addresses. Though this asymptotic cost is higher than the O(1) time of a page mapping scheme, the added overhead was almost negligible, or low enough to be hidden by the accompanying I/O operation.
{"title":"Extent Mapping Scheme for Flash Memory Devices","authors":"Young-Kyoon Suh, Bongki Moon, A. Efrat, Jin-Soo Kim, Sang-Won Lee","doi":"10.1109/MASCOTS.2012.45","DOIUrl":"https://doi.org/10.1109/MASCOTS.2012.45","url":null,"abstract":"Flash memory devices commonly rely on traditional address mapping schemes such as page mapping, block mapping or a hybrid of the two. Page mapping is more flexible than block mapping or hybrid mapping without being restricted by block boundaries. However, its mapping table tends to grow large quickly as the capacity of flash memory devices does. To overcome this limitation, we propose a novel mapping scheme that is fundamentally different from the existing mapping strategies. We call this new scheme Virtual Extent Trie (VET), as it manages mapping information by treating each I/O request as an extent and by using extents as basic mapping units rather than pages or blocks. By storing extents instead of individual addresses, VET consumes much less memory to store mapping information and still remains as flexible as page mapping. We observed in our experiments that VET reduced memory consumption by up to an order of magnitude in comparison with the traditional mapping schemes for several real world workloads. The VET scheme also scaled well with increasing address spaces by synthetic workloads. With a binary search mechanism, VET limits the mapping time to O(log log|U |), where U denotes the set of all possible logical addresses. Though the asymptotic mapping cost of VET is higher than the O(1) time of a page mapping scheme, the amount of increased overhead was almost negligible or low enough to be hidden by an accompanying I/O operation.","PeriodicalId":278764,"journal":{"name":"2012 IEEE 20th International Symposium on Modeling, Analysis and Simulation of Computer and Telecommunication Systems","volume":"302 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2012-08-07","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122487051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}