With the favorable properties of NAND flash memory, such as small size, shock resistance, and low power consumption, large-capacity SSDs (Solid State Disks) are anticipated to replace hard disks in high-end systems. However, the cost of NAND flash memory is still too high for it to substitute for the hard disk entirely. Using a hard disk and NAND flash memory together as secondary storage is an alternative solution that provides relatively low response time, large capacity, and reasonable cost. In this paper, we present a new buffer cache management scheme with data migration that is optimized for using both NAND flash memory and a hard disk together as secondary storage. The proposed scheme has three salient features. First, it detects I/O access patterns for each storage device and allocates buffer cache space to each pattern by adaptively computing the marginal gain, taking the I/O cost of each device into account. Second, it prefetches data selectively according to their access patterns and storage devices. Third, upon reclamation, it migrates data evicted from the buffer cache to the hard disk or NAND flash memory according to the access patterns of the block references. Trace-driven simulations show that the proposed scheme improves I/O performance significantly. It enhances the buffer cache hit ratio by up to 29.9% and reduces the total I/O elapsed time by up to 49.5% compared to the well-known UBM scheme.
{"title":"Unifying Buffer Replacement and Prefetching with Data Migration for Heterogeneous Storage Devices","authors":"Sehwan Lee, K. Koh, H. Bahn","doi":"10.1109/ICPADS.2010.103","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.103","url":null,"abstract":"With the good properties of NAND flash memory such as small size, shock resistance, and low-power consumption, large capacity SSD (Solid State Disk) is anticipated to replace hard disk in high-end systems. However, the cost of NAND flash memory is still high to substitute for hard disk entirely. Using hard disk and NAND flash memory together as secondary storage is an alternative solution to provide relatively low response time, large capacity, and reasonable cost. In this paper, we present a new buffer cache management scheme with data migration that is optimized to use both NAND flash memory and hard disk together as secondary storage. The proposed scheme has three salient features. First, it detects I/O access patterns from each storage, and allocates the buffer cache space for each pattern by computing marginal gain adaptively considering the I/O cost of storage. Second, it prefetches data selectively according to their access pattern and storage devices. Third, it moves the evicted data from the buffer cache to hard disk or NAND flash memory considering the access patterns of block references on the reclamation. Trace-driven simulations show that the proposed scheme improves the I/O performance significantly. It enhances the buffer cache hit ratio by up to 29.9% and reduces the total I/O elapsed time by up to 49.5% compared to the well-acknowledged UBM scheme.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"226 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131445985","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The development of the web has brought rich applications and services that give users convenience, but it has also caused users' information to become locked and isolated, users' resources to be dispersed, and operation granularity to be non-uniform, which ultimately harms end users. This paper presents PGOS, the Personal Grid Operation System, general-purpose software for the controlled sharing of cross-domain resources in personal net computing [1]. It accesses dispersed resources uniformly from a web client in order to connect the information islands formed by individual companies, and it provides a uniform fine-grained sharing mechanism; in addition, new applications can be built by combining the resources integrated in PGOS. The article proposes PGOS Core to perform decentralized user authentication, authorization, and access control; Funnel to abstract resources and perform decentralized resource discovery; and PGSML, the Personal Grid Service Markup Language, to construct PGOS applications.
{"title":"PGOS: An Architecture of a Personal Net Computing Platform","authors":"Jie Liu, Yongqiang Zou","doi":"10.1109/ICPADS.2010.50","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.50","url":null,"abstract":"The development of web has brought rich applications and services, giving users convenience, but also causing that the user’s information is locked and isolated, the users’ resource are disperse, and the operation granularity is not uniform, which ultimately harm the end-users. This paper presents PGOS,Personal Grid Operation System, a general-purpose software for controlled sharing of cross-domain resources in personal net computing [1]. It accesses disperse resources uniformly from web client in order to connect the information islands formed by the companies, and it provides uniform fine-grained sharing mechanism, in addition, we can build new applications by combining the integrated resources in PGOS. The article proposes PGOS Core to complete decentralized user authentication, authorization and access control, Funnel is used to abstract the resource and make decentralized resource discovery, simultaneously PGSML, the Personal Grid Service Markup Language, is put forward to construct PGOS applications.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115750166","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Counting or estimating the number of tags is crucial for RFID systems. Researchers have proposed several fast cardinality estimation schemes to estimate the quantity of a batch of tags within a short time frame. Existing estimation schemes scarcely consider the privacy issue. Without effective protection, an adversary can utilize the responding signals to estimate the number of tags as accurately as a valid reader. To address this issue, we propose a novel privacy-preserving estimation scheme, termed MEAS, which provides an active RF countermeasure against estimation by invalid readers. MEAS comprises two components: an Estimation Interference Device (EID) and two well-designed Interference Blanking Estimators (IBEs). The EID is deployed with the tags to actively generate interfering signals, which introduce sufficiently large estimation errors to invalid or malicious readers. Using a secret interference factor shared with the EID, a valid reader can perform accurate estimation via the two IBEs. Our theoretical analysis and simulation results show the effectiveness of MEAS; meanwhile, MEAS maintains high estimation accuracy for valid readers through the IBEs.
{"title":"Utilizing RF Interference to Enable Private Estimation in RFID Systems","authors":"Lei Yang, Jinsong Han, Yong Qi, Cheng Wang, Zhuo Li, Qingsong Yao, Ying Chen, Xiao Zhong","doi":"10.1109/ICPADS.2010.106","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.106","url":null,"abstract":"Counting or estimating the number of tags is crucial for RFID system. Researchers have proposed several fast cardinality estimation schemes to estimate the quantity of a batch of tags within a short time frame. Existing estimation schemes scarcely consider the privacy issue. Without effective protection, the adversary can utilize the responding signals to estimate the number of tags as accurate as the valid reader. To address this issue, we propose a novel privacy-preserving estimation scheme, termed as MEAS, which provides an active RF countermeasure against the estimation from invalid readers. MEAS comprises of two components, an Estimation Interference Device (EID) and two well-designed Interference Blanking Estimators (IBE). EID is deployed with the tags to actively generate interfering signals, which introduce sufficiently large estimation errors to invalid or malicious readers. Using a secret interference factor shared with EID, a valid reader can perform accurate estimation via two IBEs. Our theoretical analysis and simulation results show the effectiveness of MEAS. Meanwhile, MEAS can also maintain a high estimation accuracy using IBEs.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"120 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116367967","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We study the dynamic bid optimization problem via a primal-dual approach. In the case where we have no information about the distribution of queries, we reconstruct the ln(U/L) + 1 competitive algorithm proposed in [ZCL08] in a systematic way and show the intuition behind this algorithm. In the case of the random permutation model, we show that the learning technique used in [DH09] gives a (1 - O(ε))-competitive algorithm for any small constant ε > 0, as long as the optimum is large enough.
{"title":"A Primal Dual Approach for Dynamic Bid Optimization","authors":"Lingfei Yu, Kun She, Changyuan Yu","doi":"10.1109/ICPADS.2010.75","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.75","url":null,"abstract":"We study the dynamic bid optimization problem via a primal dual approach. In the case we have no information about the distribution of queries, we reconstruct the ln(U=L) + 1 competitive algorithm proposed in [ZCL08] through a systematic way and showed the intuition behind this algorithm. In the case of random permutation model, we showed that the learning technique used in [DH09] can give us a (1 ¡ O(²)) competitive algorithm for any small constant ² > 0 as long as the optimum is large enough.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"35 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123932050","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper deals with the problem of mapping pipelined applications onto heterogeneous platforms whose processors are subject to failures. We address a difficult bi-criteria problem, namely deciding which stages to replicate, and on which resources, in order to optimize the reliability of the schedule while guaranteeing a minimal throughput. Previous work had addressed the complexity of interval mappings, where the application is partitioned into intervals of consecutive stages (which are then replicated and assigned to processors). In this paper we investigate general mappings, where stages may be partitioned without any constraint, thereby allowing better use of processor and communication network capabilities. The price to pay for general mappings is a dramatic increase in problem complexity. We show that computing the period of a given general mapping is an NP-complete problem, and we provide polynomial bounds to determine a (conservative) approximate value. The bi-criteria mapping problem itself becomes NP-complete on homogeneous platforms, whereas it is polynomial with interval mappings. We design a set of efficient heuristics, which we compare with interval mapping strategies through extensive simulations.
{"title":"General vs. Interval Mappings for Streaming Applications","authors":"A. Benoit, Hinde-Lilia Bouziane, Y. Robert","doi":"10.1109/ICPADS.2010.15","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.15","url":null,"abstract":"This paper deals with the problem of mapping pipelined applications on heterogeneous platforms whose processors are subject to failures. We address a difficult bi-criteria problem, namely deciding which stages to replicate, and on which resources, in order to optimize the reliability of the schedule, while guaranteeing a minimal throughput. Previous work had addressed the complexity of interval mappings, where the application is partitioned into intervals of consecutive stages (which are then replicated and assigned to processors). In this paper we investigate general mappings, where stages may be partitioned without any constraint, thereby allowing a better usage of processors and communication network capabilities. The price to pay for general mappings is a dramatic increase in the problem complexity. We show that computing the period of a given general mapping is an NP-complete problem, and we provide polynomial bounds to determine a (conservative) approximated value. The bi-criteria mapping problem itself becomes NP-complete on homogeneous platforms, while it is polynomial with interval mappings. We design a set of efficient heuristics, which we compare with interval mapping strategies through extensive simulations.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129735917","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
When optimizing parallel programs, especially in the realm of massively parallel computing, computing time and memory space must be considered together in order to cut down the computing time as much as possible, because poor parallel space strategies often have a negative effect on computing time. Sometimes, however, we have no choice but to sacrifice space to reduce time further. What relationship computing time and space should maintain, and how they evolve, are two questions that directly decide the direction of optimization and must be made clear when optimizing parallel programs. This paper proposes a metric, named space speedup, to denote the scalability of the memory requirement, and discusses the relationship between time speedup and space speedup, through which the two speedups' capacity to guide the optimization of parallel code is shown.
{"title":"Space Speedup and Its Relationship with Time Speedup","authors":"Yue Hu, W. Tong, Xiaoli Zhi, Zhi-xun Gong","doi":"10.1109/ICPADS.2010.68","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.68","url":null,"abstract":"In the optimizing work of parallel program, especially in the realm of massively parallel computing, the parallel computing time and space must be concurrently carefully considered to cut down the computing time as much as possible, because lots of poor parallel space strategies would impact negative effects on computing time. Although, sometimes we have no choice but to sacrifice the space for the time’s further diminishing. What relationship should the computing time and space to keep and how are they going on are two problems, which deciding our optimizing direction directly and must be clear in parallel optimizing. This paper proposes a space theory, named as space speedup, to denote the scalability of memory requirement, and discusses the relationship of time speedup and space speedup, through which the speedups’ guidance capacity in optimizing parallel codes are given.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129274622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
It is hard to execute parallel programs efficiently on many-core platforms because programs cannot easily be divided into pieces of appropriate granularity to be executed simultaneously. Based on virtual machine and binary translation technologies, the article proposes the Vapor profiling framework, which uses the SBIRP instruction in-place replacement method to collect a program's run-time control-flow and data-flow information precisely. Moreover, it explains how to create control-flow and data-flow dependency graphs. Experimental results show that Vapor performs better than traditional methods.
{"title":"Vapor: Virtual Machine Based Parallel Program Profiling Framework","authors":"Yusong Tan, Wei Chen, Q. Wu","doi":"10.1109/ICPADS.2010.59","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.59","url":null,"abstract":"It is hard to execute parallel program efficiently on man-core platform because we could not divide program into appropriate granularity executed simultaneously. Based on virtual machine and binary translation technologies the article proposes the vapor profiling framework that uses SBIRP instruction in-place replacement method to collect program’s run-time control flow and data flow information precisely. Moreover, it explains how to create control flow and data flow dependency graphs. Experiment results prove that vapor has better performance than traditional methods.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129819497","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Research on complex Brain Networks plays a vital role in understanding the connectivity patterns of the human brain and disease-related alterations. Recent studies have suggested a noninvasive way to model and analyze human brain networks by using multi-modal imaging and graph theoretical approaches. Both the construction and the analysis of Brain Networks require tremendous computation. As a result, most current studies of Brain Networks focus on a coarse scale based on Brain Regions. Networks on this scale usually consist of around 100 nodes. The more accurate and meticulous voxel-based Brain Networks, on the other hand, may consist of 20K to 100K nodes. In response to the difficulties of analyzing large-scale networks, we propose an acceleration framework for voxel-based Brain Network Analysis based on the Graphics Processing Unit (GPU). Our GPU implementations of Brain Network construction and modularity achieve 24x and 80x speedups respectively, compared with a single-core CPU. Our work makes the processing time affordable for analyzing multiple large-scale Brain Networks.
{"title":"Making Human Connectome Faster: GPU Acceleration of Brain Network Analysis","authors":"Di Wu, Tianji Wu, Yi Shan, Yu Wang, Yong He, Ningyi Xu, Huazhong Yang","doi":"10.1109/ICPADS.2010.105","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.105","url":null,"abstract":"The research on complex Brain Networks plays a vital role in understanding the connectivity patterns of the human brain and disease-related alterations. Recent studies have suggested a noninvasive way to model and analyze human brain networks by using multi-modal imaging and graph theoretical approaches. Both the construction and analysis of the Brain Networks require tremendous computation. As a result, most current studies of the Brain Networks are focused on a coarse scale based on Brain Regions. Networks on this scale usually consist around 100 nodes. The more accurate and meticulous voxel-base Brain Networks, on the other hand, may consist 20K to 100K nodes. In response to the difficulties of analyzing large-scale networks, we propose an acceleration framework for voxel-base Brain Network Analysis based on Graphics Processing Unit (GPU). Our GPU implementations of Brain Network construction and modularity achieve 24x and 80x speedup respectively, compared with single-core CPU. Our work makes the processing time affordable to analyze multiple large-scale Brain Networks.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127715986","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Wireless sensor networks (WSNs) have received considerable attention in recent years as they have great potential for many distributed applications in different scenarios. Whatever the scenario, WSNs are in practice connected to an external network, through which sensed information is passed to the Internet and control messages can reach the WSN. This paper presents Smart, a service model for integrating WSNs and the Internet at the service level. Instead of integrating protocol stacks and/or mapping logical addresses, Smart allows the integration of Internet and WSN services by providing service interoperability. A communication infrastructure that implements the main components of Smart, along with a power consumption evaluation, is presented to validate the model.
{"title":"Smart: Service Model for Integrating Wireless Sensor Networks and the Internet","authors":"Jeisa P. O. Domingues, A. Dâmaso, N. Rosa","doi":"10.1109/ICPADS.2010.92","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.92","url":null,"abstract":"Wireless sensor networks (WSNs) have received considerable attention in recent years as they have great potential for many distributed applications in different scenarios. Whatever the scenario, WSNs are actually connected to an external network, through which sensed information are passed to the Internet and control messages can reach the WSN. This paper presents Smart, a service model for integrating WSNs and the Internet at service level. Instead of integrating protocol stacks and/or mapping logical addresses, Smart allows the integration of Internet's and WSN's services by providing service interoperability. A communication infrastructure that implements the main components of Smart, along with a power consumption evaluation, is presented to validate the model.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"112 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128049056","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The development of flash memory is driving flash-based SSDs into enterprise-scale storage systems. As the kernel of an SSD, the flash translation layer (FTL) attracts much attention. Generally, there are two types of FTLs according to the granularity of address mapping: block-level and page-level mapping FTLs. We focus on the latter. Typically, a page-level mapping scheme must employ a cache to alleviate the memory pressure introduced by the large mapping table. We argue that classic cache replacement policies are not well suited to the page table cache of FTLs. The major contribution of this work is a dedicated cache replacement policy, called Two Filters (abbreviated as 2F), for page-level mapping FTLs. 2F pursues two goals. The first is a higher hit ratio, the goal all replacement policies pursue. Because 2F protects not only frequently accessed pages but also sequentially accessed pages at little cost, it achieves a higher hit ratio. The second goal is to distinguish hot pages from cold ones. This goal is specific to the page table of FTLs: if hot and cold pages are directed to separate blocks, garbage collection becomes more efficient. To achieve this goal, 2F employs two filters: one holds sequentially accessed pages, and the other selects hot pages. Trace-driven simulations show that 2F outperforms classic replacement policies in both hit ratio and data classification.
{"title":"2F: A Special Cache for Mapping Table of Page-Level Flash Translation Layer","authors":"Zhiguang Chen, Nong Xiao, Fang Liu, Yimo Du","doi":"10.1109/ICPADS.2010.60","DOIUrl":"https://doi.org/10.1109/ICPADS.2010.60","url":null,"abstract":"The development of flash memory drives flash based SSDs to enter into enterprise-scale storage systems. As the kernel of SSD, flash translation layer (FTL) attracts many attentions. Generally, there are two types of FTLs according to the granularity of address mapping: block-level and page-level mapping FTLs. We focus on the latter one. Typically, page-level mapping scheme must employ a cache to alleviate the memory pressure introduced by the big mapping table. We argue that classic cache replacement policies aren’t competent for the page table cache of FTLs. The major contribution of this work is to design a dedicated cache replacement policy called Two Filters (abbreviated as 2F) for page-level mapping FTLs. 2F aims at two goals. The first is higher hit ratio as all the replacement policies pursue. As 2F not only protects frequently accessed pages, but also protects sequentially accessed pages at little cost, it does achieve a higher hit ratio. The second goal is to distinguish hot pages from the cold. This goal is special for page table of FTLs. If hot and cold pages are directed to separate blocks, garbage collection will be more efficient. In order to achieve this goal, 2F employs two filters. One is used for containing sequentially accessed pages. Another is used for selecting hot pages. Trace driven simulations present that 2F outperforms classic replacement policies in both hit ratio and data classification.","PeriodicalId":365914,"journal":{"name":"2010 IEEE 16th International Conference on Parallel and Distributed Systems","volume":"100 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-12-08","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121395978","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}