Resisting label-neighborhood attacks in outsourced social networks
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017106
Yang Wang, Fudong Qiu, Fan Wu, Guihai Chen
With the popularity of cloud computing, many companies outsource their social network data to a cloud service provider, where privacy leakage has become an increasingly serious problem. However, most previous studies ignore an important fact: in real social networks, users possess various attributes and can decide for themselves which attributes of their profiles are sensitive. These sensitive attributes should be protected from being revealed when a social network is outsourced to a cloud service provider. In this paper, we consider the problem of resisting privacy attacks that use neighborhood information, i.e., both the network structure and the labels of one-hop neighbors, as background knowledge. To tackle this problem, we propose a Global Similarity-based Group Anonymization (GSGA) method that generates an anonymized social network while preserving as much utility as possible. We extensively evaluate our approach on both real and synthetic data sets. Evaluation results show that a social network anonymized by our approach can still be used to answer aggregation queries with high accuracy.
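The abstract does not spell out GSGA's internals, but the grouping idea can be sketched: nodes whose one-hop neighborhood label profiles are most similar are grouped together so that each group can later be made indistinguishable. The similarity measure and greedy loop below are illustrative assumptions, not the paper's algorithm:

```python
# Illustrative sketch of neighborhood-label grouping for anonymization.
# The Jaccard similarity and the greedy grouping loop are assumptions for
# exposition; they are not the paper's exact GSGA algorithm.
from collections import Counter

def neighborhood_profile(graph, labels, node):
    """Multiset of one-hop neighbor labels -- the attacker's background knowledge."""
    return Counter(labels[n] for n in graph[node])

def similarity(p, q):
    """Jaccard similarity over label multisets (a stand-in for global similarity)."""
    inter = sum((p & q).values())
    union = sum((p | q).values())
    return inter / union if union else 1.0

def group_anonymize(graph, labels, k=2):
    """Greedily group nodes with the most similar neighborhood profiles;
    each group of >= k nodes would then be made indistinguishable
    (e.g., by label generalization and edge insertion)."""
    profiles = {v: neighborhood_profile(graph, labels, v) for v in graph}
    ungrouped = set(graph)
    groups = []
    while ungrouped:
        seed = ungrouped.pop()
        ranked = sorted(ungrouped,
                        key=lambda v: similarity(profiles[seed], profiles[v]),
                        reverse=True)
        group = [seed] + ranked[:k - 1]   # last group may be smaller than k
        ungrouped -= set(group)
        groups.append(group)
    return groups

# Tiny example: adjacency lists and per-user labels.
g = {1: [2, 3], 2: [1, 3], 3: [1, 2, 4], 4: [3]}
lbl = {1: "A", 2: "A", 3: "B", 4: "A"}
print(group_anonymize(g, lbl, k=2))
```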
{"title":"Resisting label-neighborhood attacks in outsourced social networks","authors":"Yang Wang, Fudong Qiu, Fan Wu, Guihai Chen","doi":"10.1109/PCCC.2014.7017106","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017106","url":null,"abstract":"With the popularity of cloud computing, many companies would outsource their social network data to a cloud service provider, where privacy leaks have become a more and more serious problem. However, most of the previous studies have ignored an important fact, i.e., in real social networks, users possess various attributes and have the flexibility to decide which attributes of their profiles are sensitive attributes by themselves. These sensitive attributes of the users should be protected from being revealed when outsourcing a social network to a cloud service provider. In this paper, we consider the problem of resisting privacy attacks with neighborhood information of both network structure and labels of one-hop neighbors as background knowledge. To tackle this problem, we propose a Global Similarity-based Group Anonymization (GSGA) method to generate a anonymized social network while maintaining as much utility as possible. We also extensively evaluate our approach on both real data set and synthetic data sets. Evaluation results show that the social network anonymized by our approach can still be used to answer aggregation queries with high accuracy.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"95 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115655298","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Let more nodes have a second choice
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017018
Haijun Geng, Xingang Shi, Xia Yin, Zhiliang Wang, Han Zhang, Jiangyuan Yao
Current intra-domain routing protocols compute only shortest paths between pairs of nodes, and therefore cannot provide good fast reroute when network failures occur. Multipath routing can be fundamentally more efficient than the single-path routing protocols in use today: it can significantly reduce congestion by shifting traffic to unused network resources, which improves network utilization and provides load balancing. To enhance failure resiliency, we propose a new scheme, More Nodes Have At Least Two Choices (MNTC), whose goal is to maximize the number of nodes that have at least two next-hops towards their destinations. We evaluate the algorithm on a wide range of relevant topologies, and the results show that it achieves good reliability while keeping stretch low.
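To make the objective concrete, the following sketch counts the nodes that already have at least two loop-free next-hops toward a destination, using the standard downstream criterion dist(n, d) < dist(s, d) as an assumed loop-freedom test; MNTC's actual construction is not reproduced here:

```python
# Sketch: count nodes with at least two loop-free next-hops toward a
# destination, under the downstream-neighbor condition dist(n, d) < dist(s, d).
# MNTC itself aims to maximize this count; this only measures it.
import networkx as nx

def nodes_with_two_choices(G, dest):
    dist = nx.single_source_dijkstra_path_length(G, dest, weight="weight")
    count = 0
    for s in G.nodes:
        if s == dest:
            continue
        next_hops = [n for n in G.neighbors(s) if dist[n] < dist[s]]
        if len(next_hops) >= 2:
            count += 1
    return count

G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 1), (1, 3, 1), (0, 2, 1), (2, 3, 1), (0, 3, 3)])
print(nodes_with_two_choices(G, dest=3))  # 1: only node 0 has two choices
```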
Parallelization of tree-to-TLV serialization
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017059
Makoto Nakayama, K. Yamazaki, Satoshi Tanaka, H. Kasahara
A serializer/deserializer (SerDe) is needed to serialize a data object into a byte array and to deserialize it in the reverse direction. A widely used, fast SerDe is Protocol Buffers (ProtoBuf), which serializes a tree-structured data object into the Type-Length-Value (TLV) format. Accelerating SerDe processing is beneficial because SerDes are used in many fields. This paper proposes a new method that accelerates tree-to-TLV serialization through two complementary forms of parallel processing, called “parallelized serialization” and “parallelization with streaming”. Experimental results show that parallelized serialization with 4 worker threads achieves a 1.97-fold shorter serialization time than a single worker thread, and that combining the two forms of parallel processing achieves a 2.11-fold shorter output time than ProtoBuf when 4 worker threads, FileOutputStream, and trees of 10,080 container nodes are used.
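For intuition, here is a minimal sketch of TLV tree serialization and of parallelizing it across top-level subtrees with a worker pool. The tag values and length encoding are invented for illustration (they are not ProtoBuf's wire format or the paper's exact scheme), and Python threads are used only to show the structure of the decomposition, not to reproduce the measured speedups:

```python
# Minimal sketch of tree-to-TLV serialization parallelized across subtrees.
# TLV = 1-byte type, 4-byte big-endian length, then the value bytes; the
# parent header can be written once the children's serialized sizes are known.
import struct
from concurrent.futures import ThreadPoolExecutor

LEAF, CONTAINER = 0x01, 0x02

def serialize(node):
    if isinstance(node, bytes):                               # leaf value
        return struct.pack(">BI", LEAF, len(node)) + node
    payload = b"".join(serialize(child) for child in node)    # container
    return struct.pack(">BI", CONTAINER, len(payload)) + payload

def serialize_parallel(root, workers=4):
    """Serialize each top-level subtree in its own worker, then stitch."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        chunks = list(pool.map(serialize, root))
    payload = b"".join(chunks)
    return struct.pack(">BI", CONTAINER, len(payload)) + payload

tree = [[b"ab", b"cd"], [b"ef"], b"gh"]
assert serialize(tree) == serialize_parallel(tree)   # same bytes either way
```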
{"title":"Parallelization of tree-to-TLV serialization","authors":"Makoto Nakayama, K. Yamazaki, Satoshi Tanaka, H. Kasahara","doi":"10.1109/PCCC.2014.7017059","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017059","url":null,"abstract":"A serializer/deserializer (SerDe) is necessary to serialize a data object into a byte array and to deserialize in reverse direction. A SerDe that is used worldwide and runs quickly is the Protocol Buffer (ProtoBuf), which serializes a tree-structured data object into the Type-Length-Value (TLV) format. Acceleration of SerDe processing is beneficial because SerDes are used in various fields. This paper proposes a new method that accelerates the tree-to-TLV serialization through 2-way parallel processing called “parallelized serialization” and “parallelization with streaming”. Experimental results show that parallelized serialization with 4 worker threads achieves a 1.97 fold shorter serialization time than when using a single worker thread, and the combination of 2-way parallel processing achieves a 2.11 fold shorter output time than that for ProtoBuf when 4 worker threads, FileOutputStream and trees of 10,080 container nodes are used.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"6 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125593244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Virtual data center allocation with dynamic clustering in clouds
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017105
Li Shi, D. Katramatos, Dantong Yu
Clouds are widely used to lease resources to users in the form of on-demand virtual data centers, which comprise sets of virtual machines interconnected by sets of virtual links. Given a user request for a virtual data center with specific resource requirements, a critical problem is to select a set of servers and links in the physical data center of a cloud that satisfies the request while minimizing the amount of reserved resources. In this paper, we study the main aspects of this Virtual Data Center Allocation (VDCA) problem and decompose it into three subproblems: virtual data center clustering, virtual machine allocation, and virtual link allocation. We prove the NP-hardness of VDCA and propose an algorithm that solves the problem by dynamically clustering the requested virtual data center and jointly optimizing virtual machine and virtual link allocation. Through simulations, we compare the performance and scalability of the proposed algorithm with two existing algorithms, LoCo and SecondNet. We demonstrate that our algorithm generates 30%-200% more revenue than LoCo and 55%-300% more than SecondNet, while being up to 12 times faster.
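As a rough illustration of the VM-allocation subproblem only, the best-fit sketch below places each requested VM on the feasible server with the least remaining capacity, keeping reserved resources tight; the paper's algorithm additionally clusters the virtual data center and allocates virtual links jointly, which this sketch does not attempt:

```python
# Illustrative best-fit sketch of the VM-allocation subproblem: place each
# VM (largest demand first) on the feasible server with the least remaining
# capacity. Server/VM names and single-dimension "capacity" are assumptions.
def allocate_vms(vm_demands, server_capacities):
    remaining = dict(server_capacities)
    placement = {}
    for vm, demand in sorted(vm_demands.items(), key=lambda kv: -kv[1]):
        feasible = [s for s, cap in remaining.items() if cap >= demand]
        if not feasible:
            return None                                   # request rejected
        best = min(feasible, key=lambda s: remaining[s])  # tightest fit
        remaining[best] -= demand
        placement[vm] = best
    return placement

print(allocate_vms({"vm1": 4, "vm2": 2, "vm3": 2}, {"s1": 4, "s2": 5}))
```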
{"title":"Virtual data center allocation with dynamic clustering in clouds","authors":"Li Shi, D. Katramatos, Dantong Yu","doi":"10.1109/PCCC.2014.7017105","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017105","url":null,"abstract":"Clouds are being widely used for leasing resources to users in the form of on-demand virtual data centers, which comprise sets of virtual machines interconnected by sets of virtual links. Given a user request for a virtual data center with specific resource requirements, a critical problem is to select a set of servers and links in the physical data center of a cloud to satisfy the request in a manner that minimizes the amount of reserved resources. In this paper, we study the main aspects of this Virtual Data Center Allocation (VDCA) problem, and decompose it into three subproblems: virtual data center clustering, virtual machine allocation, and virtual link allocation. We prove the NP-hardness of VDCA and propose an algorithm that solves the problem by dynamically clustering the requested virtual data center and jointly optimizing virtual machine and virtual link allocation. We further compare the performance and scalability of the proposed algorithm with two existing algorithms, called LoCo and SecondNet, through simulations. We demonstrate that our algorithm generates 30%-200% more revenue than LoCo and 55%-300% than SecondNet, while being up to 12 times faster.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"439 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"116014206","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Design and analysis of fault tolerance mechanism for Sparrow
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017054
Wenzhuo Li, Chuang Lin
Big data processing frameworks are moving towards larger degrees of parallelism and shorter task durations in order to achieve lower response times. Scheduling highly parallel tasks that complete in roughly 100 milliseconds poses a major challenge for task schedulers. To meet this challenge, researchers have turned to decentralized frameworks to relieve the pressure on task schedulers, among which Sparrow is a good choice. However, little effort has been devoted to fault tolerance in Sparrow, which does not handle worker failures and can therefore leave tasks incomplete. We present a fault tolerance mechanism named Heartbeat on Sparrow to handle failures of worker machines. Through simulation, we compare it with a simple baseline mechanism. The results show that Heartbeat on Sparrow detects worker failures faster and reschedules all failed tasks more efficiently, recovering tasks and states in sub-second time. We hope this mechanism contributes to the fault tolerance of Sparrow and other decentralized designs.
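A minimal sketch of the heartbeat idea described above: a worker that misses its heartbeat deadline is declared failed, and its tasks become eligible for rescheduling. Class and parameter names are illustrative, not Sparrow's actual code:

```python
# Minimal sketch of heartbeat-based worker failure detection: a worker is
# declared dead when no heartbeat arrives within the timeout, and its tasks
# are returned for rescheduling. The timeout value is illustrative.
import time

class HeartbeatMonitor:
    def __init__(self, timeout=0.5):             # sub-second detection target
        self.timeout = timeout
        self.last_seen = {}
        self.tasks = {}                           # worker -> list of task ids

    def heartbeat(self, worker):
        self.last_seen[worker] = time.monotonic()

    def assign(self, worker, task_id):
        self.tasks.setdefault(worker, []).append(task_id)

    def failed_tasks(self):
        """Return tasks of workers whose heartbeats have expired."""
        now = time.monotonic()
        to_reschedule = []
        for worker, seen in list(self.last_seen.items()):
            if now - seen > self.timeout:
                to_reschedule += self.tasks.pop(worker, [])
                del self.last_seen[worker]
        return to_reschedule

mon = HeartbeatMonitor(timeout=0.2)
mon.heartbeat("w1"); mon.assign("w1", "t1")
time.sleep(0.3)                                   # w1 misses its heartbeat
print(mon.failed_tasks())                         # ['t1'] -> reschedule
```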
{"title":"Design and analysis of fault tolerance mechanism for sparrow","authors":"Wenzhuo Li, Chuang Lin","doi":"10.1109/PCCC.2014.7017054","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017054","url":null,"abstract":"Big data processing frameworks are developing towards larger degrees of parallelism and shorter task durations in order to achieve lower response time. Scheduling highly parallel tasks that complete in nearly 100 milliseconds poses a major challenge for task schedulers. Taking the challenge, researchers turn to decentralized frameworks to relieve the pressure of task schedulers, among which Sparrow is a good choice. However, little efforts are devoted to fault tolerance of Sparrow, which does not handle worker failures, giving rise to incomplete tasks. We present a fault tolerance mechanism named Heartbeat on Sparrow to handle failures of worker machines. Through simulation, we compare it with a simple mechanism. The result shows that Heartbeat on Sparrow can detect worker failures faster and reschedule all failed tasks more efficiently, achieving recovery of tasks and states in sub-second time. We hope this mechanism will make some contributions to Sparrow and other decentralized designs on fault tolerance side.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"50 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127437124","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Measuring path divergence in the Internet
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017052
Nazim Ahmed, K. Saraç
Path divergence refers to a situation where the path from a source to an intermediate router on a source-to-destination path is not a prefix of the source-to-destination path itself. Studying path divergence helps us understand various operational characteristics of the underlying network. In this paper, we perform an active measurement study to observe the magnitude, causes, and types of path divergence in the Internet. We observe that most path divergence cases are caused by load-balancing routers, but policy-based inter-domain routing practices also contribute. We also observe that most routers causing path divergence are positioned in the backbone of the network, while routers closer to the sources cause more divergences. Combined with peering relationship data between neighboring domains, our study can also point out potential routing anomalies in the Internet's inter-domain routing. Finally, our techniques for tracing to intermediate routers can discover new IP addresses, routers, and Autonomous Systems (ASes), which can help enrich topology mapping and infer new peering relationships among ASes.
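The divergence test itself is simple to state in code: the measured path to an intermediate router should be a prefix of the full source-to-destination path. A small sketch, with paths as plain hop lists (e.g., from traceroute):

```python
# Sketch of the divergence test: the path to an intermediate router should
# be a prefix of the full source-to-destination path; otherwise it diverges.
def diverges(path_to_intermediate, full_path):
    hop = path_to_intermediate[-1]
    if hop not in full_path:
        return True                               # router not on the path at all
    prefix = full_path[:full_path.index(hop) + 1]
    return path_to_intermediate != prefix

full = ["s", "r1", "r2", "r3", "d"]
print(diverges(["s", "r1", "r2"], full))          # False: a clean prefix
print(diverges(["s", "r4", "r2"], full))          # True: e.g. load balancing
```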
{"title":"Measuring path divergence in the Internet","authors":"Nazim Ahmed, K. Saraç","doi":"10.1109/PCCC.2014.7017052","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017052","url":null,"abstract":"Path divergence refers to a situation where a path from source to an intermediate router on a source to destination path may not be a prefix of the source to destination path. Studying path divergence helps us understand various operational characteristics of the underlying network. In this paper, we perform an active measurement study to observe the magnitude, causes, and types of path divergence in the Internet. We observe that most path divergence cases occur due to load balancing routers but policy-based inter domain routing practices also contribute to divergence. We also observe that most routers causing path divergence are positioned in the backbone of the network but routers closer to the sources are causing more number of divergences. Our study combined with peering relationship data between neighboring domains can also point out potential routing anomaly cases in the inter domain routing process in the Internet. Finally, our techniques to trace to intermediate routers can explore new IP addresses, routers, Autonomous Systems (ASes) which can potentially help enrich topology mapping procedure and infer new peering relationships among ASes.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"4 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130682490","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Session-based access control in information-centric networks: Design and analyses
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017094
Yu Wang, Mingwei Xu, Zhenyang Feng, Qing Li, Qi Li
Information-Centric Networking (ICN) has recently been proposed to improve the efficiency of content delivery in current IP networks. ICN employs data names, instead of host addresses, as routing and forwarding indicators. By default, content in ICN carries only the signature of the content provider and does not contain the identity of the content consumer. Such information is, however, essential for many web applications, such as email, online social networking, online games, e-commerce, and other session-based web services. In this paper, we propose a session-based access control (SAC) mechanism for ICN to bridge this gap. We design key distribution protocols to protect the confidentiality of the content during delivery, and we employ a dynamic naming scheme to enhance user privacy. Our security analysis shows that the access control mechanism provides communication security and privacy protection for both sides of a session. Our design can be easily applied to session-based applications in ICN with negligible overhead.
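One way to picture a dynamic naming scheme (the abstract does not give the exact construction) is to derive per-request content names from a shared session key, so an eavesdropper cannot link a name to the underlying content or to earlier requests. A hypothetical sketch; the key distribution protocol that establishes session_key is out of scope here:

```python
# Hypothetical sketch of session-bound dynamic naming: consumer and provider
# derive each request's content name with an HMAC over the shared session key,
# so names are unlinkable to outsiders. Not the paper's actual scheme.
import hmac, hashlib, os

def dynamic_name(session_key: bytes, base_name: str, seq: int) -> str:
    msg = f"{base_name}#{seq}".encode()
    return hmac.new(session_key, msg, hashlib.sha256).hexdigest()

session_key = os.urandom(32)                      # agreed during session setup
# Both sides derive the same unlinkable name for each request in the session:
print(dynamic_name(session_key, "/news/today", seq=0))
print(dynamic_name(session_key, "/news/today", seq=1))   # a fresh name
```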
{"title":"Session-based access control in information-centric networks: Design and analyses","authors":"Yu Wang, Mingwei Xu, Zhenyang Feng, Qing Li, Qi Li","doi":"10.1109/PCCC.2014.7017094","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017094","url":null,"abstract":"Information-Centric Networking (ICN) has been proposed recently to improve the efficiency of content delivery in current IP networks. ICN employs data names, instead of host addresses, as routing and forwarding indicators. Content in the ICN carries only signature of the content provider but does not contain the identity of the content consumer by default. Such information is, however, essential for many of the web applications, such as email, online social networking, online game, e-commerce, and other session-based web services. In this paper, we propose a session-based access control (SAC) mechanism for ICN scenario to bridge the gap. Key distribution protocols are designed to protect the confidentiality of the content during information delivery. We also employ a dynamic naming scheme to enhance user privacy. According to security analysis, our access control mechanism can provide communication security and privacy protection for both sides of the session. Our design can be easily applied to session-based applications in ICN with negligible overhead.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"37 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125387692","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Data analytics workloads: Characterization and similarity analysis
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017065
Reena Panda, L. John
The performance of modern computer systems depends greatly on the wide range of workloads that run on them. Thus, computer designers and researchers need a representative set of workloads, covering the different classes of real-world applications, for processor design-space evaluation studies. While many benchmark suites are available, a few common ones such as the SPEC CPU2006 benchmarks are widely used, whether for ease of setup or because of simulation time constraints. However, because popular benchmarks such as SPEC CPU2006 do not capture the characteristics of the wide variety of emerging real-world applications, using them as the basis for performance evaluation may lead to suboptimal designs or misleading results. In this paper, we characterize the behavior of data analytics workloads, an important class of emerging applications, and perform a systematic similarity analysis against the popular SPEC CPU2006 and SPECjbb2013 benchmark suites. To characterize the workloads, we use hardware performance counter measurements and a variety of extracted microarchitecture-independent workload characteristics. We then use statistical data analysis techniques, namely principal component analysis and clustering, to analyze the similarity and dissimilarity among these classes of applications. We demonstrate the inherent differences between the characteristics of the different application classes and show how to arrive at meaningful subsets of benchmarks, which will help in faster and more accurate targeted early hardware performance evaluation.
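The described methodology maps directly onto standard tooling. A sketch with scikit-learn, using a synthetic stand-in feature matrix in place of the paper's real counter measurements:

```python
# Sketch of the similarity methodology: standardize performance-counter
# features, project onto principal components, then cluster. The feature
# matrix below is synthetic stand-in data, not the paper's measurements.
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# rows = workloads; cols = e.g. IPC, branch MPKI, cache miss rates, ILP ...
X = rng.normal(size=(12, 6))

Z = StandardScaler().fit_transform(X)             # counters have unlike scales
pcs = PCA(n_components=2).fit_transform(Z)        # keep dominant components
clusters = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(pcs)
print(clusters)   # workloads sharing a cluster behave similarly; picking one
                  # benchmark per cluster yields a representative subset
```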
{"title":"Data analytics workloads: Characterization and similarity analysis","authors":"Reena Panda, L. John","doi":"10.1109/PCCC.2014.7017065","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017065","url":null,"abstract":"Performance of modern day computer systems greatly depends on the wide range of workloads, which run on the systems. Thus, a representative set of workloads, representing the different classes of real-world applications, need to be used by computer designers and researchers for processor design-space evaluation studies. While a number of different benchmark suites are available, a few common benchmark suites like the SPEC CPU2006 benchmarks are widely used by researchers either due to ease of setup, or simulation time constraints etc. However, as the popular benchmarks such as SPEC CPU2006 benchmarks do not capture the characteristics of the wide variety of emerging real-world applications, using them as the basis for performance evaluation may lead to either suboptimal designs or misleading results. In this paper, we characterize the behavior of the data analytics workloads, an important class of emerging applications, and perform a systematic similarity analysis with the popular SPEC CPU2006 & SPECjbb2013 benchmarks suites. To characterize the workloads, we use hardware performance counter based measurements and a variety of extracted micro-architecture independent workload characteristics. Then, we use statistical data analysis techniques, namely principal component analysis and clustering techniques, to analyze the similarity/dissimilarity among these different classes of applications. In this paper, we demonstrate the inherent differences between the characteristics of the different classes of applications and how to arrive at meaningful subsets of benchmarks, which will help in faster and more accurate targeted early hardware system performance evaluation.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"79 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125855771","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Optimal phase control for joint transmission and reception with beamforming
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017038
Seonghyun Kim, Hojae Lee, Beom Kwon, Inwoong Lee, Sanghoon Lee
In this paper, we propose a joint transmission and reception scheme with phase control for beamforming in a multi-cell environment. Given the transmit weight vectors generated by multiple base stations (BSs), a mobile station (MS) calculates phases that maximize the achievable rate using low-rate feedback. Using these phases, the multiple transmit weight vectors are coordinated to improve the signal-to-noise ratio (SNR). To find the optimal phases, we present a phase control method for the effective channel via a geometrical approach.
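The benefit of phase control can be seen in a few lines: each BS k contributes an effective coefficient g_k = h_k^H w_k, and choosing theta_k = -angle(g_k) makes all contributions add coherently, which maximizes the combined signal power. A numpy sketch with illustrative notation (random beamformers, not the paper's geometric construction):

```python
# Numpy sketch of phase alignment across BSs: rotating each BS's contribution
# by theta_k = -angle(g_k) makes the effective coefficients add coherently,
# so |sum g_k e^{j theta_k}|^2 = (sum |g_k|)^2, the maximum. Illustrative only.
import numpy as np

rng = np.random.default_rng(1)
K, M = 3, 4                                       # 3 BSs, 4 antennas each
h = rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))   # channels
w = rng.normal(size=(K, M)) + 1j * rng.normal(size=(K, M))   # beamformers
w /= np.linalg.norm(w, axis=1, keepdims=True)     # unit-norm per BS

g = np.einsum("km,km->k", h.conj(), w)            # g_k = h_k^H w_k
theta = -np.angle(g)                              # MS-computed phases

power_aligned = abs(np.sum(g * np.exp(1j * theta))) ** 2
power_unaligned = abs(np.sum(g)) ** 2
print(power_aligned >= power_unaligned)           # True: coherent combining
```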
{"title":"Optimal phase control for joint transmission and reception with beamforming","authors":"Seonghyun Kim, Hojae Lee, Beom Kwon, Inwoong Lee, Sanghoon Lee","doi":"10.1109/PCCC.2014.7017038","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017038","url":null,"abstract":"In this paper, we propose a joint transmission and reception with phase control for beamforming in a multi-cell environment. For generated transmit weight vectors of multiple base stations (BSs), a mobile station (MS) calculates phases to maximize an achievable rate with low rate feedback. By using the phases, the multiple transmit weight vectors are coordinated to improve the signal to noise ratio (SNR). In order to find optimal phases, we present a phase control method for the effective channel via geometrical approach.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"114699506","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
A hybrid erasure-coded ECC scheme to improve performance and reliability of solid state drives
Pub Date: 2014-12-01 | DOI: 10.1109/PCCC.2014.7017095
P. Subedi, Ping Huang, Xubin He, Ming Zhang, Jizhong Han
The high performance and ever-increasing capacity of flash memory have led to the rapid adoption of Solid-State Drives (SSDs) in mass storage systems. To increase capacity, multi-level cells (MLC) are used in the design of SSDs, but using such SSDs in persistent storage systems raises reliability concerns. In this paper, we present a hybrid erasure-coded ECC (EECC) architecture that combines ECC schemes and erasure codes to improve both performance and reliability. Because weak error-correction codes decode faster than complex error-correction codes (ECC), we propose using weak ECC at the segment level rather than complex ECC. To compensate for the reduced correction ability of weak ECC, we use an erasure code that is striped across segments rather than pages or blocks. We store parities on a small HDD so that we can leverage parallelism across multiple devices and remove parity updates from the critical write path. Simulation experiments based on DiskSim demonstrate that our scheme reduces the SSD's average read latency by up to 31.23% and, in addition to tolerating double chip failures, dramatically reduces the uncorrectable page error rate.
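The cross-segment parity idea can be sketched with plain XOR parity, which tolerates one lost segment per stripe (the paper's erasure code is stronger, also surviving double chip failures); the parity lives on a separate device, off the critical write path:

```python
# Sketch of cross-segment parity: XOR parity over a stripe of segments is
# kept on a separate device (the small HDD), so a segment that weak ECC
# cannot correct is rebuilt from the survivors. Plain XOR is a simplification
# of the paper's erasure code and tolerates only one lost segment per stripe.
from functools import reduce

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def make_parity(stripe):
    return reduce(xor, stripe)

def rebuild(stripe, parity, lost_index):
    survivors = [s for i, s in enumerate(stripe) if i != lost_index]
    return reduce(xor, survivors, parity)

segments = [b"\x11\x22", b"\x33\x44", b"\x55\x66"]   # one stripe
parity = make_parity(segments)                       # stored on the HDD
recovered = rebuild(segments, parity, lost_index=1)  # weak ECC gave up here
assert recovered == segments[1]
```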
{"title":"A hybrid erasure-coded ECC scheme to improve performance and reliability of solid state drives","authors":"P. Subedi, Ping Huang, Xubin He, Ming Zhang, Jizhong Han","doi":"10.1109/PCCC.2014.7017095","DOIUrl":"https://doi.org/10.1109/PCCC.2014.7017095","url":null,"abstract":"The high performance and ever-increasing capacity of flash memory has led to the rapid adoption of Solid-State Disks (SSDs) in mass storage systems. In order to increase disk capacity, multi-level cells (MLC) are used in the design of SSDs, but the use of such SSDs in persistent storage systems raise concerns for users due to the low reliability of such disks. In this paper, we present a hybrid erasure-coded (EECC) architecture that incorporates ECC schemes and erasure codes to improve both performance and reliability. As weak error-correction codes have faster decoding speed than complex error correction codes (ECC), we propose the use of weak-ECC at the segment level rather than complex ECC. To compensate the reduced correction ability of weak-ECC, we use an erasure code that is striped across segments rather than pages or blocks. We use a small sized HDD to store parities so that we can leverage parallelism across multiple devices and remove the parity updates from the critical write path. We carry out simulation experiments based on Disksim to demonstrate that our proposed scheme is able reduce the SSD average read-latency by up to 31.23% and along with tolerance from double chip failures, it dramatically reduces the uncorrectable page error rate.","PeriodicalId":105442,"journal":{"name":"2014 IEEE 33rd International Performance Computing and Communications Conference (IPCCC)","volume":"117 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133497415","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}