2014 IEEE 20th Pacific Rim International Symposium on Dependable Computing: Latest Papers

Optimal Reliability Design for Real-Time Systems with Dynamic Voltage and Frequency Scaling
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.35
Toshitaka Koga, T. Dohi, H. Okamura
In designing information communication devices such as real-time embedded systems, it is important to maximize system performance under hard energy constraints. Dynamic voltage and frequency scaling (DVFS) is becoming a popular technology for reducing energy consumption in computer-based systems. In this paper, we consider an optimal DVFS allocation problem that maximizes system reliability subject to hard real-time and energy constraints. More specifically, we formulate a reliability maximization problem in which task processing is probabilistic and is described by a discrete-time Markov chain. Two approximate formulas are proposed to calculate the system reliability efficiently. We perform a sensitivity analysis of the model parameters in numerical examples, and also give a case study on designing a Wi-Fi subsystem in terms of the DVFS allocation.
Citations: 0
Region-Adherent Algorithms: Restricting the Impact of Faults on Service Quality
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.34
J. Becker, D. Rahmatov, Oliver E. Theel
We present a new class of fault-tolerant distributed algorithms based on a concept which we call region adherence. A region-adherent algorithm upper-bounds the violation of safety due to faults in space. Region adherence contrasts with the concept of self-stabilization, which upper-bounds a violation of safety in time. It turns out that region adherence is orthogonal to self-stabilization. We give a formal definition of region adherence that, intuitively, upper-bounds the reduction of the algorithm's service quality per fault. Then, we present a sample algorithm that exhibits region-adherent behavior and prove this property formally. Finally, we analyze the service quality of the sample algorithm via simulation and compare it to the worst-case behavior stated by the region adherence property.
Citations: 3
EA-EO: Endurance Aware Erasure Code for SSD-Based Storage Systems
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.18
Saeideh Alinezhad Chamazcoti, S. Miremadi
One of the main issues in Solid State Drive (SSD)-based storage systems is endurance, which is directly affected by the number of Program/Erase (P/E) cycles. An increase in P/E cycles raises the bit error rate, threatening the reliability of SSDs. Erasure codes are used to improve the reliability of storage systems, but they also affect the number of P/E cycles depending on their code pattern. A lower dependency between data and parities in the code pattern may lead to a smaller number of P/E cycles, providing better endurance. This paper introduces an Endurance Aware EVENODD (EA-EO) code, which minimizes the dependency between data and parities in the coding pattern. A simulation environment is used to compare the write cycles of EA-EO with those of EVENODD for different request sizes. The results show that the endurance improvement of the EA-EO code could be as high as 44%. Furthermore, performance analysis of these codes in terms of parity construction and failure recovery shows that the number of XOR operations is reduced in EA-EO compared to EVENODD.
Citations: 3
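The EA-EO abstract above ties endurance to how strongly parity elements depend on data elements. As a rough illustration only, the following Python sketch computes the parities of the baseline EVENODD code (the standard row-parity and diagonal-parity construction for a prime p, which is an assumption here since the listing does not reproduce it) and counts how many parity elements change when a single data element is rewritten. It does not model EA-EO itself, whose coding pattern is not given in the abstract; the function names are hypothetical.

```python
from functools import reduce
from operator import xor


def evenodd_parity(data, p):
    """Row and diagonal parity columns of a (p-1) x p data array (p prime)."""
    def cell(r, c):
        # Row p-1 is the imaginary all-zero row of the EVENODD construction.
        return data[r][c] if r < p - 1 else 0

    # Adjuster S: XOR of the data elements on the diagonal r + c = p - 1.
    s = reduce(xor, (cell(p - 1 - t, t) for t in range(1, p)))
    row = [reduce(xor, (cell(l, t) for t in range(p))) for l in range(p - 1)]
    diag = [s ^ reduce(xor, (cell((l - t) % p, t) for t in range(p)))
            for l in range(p - 1)]
    return row, diag


def parity_updates(p, r, c):
    """Number of parity elements that change when data element (r, c) flips."""
    zero = [[0] * p for _ in range(p - 1)]
    one = [line[:] for line in zero]
    one[r][c] = 1
    before, after = evenodd_parity(zero, p), evenodd_parity(one, p)
    return sum(a != b
               for old, new in zip(before, after)
               for a, b in zip(old, new))


if __name__ == "__main__":
    p = 5
    for r in range(p - 1):
        print([parity_updates(p, r, c) for c in range(p)])
```

In this sketch, an element on the adjuster diagonal touches every diagonal parity plus its row parity, while any other element touches only two parity elements. That asymmetry is the kind of data-to-parity dependency the abstract says EA-EO tries to minimize.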
Locating a Faulty Interaction in Pair-wise Testing
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.26
Takahiro Nagamoto, Hideharu Kojima, Hiroyuki Nakagawa, Tatsuhiro Tsuchiya
This article discusses locating faulty interactions in software testing. We propose an algorithm to generate a test suite that can be used to identify a faulty pair-wise interaction. This approach works as follows. First, a test suite is generated using an existing method for pair-wise testing. Pair-wise testing requires testing all pair-wise interactions but does not guarantee that the faulty interaction can be located. Second, pair-wise interactions that cannot be located by the test suite are enumerated. Finally, test cases are repeatedly added to the test suite until all pair-wise interactions can be located. The results of applying the algorithm to several problem instances show that the test suites obtained using the algorithm are nearly twice as large as those for ordinary pair-wise testing, which does not ensure fault-locating ability.
Citations: 18
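The three steps in the pair-wise testing abstract above (start from a pair-wise suite, enumerate interactions that cannot be located, add test cases until all can) might be sketched as follows. This is a minimal illustration and not the authors' algorithm. It assumes at most one faulty pair-wise interaction, that a test fails exactly when it contains that interaction, and that every parameter has at least two values. The greedy selection rule and all function names are assumptions made for this example.

```python
from collections import Counter
from itertools import combinations, product


def pairs_in(test):
    """Pair-wise interactions ((i, vi), (j, vj)) contained in one test case."""
    return [((i, test[i]), (j, test[j]))
            for i, j in combinations(range(len(test)), 2)]


def signatures(suite, domains):
    """Map every possible interaction to the set of tests that contain it."""
    sig = {((i, a), (j, b)): frozenset()
           for i, j in combinations(range(len(domains)), 2)
           for a in domains[i] for b in domains[j]}
    for idx, test in enumerate(suite):
        for pair in pairs_in(test):
            sig[pair] = sig[pair] | {idx}
    return sig


def unlocatable(suite, domains):
    """Interactions that are uncovered or share their test set with another."""
    sig = signatures(suite, domains)
    counts = Counter(sig.values())
    return {p for p, s in sig.items() if not s or counts[s] > 1}


def ambiguity(suite, domains):
    """Progress measure: uncovered interactions plus indistinguishable pairs."""
    counts = Counter(signatures(suite, domains).values())
    return counts[frozenset()] + sum(k * (k - 1) // 2 for k in counts.values())


def extend_until_locating(suite, domains):
    """Greedily add test cases until every interaction can be located."""
    suite = list(suite)
    candidates = list(product(*domains))
    while unlocatable(suite, domains):
        best = min(candidates, key=lambda t: ambiguity(suite + [t], domains))
        if ambiguity(suite + [best], domains) >= ambiguity(suite, domains):
            raise ValueError("no test case separates the remaining interactions")
        suite.append(best)
    return suite


def locate(suite, domains, failing):
    """Interactions whose covering tests are exactly the failing tests."""
    target = frozenset(failing)
    return [p for p, s in signatures(suite, domains).items() if s == target]


if __name__ == "__main__":
    domains = [(0, 1)] * 3                      # three binary parameters
    initial = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # covers all pairs
    final = extend_until_locating(initial, domains)
    print(len(initial), "tests cover all pairs;", len(final), "tests locate them")
```

The exhaustive candidate search is only practical for small instances; it is used here to keep the sketch short.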
Protecting RAID Arrays against Unexpectedly High Disk Failure Rates
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.17
Jehan-Francois Pâris, T. Schwarz, A. Amer, D. Long
Disk failure rates vary so widely among different makes and models that designing storage solutions for the worst-case scenario is a losing proposition. The approach we propose here is to design our storage solutions for the most probable case while incorporating in our design the option of adding extra redundancy when we find out that the disks are less reliable than expected. To illustrate our proposal, we show how to increase the reliability of existing two-dimensional disk arrays with n² data elements and 2n parity elements by adding n additional parity elements that will mirror the contents of half the existing parity elements. Our approach offers the three advantages of being easy to deploy, not affecting the complexity of parity calculations, and providing a five-year reliability of 99.999 percent in the face of catastrophic levels of data loss where the array would lose up to a quarter of its storage capacity in a year.
Citations: 10
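The element counts in the RAID abstract above can be checked with a little arithmetic. The Python sketch below counts data and parity elements before and after adding the n mirrored parity elements, and shows one common way to organize a two-dimensional array with one XOR parity per row and one per column. The row/column organization is an assumption (the abstract only gives the counts n² and 2n), and the function names are hypothetical.

```python
from fractions import Fraction
from functools import reduce
from operator import xor


def element_counts(n, extra_mirrors=False):
    """Data/parity element counts and the share of elements used for parity."""
    data = n * n
    parity = 2 * n + (n if extra_mirrors else 0)  # n mirrors of existing parity
    return {"data": data,
            "parity": parity,
            "overhead": Fraction(parity, data + parity)}


def row_and_column_parity(grid):
    """XOR parity of every row and every column of an n x n data grid."""
    rows = [reduce(xor, row) for row in grid]
    cols = [reduce(xor, col) for col in zip(*grid)]
    return rows, cols


if __name__ == "__main__":
    print(element_counts(4))                      # 16 data, 8 parity
    print(element_counts(4, extra_mirrors=True))  # 16 data, 12 parity
    rows, cols = row_and_column_parity([[1, 2], [3, 4]])
    # A lost element is the XOR of its row parity and the surviving row members.
    print(rows[0] ^ 1)  # recovers grid[0][1] == 2
```

Because the extra elements only mirror existing parity, the parity equations themselves are unchanged, which matches the abstract's claim that the complexity of parity calculations is not affected.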
From Safety Analyses to Experimental Validation of Automotive Embedded Systems
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.23
Ludovic Pintard, J. Fabre, Michel Leeman, K. Kanoun, Matthieu Roy
Automotive embedded systems are becoming increasingly complex. Therefore, verification activities are paramount to ensure safety. ISO 26262 is the first standard specifically dedicated to automotive safety systems. This standard requires introducing fault injection (FI) from the very early phases of the development process. Our work aims at developing an approach that will help integrate FI into the whole development process in a continuous way, from system requirements to the verification and validation phase. In this paper, we concentrate on exploring the benefits of safety analyses for experimental validation of the system. We propose an analogy between FI during the pre-implementation phase and the safety analyses that are commonly used during system design. We finally illustrate this approach on a case study from the automotive domain.
Citations: 10
Lightweight Bare-Metal Stateful Firewall
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.15
Yihuan Xing, Ford-Long Wong, Akash Kumar
A firewall is a crucial security element in modern computer networks. This work investigates and demonstrates the implementation of a lightweight TCP/IP firewall in a bare-metal environment, on a commercial embedded ARM device. Compared to an implementation running on an operating system (OS), a bare-metal design reduces exposure to potential vulnerabilities in OS code and provides a more dependable system. The implemented firewall provides both static and stateful filtering capabilities, and is configurable in a user-friendly way. As the architecture of the commercial hardware used was not available under closed source licensing, it was discovered through analysis at both hardware and software levels. Some challenges were encountered, and tools were developed to address these. The prototype was successfully validated through functional testing in a controlled environment.
Citations: 0
CloudBFT: Elastic Byzantine Fault Tolerance
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.31
Rodrigo Nogueira, Filipe Araújo, R. Barbosa
Cloud computing is increasingly important, with the industry moving towards outsourcing computational resources as a means to reduce investment and management costs, while improving security, dependability and performance. Cloud operators use multi-tenancy, grouping virtual machines (VMs) onto a few physical machines (PMs), to pool computing resources, thus offering elasticity to clients. Although cloud-based fault tolerance schemes impose communication and synchronization overheads, the cloud offers excellent facilities for critical applications, as it can host varying numbers of replicas in independent resources. Given these contradictory forces, determining whether the cloud can host elastic critical services is a major research question. We address this challenge from the perspective of a standard three-tiered system with relational data. We propose to tolerate Byzantine faults using groups of replicas placed on distinct physical machines, as a means to avoid exposing applications to correlated failures. To improve the scalability of our system, we divide data to enable parallel accesses. Using a realistic setup, this setting can reach speedups that greatly exceed the number of partitions. Even for a wide variation of the load, the system preserves latency and throughput within reasonable bounds. We believe that the elasticity we observe demonstrates the feasibility of tolerating Byzantine faults in a cloud-based server using a relational database.
Citations: 12
Study on Routing Protocol for Structured P2P Network Taking Account of the Nodes Which Behave Like a Byzantine Fault
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.12
S. Fukumoto, Tomoki Endo, Mamoru Ohara, M. Arai
In this study, we discuss a fault-tolerant routing mechanism for the P2P network Chord, considering the existence of faulty or malicious nodes. We propose a routing protocol which works despite the existence of faulty nodes. We modify the original Chord routing protocol so that it can handle redundant lookups of multiple "knuckles," and the allocation of replicas or data fragments on knuckles. Analysis based on mathematical models and simulations shows that the proposed protocol effectively avoids the interruption of object supply caused by a target node fault and/or the failure to acquire objects caused by tampering by malicious nodes.
Citations: 0
Design of Multi-threaded Fault-Tolerant Connection-Oriented Communication
Pub Date : 2014-11-18 DOI: 10.1109/PRDC.2014.10
N. Ivaki, Filipe Araújo, F. Barros
Fault tolerance is vital for dependable distributed applications that can deliver service even in the presence of faults. Over the last few decades, among all the protocols proposed to offer reliability and fault tolerance, TCP grew to become one of the cornerstones of the Internet. However, despite emulating reliable communication in distributed environments, TCP does not handle connection failures when connectivity is lost for some time, even if both endpoints are still running. When this occurs, developers must roll back the peers to some coherent state, often with error-prone, ad hoc, or custom application-level solutions. In this paper, we refine the Acceptor-Connector design pattern to tackle the TCP unreliability problem. The pattern decouples the failure-related processing from the connection and service processing, efficiently handling different connections and their possible crashes concurrently, thereby yielding more reusable, extensible, and efficient distributed communication. The solution we propose incorporates proven multi-threaded solutions and a buffering scheme that removes the need for an application-layer acknowledgment scheme. This simplifies the development of reliable connection-oriented applications using the ubiquitous TCP protocol.
Citations: 4