
2006 25th IEEE Symposium on Reliable Distributed Systems (SRDS'06): Latest Papers

Experimental Comparison of Local and Shared Coin Randomized Consensus Protocols
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.19
Henrique Moniz, N. Neves, M. Correia, P. Veríssimo
The paper presents a comparative performance study of the two main classes of randomized binary consensus protocols: a local coin protocol, with an expected high communication complexity and cheap symmetric cryptography, and a shared coin protocol, with an expected low communication complexity and expensive asymmetric cryptography. The experimental evaluation was conducted in a LAN environment by varying several system parameters, such as the fault types and the number of processes. The analysis shows that there is a significant gap between the theoretical and the practical performance results of these protocols, and provides an important insight into what actually happens during their execution.
Citations: 21
Lightweight Reflection for Middleware-based Database Replication
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.28
J. Salas, R. Jiménez-Peris, M. Patiño-Martínez, Bettina Kemme
Middleware-based database replication approaches have emerged in the last few years as an alternative to traditional database replication implemented within the database kernel. A middleware approach enables third-party vendors to provide high-availability solutions, a growing practice in the software industry. However, middleware solutions often lack scalability and exhibit a number of consistency and performance issues. The reason is that in most cases the middleware has to handle the database as a black box and hence cannot take advantage of the many optimizations implemented in the database kernel. Thus, middleware solutions often reimplement key functionality but cannot achieve the same efficiency as a kernel implementation. Reflection has been proposed during the last decade as a fruitful paradigm to separate non-functional aspects from functional ones, simplifying software development and maintenance whilst fostering reuse. However, fully reflective databases are not feasible due to the high cost of reflection. Our claim is that by exposing some minimal database functionality through a lightweight reflective interface, efficient and scalable middleware database replication can be attained. In this paper we explore a wide variety of such lightweight reflective interfaces and discuss what kind of replication algorithms they enable. We also discuss implementation alternatives for some of these interfaces and evaluate their performance.
Citations: 25
Generalised Repair for Overlay Networks
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.23
Barry Porter, François Taïani, G. Coulson
We present and evaluate a generic approach to the repair of overlay networks which identifies general principles of overlay repair and embodies these as a reusable service. At the heart of our approach is an algorithm that discovers the extent of a failed section of any type of overlay and assigns responsibility for carrying out the repair. The repair strategy itself is 'pluggable' and can be tailored to the requirements of a specific overlay type or instance. Our approach is efficient in terms of the number of repair-related message exchanges it incurs; scalable in that it involves only nodes in the locality of the failed section of the overlay; and resilient in that it correctly handles cases in which multiple adjacent nodes fail simultaneously, and it tolerates new failures that occur while a repair is underway. The benefits of our approach are that: (i) it extracts and encapsulates best practice in repair for overlays; (ii) it simplifies the design and implementation of new overlays (because repair issues can be treated orthogonally to basic functionality); and (iii) it supports tailorable levels of dependability for overlays, including pluggable repair strategies.
Citations: 21
Satem: Trusted Service Code Execution across Transactions
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.42
Gang Xu, C. Borcea, L. Iftode
Web services and service-oriented architectures are becoming the de facto standard for Internet computing. A main problem faced by users of such services is how to ensure that the service code is trusted. While methods exist that guarantee trusted service code execution before a client-service transaction starts, there is no solution for extending this assurance to the entire lifetime of the transaction. This paper presents Satem, a service-aware trusted execution monitor that guarantees the trustworthiness of the service code across a whole transaction. The Satem architecture consists of an execution monitor residing in the operating system kernel on the service provider platform, a trust evaluator on the client platform, and a service commitment protocol. During this protocol, executed before every transaction, the client requests a commitment from the service platform that promises trusted code execution, and verifies it against its local policy. Subsequently, the monitor enforces this commitment for the duration of the transaction. To initialize the trust on the monitor, we use the Trusted Platform Module specified by the Trusted Computing Group. We implemented Satem under the Linux 2.6.12 kernel and tested it for a Web service and DNS. The experimental results demonstrate that Satem does not impose significant overhead on the protected services and does not impact the unprotected services.
Citations: 11
A Scalable Services Architecture
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.7
Tudor Marian, K. Birman, R. V. Renesse
Data centers constructed as clusters of inexpensive machines have compelling cost-performance benefits, but developing services to run on them can be challenging. This paper reports on a new framework, the scalable services architecture (SSA), which helps developers build scalable clustered applications. The work is focused on non-transactional high-performance applications; these are poorly supported in existing platforms. A primary goal was to keep the SSA as small and simple as possible. Key elements include a TCP-based "chain replication" mechanism and a gossip-based subsystem for managing configuration data and repairing inconsistencies after faults. Our experimental results confirm the effectiveness of the approach.
Citations: 7
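The "chain replication" idea named in the SSA abstract can be illustrated with a small in-process sketch. This is not the SSA implementation: the abstract gives no code, and the class `ChainNode` and helper `build_chain` are hypothetical names chosen for illustration. The sketch assumes the generic form of chain replication, in which a write enters at the head, is forwarded node by node, and is acknowledged by the tail; the TCP transport and the gossip-based repair described in the abstract are left out.

```python
class ChainNode:
    """One replica in a chain (hypothetical class, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.store = {}        # this replica's copy of the data
        self.successor = None  # next node in the chain, None for the tail

    def write(self, key, value):
        # Apply the update locally, then forward it down the chain.
        self.store[key] = value
        if self.successor is not None:
            return self.successor.write(key, value)
        # The tail has no successor, so it acknowledges the write.
        return self.name


def build_chain(names):
    """Link nodes in order: the first is the head, the last is the tail."""
    nodes = [ChainNode(n) for n in names]
    for current, following in zip(nodes, nodes[1:]):
        current.successor = following
    return nodes


chain = build_chain(["head", "middle", "tail"])
acked_by = chain[0].write("x", 1)   # writes always enter at the head
tail_value = chain[-1].store["x"]   # reads are served by the tail
```

Because the acknowledgement comes from the tail, any value the tail can return has already been applied by every node ahead of it in the chain.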
Improvements and Reconsideration of Distributed Snapshot Protocols
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.26
A. Agbaria
Distributed snapshots are an important building block for distributed systems and, among other applications, are useful for constructing efficient checkpointing protocols. In addition to the overhead imposed by the existing distributed snapshot protocols, those protocols are not trivially applicable (if at all) in many of today's distributed systems, e.g., grid, mobile, and sensor systems. After presenting the shortcomings and the inapplicability of the most popular existing distributed snapshot protocols, this paper discusses improvement directions for the protocols. In addition, it presents a new and important improvement for the most popular distributed snapshot protocol, which was presented by Chandy and Lamport in 1985. Although the proposed improvement is simple and easy to implement, it has significant benefits in reducing the software and hardware overheads of distributed snapshots. Then, the paper presents proofs for the safety and progress of the new protocol. Lastly, it presents a performance analysis of the protocol using stochastic models.
Citations: 3
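For readers unfamiliar with the baseline that the abstract above refers to, the following is a minimal sketch of the classic Chandy-Lamport marker protocol. It does not show the improvement proposed in the paper, which the abstract does not detail. The sketch assumes reliable FIFO channels, modeled here as `deque` objects in a `network` dictionary, and a toy application in which two processes transfer integer amounts; the `Process` class and the scripted schedule are hypothetical.

```python
from collections import deque

MARKER = "MARKER"


class Process:
    """A process in a toy money-transfer system (hypothetical example)."""

    def __init__(self, pid, balance, peers):
        self.pid = pid
        self.balance = balance
        self.peers = peers          # ids of the other processes
        self.recorded_state = None  # local state captured by the snapshot
        self.channel_log = {}       # sender id -> messages caught in flight
        self.recording = set()      # incoming channels still being recorded

    def start_snapshot(self, network):
        # Record local state, then send a marker on every outgoing channel.
        self.recorded_state = self.balance
        self.recording = set(self.peers)
        self.channel_log = {p: [] for p in self.peers}
        for p in self.peers:
            network[(self.pid, p)].append(MARKER)

    def send(self, dest, amount, network):
        self.balance -= amount
        network[(self.pid, dest)].append(amount)

    def receive(self, src, network):
        msg = network[(src, self.pid)].popleft()
        if msg == MARKER:
            if self.recorded_state is None:
                self.start_snapshot(network)  # first marker seen
            self.recording.discard(src)       # channel from src is complete
        else:
            self.balance += msg
            if src in self.recording:
                self.channel_log[src].append(msg)  # message was in flight


network = {(0, 1): deque(), (1, 0): deque()}
p0 = Process(0, 100, [1])
p1 = Process(1, 50, [0])

p1.send(0, 20, network)     # 20 is in flight when the snapshot starts
p0.start_snapshot(network)  # p0 initiates the snapshot
p0.send(1, 10, network)     # sent after the marker, so outside the snapshot
p0.receive(1, network)      # p0 logs the in-flight 20
p1.receive(0, network)      # p1 sees the marker and records its state
p1.receive(0, network)      # p1 receives the post-snapshot 10
p0.receive(1, network)      # p0 sees p1's marker; the snapshot is complete

snapshot_total = (p0.recorded_state + p1.recorded_state
                  + sum(p0.channel_log[1]) + sum(p1.channel_log[0]))
```

The recorded process states plus the recorded channel contents add up to the same total as the live system, which is the consistency property a snapshot is meant to provide.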
Non-Blocking Synchronous Checkpointing Based on Rollback-Dependency Trackability
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.34
T. Sakata, Islene C. Garcia
This article proposes an original approach that applies the rollback-dependency trackability (RDT) property to implement a new non-blocking synchronous checkpointing protocol, called RDT-NBS, that takes mutable checkpoints and efficiently supports concurrent initiators. Mutable checkpoints can be saved in non-stable storage and make it possible for non-blocking synchronous checkpointing protocols to save a minimal number of checkpoints in stable storage during the construction of a consistent global checkpoint. We prove that this minimality property does not hold in the presence of concurrent checkpointing initiations. Even so, RDT-NBS uses mutable checkpoints to reduce the use of stable memory while assuring the existence of a consistent global checkpoint in stable storage. We also present simulation results that compare RDT-NBS to quasi-synchronous RDT.
Citations: 4
Decentralized Local Failure Detection in Dynamic Distributed Systems
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.16
Nigamanth Sridhar
A failure detector is an important building block when constructing fault-tolerant distributed systems. In asynchronous distributed systems, failed processes are often indistinguishable from slow processes. A failure detector is an oracle that can intelligently suspect processes to have failed. Different classes of failure detectors have been proposed to solve different kinds of problems. Almost all of this work focuses on global failure detection and, moreover, on systems that contain no mobile nodes and have no dynamic topologies. In this paper, we present ◇Pml, a local failure detector that can tolerate mobility and topology changes. This means that ◇Pml can distinguish between a failed process and a process that has moved away from its original location. We also establish an upper bound on the duration for which a process wrongly suspects a node that has moved away from its neighborhood. We support our theoretical results with experimental findings from an implementation of this algorithm for sensor networks.
Citations: 37
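The notion of a detector that "suspects" processes, as described in the abstract above, can be made concrete with a heartbeat-based sketch. This is not the algorithm from the paper, which additionally handles mobility and topology changes and is not specified in the abstract. The sketch assumes a common textbook construction: each neighbor sends periodic heartbeats, a neighbor is suspected once its heartbeat is overdue, and a wrong suspicion doubles that neighbor's timeout so that mistakes eventually stop under partial synchrony. The class name `HeartbeatDetector` and the explicit `now` arguments are illustrative choices.

```python
class HeartbeatDetector:
    """Local failure detector over a fixed neighbor set (illustrative sketch)."""

    def __init__(self, neighbors, initial_timeout=1.0):
        self.timeout = {n: initial_timeout for n in neighbors}
        self.last_seen = {n: 0.0 for n in neighbors}
        self.suspected = set()

    def on_heartbeat(self, node, now):
        """Record a heartbeat from `node` received at time `now`."""
        self.last_seen[node] = now
        if node in self.suspected:
            # The suspicion was wrong: trust the node again and wait
            # longer next time before suspecting it.
            self.suspected.discard(node)
            self.timeout[node] *= 2

    def check(self, now):
        """Suspect every neighbor whose heartbeat is overdue at time `now`."""
        for node, seen in self.last_seen.items():
            if now - seen > self.timeout[node]:
                self.suspected.add(node)
        return set(self.suspected)


detector = HeartbeatDetector(["a", "b"])
detector.on_heartbeat("a", 0.5)
detector.on_heartbeat("b", 0.5)
first = detector.check(2.0)      # both heartbeats are overdue
detector.on_heartbeat("a", 2.1)  # "a" was only slow; its timeout doubles
second = detector.check(3.5)     # only "b" remains suspected
```

Passing the clock in as `now` keeps the sketch deterministic; a deployed detector would read a local clock and run `check` on a timer.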
WRAPS: Denial-of-Service Defense through Web Referrals
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.48
Xiaofeng Wang, M. Reiter
The Web is a complicated graph, with millions of Web sites interlinked together. In this paper, we propose to use this Web sitegraph structure to mitigate flooding attacks on a Web site, using a new Web referral architecture for privileged service ("WRAPS"). WRAPS allows a legitimate client to obtain a privileged URL through a click on a referral hyperlink from a Web site trusted by the target Web site. Using that URL, the client can get privileged access to the target Web site in a manner that is far less vulnerable to a DDoS flooding attack. WRAPS does not require changes to Web client software and is extremely lightweight for referrer Web sites, which eases its deployment. The massive scale of the Web sitegraph could deter attempts to isolate a Web site through blocking all referrers. We present the design of WRAPS, and the implementation of a prototype system used to evaluate our proposal. Our empirical study demonstrates that WRAPS enables legitimate clients to connect to a Web site smoothly in spite of an intensive flooding attack, at the cost of small overheads on the edge routers of the Web site's ISP.
Citations: 8
Recovering from Distributable Thread Failures with Assured Timeliness in Real-Time Distributed Systems
Pub Date : 2006-10-02 DOI: 10.1109/SRDS.2006.38
Edward Curley, J. Anderson, B. Ravindran, E. Jensen
We consider the problem of recovering from failures of distributable threads with assured timeliness. When a node hosting a portion of a distributable thread fails, it causes orphans - i.e., thread segments that are disconnected from the thread's root. We consider a termination model for recovering from such failures, where the orphans must be detected and aborted, and failure-exception notification must be delivered to the farthest contiguous surviving thread segment for resuming thread execution. We present a real-time scheduling algorithm called AUA, and a distributable thread integrity protocol called TP-TR. We show that AUA and TP-TR bound the orphan cleanup and recovery time, thereby bounding thread starvation durations, and maximize the total thread accrued timeliness utility. We implement AUA and TP-TR in a real-time middleware that supports distributable threads. Our experimental studies with the implementation validate the time-bounded recovery property of the algorithm and protocol and confirm their effectiveness.
Citations: 24
Venue: 2006 25th IEEE Symposium on Reliable Distributed Systems (SRDS'06)