异步分布式系统中的故障检测

Raimundo José de Araújo, Macêdo, Campus de Ondina
{"title":"异步分布式系统中的故障检测","authors":"Raimundo José de Araújo, Macêdo, Campus de Ondina","doi":"10.5753/wtf.2000.23478","DOIUrl":null,"url":null,"abstract":"Being able to detect failures is an important issue in designing fault-tolerant distributed systems. However, the actual behaviour of a system limits the ability to provide such a mechanism. From one extreme of the spectrum, synchronous systems (i.e., with bounded message transmission delay and processing times) allow for the construction of perfect failure detection based simply on local timeouts. At the other extreme, accurate failure detection cannot be developed for asynchronous systems (i.e. systems with no bounds on message transmission delays and processing times), unless some extra properties can be guaranteed, such the ones specified in a seminal article by Chandra and Toueg [1]. The present paper discusses the requirements and describes the implementations of failure detectors for two important fault-tolerant mechanisms meant to asynchronous environments: process group membership and <>S Failure Detector based distributed consensus [1]. These implementations are based on a mechanism called the Time Connectivity Indicator, introduced in this paper.","PeriodicalId":356716,"journal":{"name":"Anais do II Workshop de Testes e Tolerância a Falhas (WTF 2000)","volume":"284 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2000-07-15","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"13","resultStr":"{\"title\":\"Failure Detection in Asynchronous Distributed Systems\",\"authors\":\"Raimundo José de Araújo, Macêdo, Campus de Ondina\",\"doi\":\"10.5753/wtf.2000.23478\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Being able to detect failures is an important issue in designing fault-tolerant distributed systems. However, the actual behaviour of a system limits the ability to provide such a mechanism. From one extreme of the spectrum, synchronous systems (i.e., with bounded message transmission delay and processing times) allow for the construction of perfect failure detection based simply on local timeouts. At the other extreme, accurate failure detection cannot be developed for asynchronous systems (i.e. systems with no bounds on message transmission delays and processing times), unless some extra properties can be guaranteed, such the ones specified in a seminal article by Chandra and Toueg [1]. The present paper discusses the requirements and describes the implementations of failure detectors for two important fault-tolerant mechanisms meant to asynchronous environments: process group membership and <>S Failure Detector based distributed consensus [1]. These implementations are based on a mechanism called the Time Connectivity Indicator, introduced in this paper.\",\"PeriodicalId\":356716,\"journal\":{\"name\":\"Anais do II Workshop de Testes e Tolerância a Falhas (WTF 2000)\",\"volume\":\"284 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2000-07-15\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"13\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"Anais do II Workshop de Testes e Tolerância a Falhas (WTF 2000)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.5753/wtf.2000.23478\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"Anais do II Workshop de Testes e Tolerância a Falhas (WTF 2000)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.5753/wtf.2000.23478","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 13

摘要

能够检测故障是设计容错分布式系统的一个重要问题。然而,系统的实际行为限制了提供这种机制的能力。从频谱的一个极端来看,同步系统(即,具有有限的消息传输延迟和处理时间)允许仅基于本地超时构建完美的故障检测。在另一个极端,异步系统(即消息传输延迟和处理时间没有限制的系统)无法开发准确的故障检测,除非可以保证一些额外的属性,例如Chandra和Toueg[1]在一篇开创性文章中指定的那些属性。本文讨论了两种重要的异步环境容错机制的故障检测器的需求和实现:进程组成员和基于分布式共识[1]的故障检测器。这些实现基于本文介绍的称为时间连接指示器的机制。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
Failure Detection in Asynchronous Distributed Systems
Being able to detect failures is an important issue in designing fault-tolerant distributed systems. However, the actual behaviour of a system limits the ability to provide such a mechanism. From one extreme of the spectrum, synchronous systems (i.e., with bounded message transmission delay and processing times) allow for the construction of perfect failure detection based simply on local timeouts. At the other extreme, accurate failure detection cannot be developed for asynchronous systems (i.e. systems with no bounds on message transmission delays and processing times), unless some extra properties can be guaranteed, such the ones specified in a seminal article by Chandra and Toueg [1]. The present paper discusses the requirements and describes the implementations of failure detectors for two important fault-tolerant mechanisms meant to asynchronous environments: process group membership and <>S Failure Detector based distributed consensus [1]. These implementations are based on a mechanism called the Time Connectivity Indicator, introduced in this paper.
求助全文
通过发布文献求助,成功后即可免费获取论文全文。 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Diagnóstico em Redes de Topologia Arbitrária: Um Algoritmo Baseado em Inundação de Mensagens Adicionando Replicação utilizando Componentes de Software e um Ambiente Interativo O Agente Chinês para Diagnóstico de Redes de Topologia Arbitrária Reliability Requirements in Mobile Agent Systems Experiência com a Implementação de um Injetor de Falhas em Linux
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1