Philip M. Thambidurai, You-Keun Park, Kishor S. Trivedi
On reliability modelling of fault-tolerant distributed systems
Proceedings of the 9th International Conference on Distributed Computing Systems, 1989 (published 1989-06-05)
DOI: 10.1109/ICDCS.1989.37941
Citations: 11
Abstract
The problem of predicting the reliability of a distributed system based on the principles of Byzantine agreement is addressed. The system is considered inoperable or failed if Byzantine agreement cannot be guaranteed. The reliability models depend on a unified model of interactive consistency, which is based on a unique fault taxonomy appropriate for distributed systems. The unified model takes advantage of the fact that some faults may not be of an arbitrary nature, while still allowing for the fact that some faults may be arbitrary. A closed-form expression for the reliability and the mean time to failure of systems based on the unified model is derived. Each processor is allowed to have multiple failure modes, and the contribution of the interactive consistency algorithm is explicitly taken into account. The practical value of this unified model in designing ultrareliable systems is demonstrated by several examples.
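
As a rough illustration of the kind of calculation the abstract describes, the sketch below computes the reliability and mean time to failure of a system that is declared failed once Byzantine agreement can no longer be guaranteed. It does not reproduce the paper's unified model or its closed-form expressions. It assumes the classical worst case instead, in which every fault is arbitrary and N processors tolerate at most floor((N-1)/3) faults. It also assumes independent, identically distributed exponential processor lifetimes with no repair. The function names and the failure rate `lam` are hypothetical.

```python
from math import comb, exp


def max_tolerable_faults(n: int) -> int:
    """Largest m with n >= 3m + 1 (classical bound, all faults arbitrary)."""
    return (n - 1) // 3


def reliability(n: int, lam: float, t: float) -> float:
    """Probability that at most m of n processors have failed by time t.

    Assumption: independent exponential lifetimes with rate lam, no repair.
    """
    p_up = exp(-lam * t)          # one processor still working at time t
    m = max_tolerable_faults(n)
    return sum(
        comb(n, k) * (1.0 - p_up) ** k * p_up ** (n - k)
        for k in range(m + 1)
    )


def mttf(n: int, lam: float) -> float:
    """Mean time until the (m + 1)-th processor failure.

    With i working processors, the next failure arrives at rate i * lam,
    so the expected sojourn times 1 / (i * lam) are summed for
    i = n, n - 1, ..., n - m.
    """
    m = max_tolerable_faults(n)
    return sum(1.0 / (i * lam) for i in range(n - m, n + 1))
```

A model that separates non-arbitrary faults from arbitrary ones, as the abstract describes, would replace the single threshold in `max_tolerable_faults` with a condition on the number of faults of each type.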