{"title":"Partition-Tolerant Distributed Publish/Subscribe Systems","authors":"R. Kazemzadeh, H. Jacobsen","doi":"10.1109/SRDS.2011.21","DOIUrl":null,"url":null,"abstract":"In this paper, we develop reliable distributed publish/subscribe algorithms that can tolerate concurrent failure of up to d broker machines or communication links. In our approach, d is a configuration parameter which determines the level of fault-tolerance of the system and reliability refers to exactly-once and per-source, in-order delivery of publications to clients with matching subscriptions. We propose protocols to address three problems in presence of broker or link failures: (i) subscription propagation, (ii) publication forwarding, and (iii) broker recovery. Finally, we study the effectiveness of our approach when the number of concurrent failures exceeds d. Through large-scale experimental evaluations with up to 500 brokers, we demonstrate that a system configured with a modest value of d = 3 is able to reliably deliver 97% of publications in presence of failure of up to 17% of its brokers.","PeriodicalId":116805,"journal":{"name":"2011 IEEE 30th International Symposium on Reliable Distributed Systems","volume":"81 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2011-10-04","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"23","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2011 IEEE 30th International Symposium on Reliable Distributed Systems","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/SRDS.2011.21","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 23
Abstract
In this paper, we develop reliable distributed publish/subscribe algorithms that can tolerate concurrent failure of up to d broker machines or communication links. In our approach, d is a configuration parameter which determines the level of fault-tolerance of the system and reliability refers to exactly-once and per-source, in-order delivery of publications to clients with matching subscriptions. We propose protocols to address three problems in presence of broker or link failures: (i) subscription propagation, (ii) publication forwarding, and (iii) broker recovery. Finally, we study the effectiveness of our approach when the number of concurrent failures exceeds d. Through large-scale experimental evaluations with up to 500 brokers, we demonstrate that a system configured with a modest value of d = 3 is able to reliably deliver 97% of publications in presence of failure of up to 17% of its brokers.