Design of Multi-threaded Fault-Tolerant Connection-Oriented Communication

N. Ivaki, Filipe Araújo, F. Barros
{"title":"Design of Multi-threaded Fault-Tolerant Connection-Oriented Communication","authors":"N. Ivaki, Filipe Araújo, F. Barros","doi":"10.1109/PRDC.2014.10","DOIUrl":null,"url":null,"abstract":"Fault-tolerance is vital for dependable distributed applications that can deliver service, even in the presence of faults. Over the last few decades, above all protocols proposed to offer reliability and fault-tolerance, TCP grew to become one of the cornerstones of the Internet. However, despite emulating reliable communication in distributed environments, TCP does not handle connection failures when the connectivity is lost for some time, even if both endpoints are still running. When this occurs, developers must rollback the peers to some coherent state, many times with error-prone, ad hoc, or custom application-level solutions. In this paper, we refine the Acceptor-Connector design pattern to tackle the TCP unreliability problem. The pattern decouples the failure-related processing from the connection and service processing, efficiently handling different connections and their possible crashes concurrently, thereby yielding more reusable, extensible, and efficient distributed communication. The solution we propose incorporates proven multi-threaded solutions and a buffering scheme that discards the need for an application-layer acknowledgment scheme. This simplifies the development of reliable connection-oriented applications using the ubiquitous TCP protocol.","PeriodicalId":187000,"journal":{"name":"2014 IEEE 20th Pacific Rim International Symposium on Dependable Computing","volume":"22 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2014-11-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"4","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2014 IEEE 20th Pacific Rim International Symposium on Dependable Computing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/PRDC.2014.10","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 4

Abstract

Fault-tolerance is vital for dependable distributed applications that can deliver service, even in the presence of faults. Over the last few decades, above all protocols proposed to offer reliability and fault-tolerance, TCP grew to become one of the cornerstones of the Internet. However, despite emulating reliable communication in distributed environments, TCP does not handle connection failures when the connectivity is lost for some time, even if both endpoints are still running. When this occurs, developers must rollback the peers to some coherent state, many times with error-prone, ad hoc, or custom application-level solutions. In this paper, we refine the Acceptor-Connector design pattern to tackle the TCP unreliability problem. The pattern decouples the failure-related processing from the connection and service processing, efficiently handling different connections and their possible crashes concurrently, thereby yielding more reusable, extensible, and efficient distributed communication. The solution we propose incorporates proven multi-threaded solutions and a buffering scheme that discards the need for an application-layer acknowledgment scheme. This simplifies the development of reliable connection-oriented applications using the ubiquitous TCP protocol.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
面向连接的多线程容错通信设计
容错对于可靠的分布式应用程序至关重要,即使在存在故障的情况下也可以提供服务。在过去的几十年里,在所有提出提供可靠性和容错性的协议中,TCP逐渐成为互联网的基石之一。然而,尽管在分布式环境中模拟可靠的通信,TCP在连接丢失一段时间后并不处理连接失败,即使两个端点仍在运行。当发生这种情况时,开发人员必须将对等点回滚到某种一致状态,很多时候使用容易出错的、特别的或自定义的应用程序级解决方案。在本文中,我们改进了Acceptor-Connector设计模式来解决TCP的不可靠性问题。该模式将与故障相关的处理与连接和服务处理解耦,有效地处理不同的连接及其可能并发的崩溃,从而产生更可重用、可扩展和高效的分布式通信。我们提出的解决方案结合了经过验证的多线程解决方案和缓冲方案,该方案放弃了对应用层确认方案的需求。这简化了使用无处不在的TCP协议开发可靠的面向连接的应用程序。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Reduction of NBTI-Induced Degradation on Ring Oscillators in FPGA Region-Adherent Algorithms: Restricting the Impact of Faults on Service Quality CloudBFT: Elastic Byzantine Fault Tolerance Reliable Shortest Paths in Wireless Sensor Networks: Refocusing on Link Failure Scenarios from Applications Responsiveness of Service Discovery in Wireless Mesh Networks
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1