Tolerating Faults in Disaggregated Datacenters
A. Carbonari, Ivan Beschastnikh
Proceedings of the 16th ACM Workshop on Hot Topics in Networks, 2017-11-30
DOI: 10.1145/3152434.3152447 (https://doi.org/10.1145/3152434.3152447)
Citations: 33
Abstract
Recent research shows that disaggregated datacenters (DDCs) are practical and that DDC resource modularity will benefit both users and operators. This paper explores the implications of disaggregation on application fault tolerance. We expect that resource failures in a DDC will be fine-grained because resources will no longer fate-share. In this context, we look at how DDCs can provide legacy applications with familiar failure semantics and discuss fate sharing granularities that are not available in existing datacenters. We argue that fate sharing and failure mitigation should be programmable, specified by the application, and primarily implemented in the SDN-based network.