Efficiently Recovering Stateful System Components of Multi-server Microkernels
Wentai Li, Jinyu Gu, Nian Liu, B. Zang
2021 IEEE 41st International Conference on Distributed Computing Systems (ICDCS), July 2021
DOI: https://doi.org/10.1109/ICDCS51616.2021.00054
Microkernel OSes provide OS services through mutually isolated system servers running in separate user processes, which gives them stronger fault isolation than monolithic OSes. When it comes to fault recovery, however, most existing microkernel OSes do no more than restart a faulty server. The restarted server loses all of its runtime state, which may affect every application that relies on it. In this paper, we present TxIPC, a mechanism that efficiently recovers stateful system servers on microkernel OSes. Because a system server provides its service through inter-process communication (IPC), TxIPC makes the server fault-resilient by handling each IPC in a transaction-like manner. Specifically, if a fault occurs in a server while it is handling an IPC, TxIPC aborts all the updates made by that IPC and thereby recovers the server from the fault. Evaluations show that TxIPC enables servers to recover from 99.8% of injected faults with 3%-45% performance overhead on application benchmarks, significantly outperforming existing counterparts.
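The transaction-like handling described in the abstract can be illustrated with a small user-level sketch. This is not the TxIPC implementation: the abstract does not say how updates are tracked or undone, so the sketch simply assumes a deep-copy snapshot taken before each request. The names `ToyServer`, `handle_ipc_transactionally`, and the `put_then_fault` operation are hypothetical and exist only for illustration.

```python
import copy


class ToyServer:
    """Hypothetical stateful server (illustration only, not from the paper)."""

    def __init__(self):
        self.state = {}  # stands in for a server's runtime state

    def handle(self, request):
        op, key, value = request
        if op == "put":
            self.state[key] = value
            return "ok"
        if op == "put_then_fault":
            self.state[key] = value               # partial update is applied...
            raise RuntimeError("injected fault")  # ...then the handler faults
        raise ValueError(f"unknown op: {op}")


def handle_ipc_transactionally(server, request):
    """Handle one request; if the handler faults, discard that request's updates.

    Assumption: state is checkpointed with a deep copy before the request.
    The real mechanism may track updates very differently.
    """
    snapshot = copy.deepcopy(server.state)  # checkpoint before this request
    try:
        return ("committed", server.handle(request))
    except Exception as fault:
        server.state = snapshot             # abort: undo only this request
        return ("aborted", str(fault))


server = ToyServer()
first = handle_ipc_transactionally(server, ("put", "a", 1))
second = handle_ipc_transactionally(server, ("put_then_fault", "b", 2))
```

After the second call, the state written by the first request is still present, while the faulting request's partial update is gone. A plain restart would instead have reset the server to an empty state, which is the loss the abstract attributes to restart-only recovery.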