Fault tolerance tradeoffs in moving from decentralized to centralized embedded systems
Jennifer Morris, D. Kroening, P. Koopman
International Conference on Dependable Systems and Networks, 2004
Pub Date: 2004-06-28
DOI: 10.1109/DSN.2004.1311907 (https://doi.org/10.1109/DSN.2004.1311907)
Some safety-critical distributed embedded systems may need to use centralized components to achieve certain dependability properties. The difficulty in combining centralized and distributed architectures is achieving the potential benefits of centralization without giving up properties that motivated the use of a distributed approach in the first place. This paper examines the impact on fault tolerance of adding selected centralized components to distributed embedded systems, and possible approaches to choosing an appropriate configuration. We consider the proposed use of a star topology with centralized bus guardians in the time-triggered architecture. We model systems with different levels of centralized control in their star couplers, and compare fault tolerance properties in the presence of star-coupler faults. We demonstrate that buffering entire frames in the star coupler could lead to failures in startup and integration. We also show that constraining buffer size imposes restrictions on frame size and clock rates.
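The closing claim of the abstract above, that a bounded buffer restricts frame size and clock rates, can be illustrated with a first-order calculation. The sketch below is not the paper's model. It assumes a coupler that forwards bits while it is still receiving, so the buffer only has to absorb the drift between the sender's clock and the coupler's clock accumulated over one frame. The function names and the parts-per-million drift parameter are hypothetical and chosen only for illustration.

```python
PPM = 1_000_000  # parts per million


def min_buffer_bits(frame_bits: int, rate_diff_ppm: int) -> int:
    """Smallest buffer (in bits) that absorbs clock drift over one frame.

    Assumption: drift in bits = frame_bits * relative clock-rate difference.
    """
    if frame_bits < 0 or rate_diff_ppm < 0:
        raise ValueError("frame_bits and rate_diff_ppm must be non-negative")
    # Integer ceiling division, avoiding floating-point rounding.
    return -((-frame_bits * rate_diff_ppm) // PPM)


def max_frame_bits(buffer_bits: int, rate_diff_ppm: int) -> int:
    """Longest frame (in bits) a given buffer can tolerate under the same assumption."""
    if buffer_bits < 0:
        raise ValueError("buffer_bits must be non-negative")
    if rate_diff_ppm <= 0:
        raise ValueError("without drift this model gives no bound on frame size")
    return buffer_bits * PPM // rate_diff_ppm


if __name__ == "__main__":
    # A 10,000-bit frame with clocks that differ by 200 ppm.
    print(min_buffer_bits(10_000, 200))
    # The same 200 ppm drift with only a 2-bit buffer.
    print(max_frame_bits(2, 200))
```

Under this simplified model, halving the buffer halves the longest tolerable frame for a fixed clock-rate difference, which is the kind of coupling between buffer size, frame size, and clock rates that the abstract reports.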

Fault detection and isolation techniques for quasi delay-insensitive circuits
Christopher LaFrieda, R. Manohar
International Conference on Dependable Systems and Networks, 2004
Pub Date: 2004-06-28
DOI: 10.1109/DSN.2004.1311875 (https://doi.org/10.1109/DSN.2004.1311875)
This paper presents a circuit fault detection and isolation technique for quasi delay-insensitive asynchronous circuits. We achieve fault isolation by a combination of physical layout and circuit techniques. The asynchronous nature of quasi delay-insensitive circuits combined with layout techniques makes the design tolerant to delay faults. Circuit techniques are used to make sections of the design robust to non-delay faults. The combination of these is an asynchronous defect-tolerant circuit where a large class of faults is tolerated, and the remaining faults can be both detected easily and isolated to a small region of the design.

Assured reconfiguration of embedded real-time software
E. Strunk, J. Knight
International Conference on Dependable Systems and Networks, 2004
Pub Date: 2004-06-28
DOI: 10.1109/DSN.2004.1311906 (https://doi.org/10.1109/DSN.2004.1311906)
It is often the case that safety-critical systems have to be reconfigured during operation because of issues such as changes in the system's operating environment or the failure of software or hardware components. Operational systems exist that are capable of reconfiguration, but previous research and the techniques employed in operational systems for the most part either have not addressed the issue of assurance or have been developed in an ad hoc manner. In this paper we present a comprehensive approach to assured reconfiguration, providing a framework for formal verification that allows the developer of a reconfigurable system to use a set of application-level properties to show general reconfiguration properties. The properties and design are illustrated through an example from NASA's runway incursion prevention system.

Automated synthesis of multitolerance
S. Kulkarni, Ali Ebnenasir
International Conference on Dependable Systems and Networks, 2004
Pub Date: 2004-06-28
DOI: 10.1109/DSN.2004.1311891 (https://doi.org/10.1109/DSN.2004.1311891)
We concentrate on automated synthesis of multitolerant programs, i.e., programs that tolerate multiple classes of faults and provide a (possibly) different level of fault-tolerance to each class. We consider three levels of fault-tolerance: (1) failsafe, where in the presence of faults, the synthesized program guarantees safety, (2) nonmasking, where in the presence of faults, the synthesized program recovers to states from where its safety and liveness are satisfied, and (3) masking, where in the presence of faults, the synthesized program satisfies safety and recovers to states from where its safety and liveness are satisfied. We focus on the automated synthesis of finite-state multitolerant programs in the high atomicity model, where the program can read and write all its variables in an atomic step. We show that if one needs to add failsafe (respectively, nonmasking) fault-tolerance to one class of faults and masking fault-tolerance to another class of faults, then such addition can be done in polynomial time in the state space of the fault-intolerant program. However, if one needs to add failsafe fault-tolerance to one class of faults and nonmasking fault-tolerance to another class of faults, then the resulting problem is NP-complete. We find this result to be counterintuitive since adding failsafe and nonmasking fault-tolerance to the same class of faults (which is equivalent to adding masking fault-tolerance to that class of faults) can be done in polynomial time, whereas adding failsafe fault-tolerance to one class of faults and nonmasking fault-tolerance to a different class of faults is NP-complete.
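The three tolerance levels defined in the abstract above differ only in which of two guarantees hold when faults occur: safety, and recovery to states from which safety and liveness are satisfied. The sketch below encodes that relationship as a small lookup. It is an illustration of the definitions only, not the synthesis algorithm from the paper, and the names `Tolerance`, `classify`, and the example fault classes are hypothetical.

```python
from enum import Enum


class Tolerance(Enum):
    INTOLERANT = "intolerant"
    FAILSAFE = "failsafe"      # safety holds in the presence of faults
    NONMASKING = "nonmasking"  # recovery holds in the presence of faults
    MASKING = "masking"        # both safety and recovery hold


def classify(safe_under_faults: bool, recovers: bool) -> Tolerance:
    """Map the two guarantees named in the abstract to a tolerance level."""
    if safe_under_faults and recovers:
        return Tolerance.MASKING
    if safe_under_faults:
        return Tolerance.FAILSAFE
    if recovers:
        return Tolerance.NONMASKING
    return Tolerance.INTOLERANT


if __name__ == "__main__":
    # A multitolerant program assigns a (possibly different) level to each
    # fault class; the fault class names here are made up for the example.
    guarantees = {
        "message_loss": (True, True),
        "node_crash": (True, False),
        "state_corruption": (False, True),
    }
    for fault_class, (safe, recovers) in guarantees.items():
        print(fault_class, classify(safe, recovers).value)
```

Seen this way, masking is the conjunction of failsafe and nonmasking for one fault class, which is why the abstract calls the NP-completeness of mixing failsafe and nonmasking across different classes counterintuitive.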

Impact of path diversity on multi-homed and overlay networks
Junghee Han, F. Jahanian
International Conference on Dependable Systems and Networks, 2004
Pub Date: 2004-06-28
DOI: 10.1109/DSN.2004.1311874 (https://doi.org/10.1109/DSN.2004.1311874)
Multi-homed and overlay networks are two widely studied approaches aimed at leveraging the inherent redundancy of the Internet's underlying routing infrastructure to enhance end-to-end application performance and availability. However, the effectiveness of these approaches depends on the natural diversity of redundant paths between two end hosts in terms of physical links, routing infrastructure, administrative control and geographical distribution. This paper quantitatively analyzes the impact of path diversity on multi-homed and overlay networks and highlights several inherent limitations of these architectures in exploiting the full potential redundancy of the Internet. We based our analysis on traceroutes and routing table data collected from several vantage points in the Internet, including looking glasses at ten major Internet service providers (ISPs), RouteViews servers from twenty ISPs, and more than fifty PlanetLab nodes globally distributed across the Internet. Our study motivates a research direction: constructing topology-aware multi-homing and overlay networks for better availability.
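One simple way to make "path diversity" concrete is to count the links that two traceroute-style paths have in common. The sketch below is an assumption made for illustration and is not the metric used in the paper: it treats each path as a list of hop names, treats links as undirected, and reports the Jaccard overlap of the two link sets. The hop names and function names are hypothetical.

```python
from typing import FrozenSet, List, Set

Link = FrozenSet[str]


def links(path: List[str]) -> Set[Link]:
    """Undirected links between consecutive hops of a path."""
    return {frozenset(pair) for pair in zip(path, path[1:])}


def shared_links(path_a: List[str], path_b: List[str]) -> Set[Link]:
    """Links that appear on both paths; each one is a shared point of failure."""
    return links(path_a) & links(path_b)


def link_overlap(path_a: List[str], path_b: List[str]) -> float:
    """Jaccard overlap of the two link sets: 0.0 is fully disjoint, 1.0 is identical."""
    union = links(path_a) | links(path_b)
    if not union:
        return 0.0
    return len(links(path_a) & links(path_b)) / len(union)


if __name__ == "__main__":
    # Two hypothetical paths that leave through different providers
    # but converge on the same last-hop link.
    via_isp1 = ["src", "r1", "r2", "dst"]
    via_isp2 = ["src", "r3", "r2", "dst"]
    print(shared_links(via_isp1, via_isp2))
    print(link_overlap(via_isp1, via_isp2))
```

In this toy example the two paths look redundant at the first hop yet still share the final link, the kind of hidden overlap that limits how much availability multi-homing or an overlay can add.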