Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022309
M. Demirbas, A. Arora
Refinement tools such as compilers do not necessarily preserve fault-tolerance. That is, given a fault-tolerant program in a high-level language as input, the output of a compiler in a lower-level language will not necessarily be fault-tolerant. We identify a type of refinement, namely "convergence refinement", that preserves the fault-tolerance property of stabilization. We illustrate the use of convergence refinement by presenting the first formal design of Dijkstra's little-understood 3-state stabilizing token-ring system. Our designs begin with simple, abstract token-ring systems that are not stabilizing, and then add an abstract "wrapper" to the systems so as to achieve stabilization. The system and the wrapper are then refined to obtain a concrete token-ring system, while preserving stabilization. In fact, the two are refined independently, which demonstrates that convergence refinement is amenable to "graybox" design of stabilizing implementations, i.e., design of system stabilization based solely on the system specification and without knowledge of system implementation details.
Title: "Convergence refinement". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022248
Chih-Lin Hu, Ming-Syan Chen
Data dissemination serves as a scalable data delivery mechanism in wireless networks. However, although broadcast traffic changes dynamically, most previous research assumed static workloads and access patterns and lacked proper traffic awareness. In this paper, we address the existence of client impatience and accordingly devise an online traffic awareness mechanism based on a novel selective deferment and reflection (SDR) technique to estimate the dynamic workloads and access patterns at the granularity of a broadcast cycle. In comparison with prior probing and feedback approaches, our design is practical in that it has low complexity and is lightweight, without performance degradation. Under various dynamic traffic scenarios, the experimental results show that with an increasing or decreasing workload, the real access frequency distribution is bounded by two specific estimated distributions. This fact in turn suggests employing a trigonometric tuning method to further enhance the estimation. In addition, we find that the mean difference between the estimated access frequency distribution and the real one is very small, which indicates the feasibility and reliability of our proposed data broadcast mechanism with traffic awareness.
Title: "Dynamic data broadcasting with traffic awareness". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022292
Alexander Leonhardi, K. Rothermel
Location-aware services are a promising way of exploiting the special possibilities created by ubiquitous mobile devices and wireless communication. Advanced location-aware applications will require highly accurate information about the geographic location of mobile objects and functionality that goes beyond simply querying the user's position, for example determining all mobile objects inside a certain geographic area. In this paper, we propose a generic large-scale location service, which has been designed with the goal of managing the highly dynamic location information for a large number of mobile objects, thus providing a common infrastructure that can be employed by location-aware applications. We propose a hierarchical distributed architecture, which can efficiently process these queries in a scalable way. To be able to deal with the frequent updates and queries resulting from highly dynamic location information, we propose a data storage component, which makes use of a main memory database.
Title: "Architecture of a large-scale location service". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
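The idea of answering area queries through a hierarchy of servers, each backed by an in-memory store, can be illustrated with a minimal sketch. Everything named here (`Rect`, `Server`, `update`, `range_query`) is hypothetical and chosen for illustration; the abstract does not specify the actual interfaces, and the plain dictionary stands in for the main memory database.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Rect:
    """Axis-aligned service area, half-open on the upper edges."""
    x1: float
    y1: float
    x2: float
    y2: float

    def contains(self, x, y):
        return self.x1 <= x < self.x2 and self.y1 <= y < self.y2

    def intersects(self, other):
        return (self.x1 < other.x2 and other.x1 < self.x2
                and self.y1 < other.y2 and other.y1 < self.y2)


@dataclass
class Server:
    """One node of the hierarchy; only leaves store object positions."""
    area: Rect
    children: list = field(default_factory=list)
    objects: dict = field(default_factory=dict)  # stand-in for a main memory database

    def remove(self, obj_id):
        self.objects.pop(obj_id, None)
        for child in self.children:
            child.remove(obj_id)

    def insert(self, obj_id, x, y):
        if not self.area.contains(x, y):
            raise ValueError("position outside service area")
        if not self.children:
            self.objects[obj_id] = (x, y)
            return
        for child in self.children:
            if child.area.contains(x, y):
                child.insert(obj_id, x, y)
                return
        raise ValueError("child areas do not cover this position")

    def update(self, obj_id, x, y):
        # Simplified handover: drop the old entry, then store the new one.
        self.remove(obj_id)
        self.insert(obj_id, x, y)

    def range_query(self, query):
        # Descend only into subtrees whose area overlaps the query rectangle.
        if not self.area.intersects(query):
            return []
        hits = [oid for oid, (x, y) in self.objects.items() if query.contains(x, y)]
        for child in self.children:
            hits += child.range_query(query)
        return sorted(hits)


west = Server(Rect(0, 0, 50, 100))
east = Server(Rect(50, 0, 100, 100))
root = Server(Rect(0, 0, 100, 100), children=[west, east])
root.update("car1", 10, 10)
root.update("car2", 60, 20)
root.update("car3", 45, 45)
print(root.range_query(Rect(40, 0, 70, 50)))
```

The query rectangle spans both leaf areas, so the root forwards it to both children; a query that fits inside one leaf would touch only that subtree, which is where the scalability argument comes from.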
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022269
E. Pagani, G. P. Rossi
The differentiated services model is the emerging standard for providing Quality-of-Service (QoS) support for multimedia applications in the future Internet. This model involves bandwidth broker agents that perform admission control and network configuration functions. A great deal of effort has recently been devoted to investigating viable approaches to implementing mechanisms that automatically perform the bandwidth broker functions, yet no standard policy has been proposed so far. In this paper we propose a distributed measurement-based protocol that performs admission control for multicast traffic in diff-serv networks. The protocol supports dynamic changes of the multicast group membership, operates on-demand, and supports the premium service. We prove that the proposed protocol performs an effective and efficient admission control function.
Title: "Distributed bandwidth broker for QoS multicast traffic". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
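As a rough illustration of measurement-based admission control for a multicast join, the sketch below admits a new receiver only if every link that would be newly added to the distribution tree has room for the stream. The function name `admit_join`, the link labels, and the rule "measured load plus requested rate must not exceed the premium capacity" are assumptions made for this example; the abstract does not describe the protocol's actual test or message exchange.

```python
def admit_join(tree_links, branch, rate_mbps, measured_mbps, capacity_mbps):
    """Decide whether a receiver may join over `branch` (a list of link ids).

    Links already in `tree_links` carry the stream, so only new links are checked.
    On success the new links are added to the tree (hypothetical bookkeeping).
    """
    new_links = [link for link in branch if link not in tree_links]
    fits = all(
        measured_mbps[link] + rate_mbps <= capacity_mbps[link]
        for link in new_links
    )
    if fits:
        tree_links.update(new_links)
    return fits


# Hypothetical premium-class capacities and currently measured loads, in Mbit/s.
capacity = {"a": 10.0, "b": 10.0, "c": 10.0}
measured = {"a": 9.0, "b": 2.0, "c": 9.5}
tree = {"a"}  # link "a" already carries the 1 Mbit/s stream

print(admit_join(tree, ["a", "b"], 1.0, measured, capacity))
print(admit_join(tree, ["a", "c"], 1.0, measured, capacity))
```

In a real measurement-based scheme the load figures would be refreshed by measurement after each admission rather than kept in a static table as they are here.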
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022290
J. Nakazawa, Y. Tobe, H. Tokuda
This paper proposes a middleware for home networks, called the Virtual Networked Appliance (VNA) architecture, in which the service description method and the Service to Service (S2S) communication mechanism are separated in an orthogonal way. Through this separation, the VNA architecture solves two problems of existing middleware technologies: aspect violation and middleware fragmentation. In this paper, we first clarify the two problems and their relationship. Then, we describe the proposed middleware architecture as a solution from the viewpoint of the overall configuration and the S2S communication mechanism.
Title: "A pluggable service-to-service communication mechanism for VNA architecture". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022294
F. Sultan, Kiran Srinivasan, D. Iyer, L. Iftode
Today's Internet services are commonly built over TCP, the standard Internet connection-oriented reliable transport protocol. The endpoint naming scheme of TCP, based on network layer (IP) addresses, creates an implicit binding between a service and the IP address of a server providing it, throughout the lifetime of a client connection. This makes a TCP client prone to all adverse conditions that may affect the server endpoint or the internetwork in between after the connection is established: congestion or failure in the network, or a server that is overloaded, has failed, or is under DoS attack. Studies that quantify the effects of network stability and route availability demonstrate that connectivity failures can significantly impact Internet services. As a result, although highly available servers can be deployed, sustaining continuous service remains a problem. We propose a cooperative service model, in which a pool of similar servers, possibly geographically distributed across the Internet, cooperate in sustaining a service by migrating client connections within the pool. The control traffic between servers, needed to support migrated connections, can be carried either over the Internet or over a private network. From the client's viewpoint, at any point during the lifetime of its service session, the remote endpoint of its connection may transparently migrate between servers.
Title: "Migratory TCP: connection migration for service continuity in the Internet". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022259
J. Flinn, Soyoung Park, M. Satyanarayanan
We describe Spectra, a remote execution system for battery-powered clients used in pervasive computing. Spectra enables applications to combine the mobility of small devices with the greater processing power of static compute servers. Spectra is self-tuning: it monitors both application resource usage and the availability of resources in the environment, and dynamically determines how and where to execute application components. In making this determination, Spectra balances the competing goals of performance, energy conservation, and application quality. We have validated Spectra's approach on the Compaq Itsy v2.2 and IBM ThinkPad 560X using a speech recognizer, a document preparation system, and a natural language translator. Our results confirm that Spectra almost always selects the best execution plan, and that its few suboptimal choices are very close to optimal.
Title: "Balancing performance, energy, and quality in pervasive computing". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
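To make the three-way trade-off concrete, the following sketch picks among candidate execution plans by scoring each on predicted latency, energy, and output quality. The `Plan` fields, the example numbers, and the scoring formula are all illustrative assumptions; the abstract does not give Spectra's actual utility function or its prediction machinery.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    latency_s: float   # predicted execution time
    energy_j: float    # predicted client energy use
    fidelity: float    # application quality in (0, 1]


def utility(plan, energy_weight):
    """Hypothetical score: higher quality is better, latency and energy are costs.

    energy_weight = 0 ignores energy (e.g. on wall power); larger values
    penalize energy use more strongly (e.g. on a draining battery).
    """
    return plan.fidelity / (plan.latency_s * plan.energy_j ** energy_weight)


def choose_plan(plans, energy_weight):
    return max(plans, key=lambda p: utility(p, energy_weight))


# Made-up predictions for one operation.
PLANS = [
    Plan("remote", latency_s=2.0, energy_j=30.0, fidelity=1.0),
    Plan("local", latency_s=6.0, energy_j=5.0, fidelity=1.0),
    Plan("local_reduced", latency_s=3.0, energy_j=2.5, fidelity=0.5),
]

print(choose_plan(PLANS, energy_weight=0.0).name)
print(choose_plan(PLANS, energy_weight=1.0).name)
```

With energy ignored, the fast remote plan scores highest; once energy counts fully, the reduced-quality local plan wins, which is the kind of shift between goals the abstract describes.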
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022239
Arturo Crespo, H. Garcia-Molina
Finding information in a peer-to-peer system currently requires either a costly and vulnerable central index, or flooding the network with queries. We introduce the concept of routing indices (RIs), which allow nodes to forward queries to neighbors that are more likely to have answers. If a node cannot answer a query, it forwards the query to a subset of its neighbors, based on its local RI, rather than by selecting neighbors at random or by flooding the network by forwarding the query to all neighbors. We present three RI schemes: the compound, the hop-count, and the exponential routing indices. We evaluate their performance via simulations, and find that RIs can improve performance by one or two orders of magnitude vs. a flooding-based system, and by up to 100% vs. a random forwarding system. We also discuss the tradeoffs between the different RI schemes and highlight the effects of key design variables on system performance.
Title: "Routing indices for peer-to-peer systems". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
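A small sketch may help show how a node could use a local routing index to rank its neighbors for a query. The table layout, the neighbor names, and the goodness estimate (total documents scaled by the fraction matching each query topic, treating topics as independent) are assumptions made for this example and are not claimed to be the exact estimator defined in the paper.

```python
from math import prod

# Hypothetical compound routing index held by one node: for each neighbor,
# the number of documents reachable through it, in total and per topic.
ROUTING_INDEX = {
    "B": {"total": 100, "by_topic": {"databases": 20, "languages": 30}},
    "C": {"total": 1000, "by_topic": {"databases": 0, "languages": 50}},
    "D": {"total": 200, "by_topic": {"databases": 100, "languages": 150}},
}


def goodness(row, topics):
    """Estimate how many reachable documents match all query topics."""
    total = row["total"]
    if total == 0:
        return 0.0
    return total * prod(row["by_topic"].get(t, 0) / total for t in topics)


def pick_neighbors(index, topics, fanout=1):
    """Return up to `fanout` neighbors, best first, skipping hopeless ones."""
    ranked = sorted(index, key=lambda n: goodness(index[n], topics), reverse=True)
    return [n for n in ranked[:fanout] if goodness(index[n], topics) > 0]


print(pick_neighbors(ROUTING_INDEX, ["databases", "languages"], fanout=2))
```

Neighbor C reaches the most documents overall but none on databases, so it is never chosen for this query; flooding would have forwarded to it anyway, and random forwarding would pick it a third of the time.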
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022253
Dan He, Gilles Muller, J. Lawall
Distributing video over the Internet is an increasingly important application. Nevertheless, the real-time and high bandwidth requirements of video make video distribution over today's Internet a challenge. Adaptive approaches can be used to respond to changes in bandwidth availability while limiting the effect of such changes on perceptual quality and resource consumption. However, most existing adaptation mechanisms have limited scalability and do not effectively exploit the heterogeneity of the Internet. In this paper, we describe the design and implementation of an MPEG video broadcasting service based on active networks. In an active network, routers can be programmed to make routing decisions based on local conditions. Because decisions are made locally, adaptation reacts rapidly to changing conditions and is unaffected by conditions elsewhere in the network. Programmability allows the adaptation policy to be tuned to the structure of the transmitted data, and to the properties of local clients. We use the PLAN-P domain-specific language for programming active routers; this language provides high-level abstractions and safety guarantees that allow complex protocols to be developed rapidly and reliably. Our experiments show that our approach to video distribution permits the decoding of up to 9 times as many frames in a heavily loaded network as distribution using standard routers.
Title: "Distributing MPEG movies over the Internet using programmable networks". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.
Pub Date: 2002-07-02; DOI: 10.1109/ICDCS.2002.1022271
S. Kulkarni, Ali Ebnenasir
In this paper, we focus on the problem of automating the addition of failsafe fault-tolerance, where fault-tolerance is added to an existing (fault-intolerant) program. A failsafe fault-tolerant program satisfies its specification (including safety and liveness) in the absence of faults. In the presence of faults, it satisfies its safety specification. We present a somewhat unexpected result that, in general, the problem of adding failsafe fault-tolerance to distributed programs is NP-hard. To show this, we reduce the 3-SAT problem to the problem of adding failsafe fault-tolerance. We also identify a class of specifications, monotonic specifications, and a class of programs, monotonic programs. Given a (positive) monotonic specification and a (negative) monotonic program, we show that failsafe fault-tolerance can be added in polynomial time. We note that the monotonicity restrictions are met for commonly encountered problems such as Byzantine agreement, distributed consensus, and atomic commitment. Finally, we argue that the restrictions on the specifications and programs are necessary to add failsafe fault-tolerance in polynomial time; we prove that if only one of these conditions is satisfied, the addition of failsafe fault-tolerance is still NP-hard.
Title: "The complexity of adding failsafe fault-tolerance". Published in: Proceedings 22nd International Conference on Distributed Computing Systems.