In this paper, we investigate the k-set consensus problem in asynchronous distributed systems. In this problem, each participating process begins the protocol with an input value and by the end of the protocol must decide on one value so that at most k total values are decided by all correct processes. We extend previous work by exploring several variations of the problem definition and model, including for the first time investigation of Byzantine failures. We show that the precise definition of the validity requirement, which characterizes what decision values are allowed as a function of the input values and whether failures occur, is crucial to the solvability of the problem. For example, we show that allowing default decisions in case of failures makes the problem solvable for most values of k despite a minority of failures, even in face of the most severe type of failures (Byzantine). We introduce six validity conditions for this problem (all considered in various contexts in the literature), and demarcate the line between possible and impossible for each case. In many cases, this line is different from the one of the originally studied k-set consensus problem. Index Terms—Agreement problems, Byzantine failures, consensus, crash failures, distributed systems, validity conditions. E
{"title":"On k-set consensus problems in asynchronous systems","authors":"R. Prisco, D. Malkhi, M. Reiter","doi":"10.1145/301308.301368","DOIUrl":"https://doi.org/10.1145/301308.301368","url":null,"abstract":"In this paper, we investigate the k-set consensus problem in asynchronous distributed systems. In this problem, each participating process begins the protocol with an input value and by the end of the protocol must decide on one value so that at most k total values are decided by all correct processes. We extend previous work by exploring several variations of the problem definition and model, including for the first time investigation of Byzantine failures. We show that the precise definition of the validity requirement, which characterizes what decision values are allowed as a function of the input values and whether failures occur, is crucial to the solvability of the problem. For example, we show that allowing default decisions in case of failures makes the problem solvable for most values of k despite a minority of failures, even in face of the most severe type of failures (Byzantine). We introduce six validity conditions for this problem (all considered in various contexts in the literature), and demarcate the line between possible and impossible for each case. In many cases, this line is different from the one of the originally studied k-set consensus problem. Index Terms—Agreement problems, Byzantine failures, consensus, crash failures, distributed systems, validity conditions. E","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"73 1","pages":"7-21"},"PeriodicalIF":0.0,"publicationDate":"1999-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74819665","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mobile computing raises many new issues such as lack of stable storage, low bandwidth of wireless channel, high mobility, and limited battery life. These new issues make traditional checkpointing algorithms unsuitable. Coordinated checkpointing is an attractive approach for transparently adding fault tolerance to distributed applications since it avoids domino effects and minimizes the stable storage requirement. However, it suffers from high overhead associated with the checkpointing process in mobile computing systems. Two approaches have been used to reduce the overhead: First is to minimize the number of synchronization messages and the number of checkpoints; the other is to make the checkpointing process nonblocking. These two approaches were orthogonal previously until the Prakash-Singhal algorithm (28) combined them. However, we (8) found that this algorithm may result in an inconsistency in some situations and we proved that there does not exist a nonblocking algorithm which forces only a minimum number of processes to take their checkpoints. In this paper, we introduce the concept of "mutable checkpoint," which is neither a tentative checkpoint nor a permanent checkpoint, to design efficient checkpointing algorithms for mobile computing systems. Mutable checkpoints can be saved anywhere, e.g., the main memory or local disk of MHs. In this way, taking a mutable checkpoint avoids the overhead of transferring large amounts of data to the stable storage at MSSs over the wireless network. We present techniques to minimize the number of mutable checkpoints. Simulation results show that the overhead of taking mutable checkpoints is negligible. Based on mutable checkpoints, our nonblocking algorithm avoids the avalanche effect and forces only a minimum number of processes to take their checkpoints on the stable storage.
{"title":"Mutable checkpoints: a new checkpointing approach for mobile computing systems","authors":"G. Cao, M. Singhal","doi":"10.1145/301308.301371","DOIUrl":"https://doi.org/10.1145/301308.301371","url":null,"abstract":"Mobile computing raises many new issues such as lack of stable storage, low bandwidth of wireless channel, high mobility, and limited battery life. These new issues make traditional checkpointing algorithms unsuitable. Coordinated checkpointing is an attractive approach for transparently adding fault tolerance to distributed applications since it avoids domino effects and minimizes the stable storage requirement. However, it suffers from high overhead associated with the checkpointing process in mobile computing systems. Two approaches have been used to reduce the overhead: First is to minimize the number of synchronization messages and the number of checkpoints; the other is to make the checkpointing process nonblocking. These two approaches were orthogonal previously until the Prakash-Singhal algorithm (28) combined them. However, we (8) found that this algorithm may result in an inconsistency in some situations and we proved that there does not exist a nonblocking algorithm which forces only a minimum number of processes to take their checkpoints. In this paper, we introduce the concept of \"mutable checkpoint,\" which is neither a tentative checkpoint nor a permanent checkpoint, to design efficient checkpointing algorithms for mobile computing systems. Mutable checkpoints can be saved anywhere, e.g., the main memory or local disk of MHs. In this way, taking a mutable checkpoint avoids the overhead of transferring large amounts of data to the stable storage at MSSs over the wireless network. We present techniques to minimize the number of mutable checkpoints. Simulation results show that the overhead of taking mutable checkpoints is negligible. Based on mutable checkpoints, our nonblocking algorithm avoids the avalanche effect and forces only a minimum number of processes to take their checkpoints on the stable storage.","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"17 3","pages":"157-172"},"PeriodicalIF":0.0,"publicationDate":"1999-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"91487051","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a new solution to the group mutual exclusion problem recently posed by Joung. In this problem, processes repeatedly request access to various "sessions." It is required that distinct processes are not in different sessions concurrently, that multiple processes may be in the same session concurrently, and that each process that tries to enter a session is eventually able to do so. This problem is a generalization of the mutual exclusion and readers-writers problems. Our algorithm and its correctness proof are substantially simpler than Joung's. This simplicity is achieved by building upon known solutions to the more specific mutual exclusion problem. Our algorithm also has various advantages over Joung's, depending on the choice of mutual exclusion algorithm used. These advantages include admitting a process to its session in constant time in the absence of contention, spinning locally in Cache Coherent (CC) and Nonuniform Memory Access (NUMA) systems, and improvements in the complexity measures proposed by Joung.
{"title":"A simple local-spin group mutual exclusion algorithm","authors":"P. Keane, Mark Moir","doi":"10.1145/301308.301319","DOIUrl":"https://doi.org/10.1145/301308.301319","url":null,"abstract":"This paper presents a new solution to the group mutual exclusion problem recently posed by Joung. In this problem, processes repeatedly request access to various \"sessions.\" It is required that distinct processes are not in different sessions concurrently, that multiple processes may be in the same session concurrently, and that each process that tries to enter a session is eventually able to do so. This problem is a generalization of the mutual exclusion and readers-writers problems. Our algorithm and its correctness proof are substantially simpler than Joung's. This simplicity is achieved by building upon known solutions to the more specific mutual exclusion problem. Our algorithm also has various advantages over Joung's, depending on the choice of mutual exclusion algorithm used. These advantages include admitting a process to its session in constant time in the absence of contention, spinning locally in Cache Coherent (CC) and Nonuniform Memory Access (NUMA) systems, and improvements in the complexity measures proposed by Joung.","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"24 1","pages":"673-685"},"PeriodicalIF":0.0,"publicationDate":"1999-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"72799443","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
{"title":"Guest Editors' Introduction: Special Issue on Compilers and Languages for Parallel and Distributed Computers","authors":"Yingchun Zhu, L. Hendren","doi":"10.1109/TPDS.1999.10002","DOIUrl":"https://doi.org/10.1109/TPDS.1999.10002","url":null,"abstract":"","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"16 7 1","pages":"97-98"},"PeriodicalIF":0.0,"publicationDate":"1999-01-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"82762878","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
In many real applications, for example those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This paper presents a new cost model, called LoGPC, that extends the LogP [9] and LogGP [4] models to account for the impact of network contention and network interface DMA behavior on the performance of message-passing programs.We validate LoGPC by analyzing three applications implemented with Active Messages [11, 18] on the MIT Alewife multiprocessor. Our analysis shows that network contention accounts for up to 50% of the total execution time. In addition, we show that the impact of communication locality on the communication costs is at most a factor of two on Alewife. Finally, we use the model to identify tradeoffs between synchronous and asynchronous message passing styles.
{"title":"LoGPC: modeling network contention in message-passing programs","authors":"C. A. Moritz, M. Frank","doi":"10.1145/277851.277933","DOIUrl":"https://doi.org/10.1145/277851.277933","url":null,"abstract":"In many real applications, for example those with frequent and irregular communication patterns or those using large messages, network contention and contention for message processing resources can be a significant part of the total execution time. This paper presents a new cost model, called LoGPC, that extends the LogP [9] and LogGP [4] models to account for the impact of network contention and network interface DMA behavior on the performance of message-passing programs.We validate LoGPC by analyzing three applications implemented with Active Messages [11, 18] on the MIT Alewife multiprocessor. Our analysis shows that network contention accounts for up to 50% of the total execution time. In addition, we show that the impact of communication locality on the communication costs is at most a factor of two on Alewife. Finally, we use the model to identify tradeoffs between synchronous and asynchronous message passing styles.","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"51 1","pages":"404-415"},"PeriodicalIF":0.0,"publicationDate":"1998-06-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"83504793","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper describes our experiences developing high-performance code for astrophysical N-body simulations. Recent N-body methods are based on an adaptive tree structure. The tree must be built and maintained across physically distributed memory; moreover, the communication requirements are irregular and adaptive. Together with the need to balance the computational work-load among processors, these issues pose interesting challenges and tradeoffs for high-performance implementation. Our implementation was guided by the need to keep solutions simple and general. We use a technique for implicitly representing a dynamic global tree across multiple processors which substantially reduces the programming complexity as well as the performance overheads of distributed memory architectures. The contributions include methods to vectorize the computation and minimize communication time which are theoretically and experimentally justified. The code has been tested by varying the number and distribution of bodies on different configurations of the Connection Machine CM-5. The overall performance on instances with 10 million bodies is typically over 30% of the peak machine rate. Preliminary timings compare favorably with other approaches.
{"title":"Experiences with parallel N-body simulation","authors":"Pangfeng Liu, S. Bhatt","doi":"10.1145/181014.181081","DOIUrl":"https://doi.org/10.1145/181014.181081","url":null,"abstract":"This paper describes our experiences developing high-performance code for astrophysical N-body simulations. Recent N-body methods are based on an adaptive tree structure. The tree must be built and maintained across physically distributed memory; moreover, the communication requirements are irregular and adaptive. Together with the need to balance the computational work-load among processors, these issues pose interesting challenges and tradeoffs for high-performance implementation.\u0000Our implementation was guided by the need to keep solutions simple and general. We use a technique for implicitly representing a dynamic global tree across multiple processors which substantially reduces the programming complexity as well as the performance overheads of distributed memory architectures. The contributions include methods to vectorize the computation and minimize communication time which are theoretically and experimentally justified.\u0000The code has been tested by varying the number and distribution of bodies on different configurations of the Connection Machine CM-5. The overall performance on instances with 10 million bodies is typically over 30% of the peak machine rate. Preliminary timings compare favorably with other approaches.","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"83 1","pages":"1306-1323"},"PeriodicalIF":0.0,"publicationDate":"1994-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"74189078","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Studies the use of randomized routing in multistage networks. While log N additional randomizing stages are needed to break "spatial locality", within each permutation, only log log N additional randomizing stages are needed to break "temporal locality" among successive permutations. Thus, log N bits of initial randomization per input, followed by log log N bits of randomization per packet are sufficient to ensure that t permutations are delivered in time t+log N. We present simulation results that validate this analysis.
{"title":"Randomized routing with shorter paths","authors":"E. Upfal, S. A. Felperin, M. Snir","doi":"10.1145/165231.166106","DOIUrl":"https://doi.org/10.1145/165231.166106","url":null,"abstract":"Studies the use of randomized routing in multistage networks. While log N additional randomizing stages are needed to break \"spatial locality\", within each permutation, only log log N additional randomizing stages are needed to break \"temporal locality\" among successive permutations. Thus, log N bits of initial randomization per input, followed by log log N bits of randomization per packet are sufficient to ensure that t permutations are delivered in time t+log N. We present simulation results that validate this analysis.","PeriodicalId":13128,"journal":{"name":"IEEE Trans. Parallel Distributed Syst.","volume":"53 1","pages":"356-362"},"PeriodicalIF":0.0,"publicationDate":"1993-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"84320565","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}