{"title":"An introduction to the analysis and debug of distributed computations","authors":"E. Fromentin, N. Plouzeau, M. Raynal","doi":"10.1109/ICAPP.1995.472239","DOIUrl":null,"url":null,"abstract":"Distributed programs are much more difficult to design, understand and implement than sequential or parallel ones. This is mainly due to the uncertainty created by the asynchrony inherent to distributed machines. So appropriate concepts and tools have to be devised to help the programmer of distributed applications in his task. This paper is motivated by the practical problem called distributed debugging. It presents concepts and tools that help the programmer to analyze distributed executions. Two basic problems are addressed: replay of a distributed execution (how to reproduce an equivalent execution despite of asynchrony) and the detection of a stable or unstable property of a distributed execution. Concepts and tools presented are fundamental when designing an environment for distributed program development. This paper is essentially a survey presenting a state of the art in replay mechanisms and detection of unstable properties on global states of distributed executions.<<ETX>>","PeriodicalId":448130,"journal":{"name":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","volume":"56 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"1995-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings 1st International Conference on Algorithms and Architectures for Parallel Processing","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ICAPP.1995.472239","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12
Abstract
Distributed programs are much more difficult to design, understand and implement than sequential or parallel ones. This is mainly due to the uncertainty created by the asynchrony inherent to distributed machines. So appropriate concepts and tools have to be devised to help the programmer of distributed applications in his task. This paper is motivated by the practical problem called distributed debugging. It presents concepts and tools that help the programmer to analyze distributed executions. Two basic problems are addressed: replay of a distributed execution (how to reproduce an equivalent execution despite of asynchrony) and the detection of a stable or unstable property of a distributed execution. Concepts and tools presented are fundamental when designing an environment for distributed program development. This paper is essentially a survey presenting a state of the art in replay mechanisms and detection of unstable properties on global states of distributed executions.<>