Sayali Salvi, Daniel Kästner, C. Ferdinand, Tom Bienmüller
In this article we present an approach that couples model-based testing with static analysis, based on a tool coupling between Astrée and EmbeddedTester. Astrée reports all potential run-time errors in C programs. This makes it possible to prove the absence of run-time errors, but users may have to deal with false alarms, i.e., spurious notifications about potential run-time errors. Investigating alarms to determine whether they are true errors that have to be fixed, or merely false alarms, can require significant effort. The key idea of this work is to apply model-based testing to automatically find test vectors for alarms reported by the static analyzer. When a test vector reproducing the error has been found, the alarm is proven to be a true error; when EmbeddedTester's model-checking-based CV engine shows that no error can occur, the alarm is proven to be false. This can significantly reduce the alarm-analysis effort and the level of expertise needed to perform code-level software verification.
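The core idea can be illustrated with a toy sketch (ours, not the paper's tooling): a hypothetical alarm about a possible division by zero becomes a test goal, and a search over candidate inputs either produces a concrete witness or finds none. The function names and inputs here are invented for illustration.

```python
# Toy illustration (ours, not the paper's tooling) of the core idea: a
# hypothetical static-analysis alarm -- a possible division by zero --
# becomes a test goal, and a search over candidate inputs either yields
# a concrete witness (true error) or finds none in the searched space.
def scaled_ratio(x: int, y: int) -> int:
    # Hypothetical alarm: the divisor below may be zero.
    return (x * 100) // (y - 10)

def find_test_vector(candidates):
    """Return an input vector that triggers the alarm, or None."""
    for x, y in candidates:
        try:
            scaled_ratio(x, y)
        except ZeroDivisionError:
            return (x, y)  # concrete witness: the alarm is a true error
    return None

print(find_test_vector((x, y) for x in range(3) for y in range(12)))  # (0, 10)
```

Unlike this brute-force search, the CV engine described in the abstract is model-checking-based, so a negative result there constitutes a proof that the alarm is false, not merely an exhausted search space.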
"Exploiting Synergies between Static Analysis and Model-Based Testing." 2015 11th European Dependable Computing Conference (EDCC), 2015-09-07. DOI: 10.1109/EDCC.2015.20
We address the problem of group formation in automotive cooperative applications using wireless vehicle-to-vehicle communication. Group formation (GF) is an essential step in bootstrapping self-organizing distributed applications such as virtual traffic lights. We propose a synchronous GF algorithm and investigate its behaviour in the presence of an unbounded number of asymmetric communication failures (receive omissions). Given that GF is an agreement problem, we know from previous research that it is impossible to design a GF algorithm that can guarantee agreement on the group membership in the presence of an unbounded number of message losses. Thus, under this assumption, disagreement is an unavoidable outcome of a GF algorithm. We consider two types of disagreement (failure modes): safe and unsafe disagreement. To reduce the probability of unsafe disagreement, our algorithm uses a local oracle to estimate the number of nodes that are attempting to participate in the GF process. (Such estimates can be provided by roadside sensors or local sensors in a vehicle, such as cameras.) For the proposed algorithm, we show how the probabilities of unsafe and safe disagreement vary for different system settings as a function of the probability of message loss. We also show how these probabilities vary depending on the correctness of the local oracles. More specifically, our results show that unsafe disagreement occurs only if the local oracle underestimates the number of participating nodes.
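To illustrate how disagreement probability can be studied as a function of the message-loss probability, here is a toy Monte Carlo sketch; it is our illustration, not the paper's synchronous algorithm or its analysis, and it uses the crudest possible notion of disagreement (any difference among local views).

```python
import random

# Toy Monte Carlo sketch (not the paper's algorithm or analysis): n nodes
# each broadcast a join message; each reception is lost independently with
# probability p (receive omission). A node's view is the set of senders it
# heard (including itself); disagreement means the views are not identical.
def disagreement_probability(n: int, p: float, trials: int = 5000) -> float:
    rng = random.Random(1)  # fixed seed for reproducibility
    disagreements = 0
    for _ in range(trials):
        views = set()
        for receiver in range(n):
            heard = frozenset(
                sender for sender in range(n)
                if sender == receiver or rng.random() >= p
            )
            views.add(heard)
        if len(views) > 1:
            disagreements += 1
    return disagreements / trials

for p in (0.0, 0.1, 0.3):
    print(p, disagreement_probability(4, p))
```

Even this crude model shows the qualitative trend the abstract describes: with no message loss there is no disagreement, and the disagreement probability grows quickly with p.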
"On the Probability of Unsafe Disagreement in Group Formation Algorithms for Vehicular Ad Hoc Networks," Negin Fathollahnejad, R. Pathan, J. Karlsson. DOI: 10.1109/EDCC.2015.29
B. Sangchoolie, Fatemeh Ayatolahi, R. Johansson, J. Karlsson
ISA-level fault injection, i.e. the injection of bit-flip faults in Instruction Set Architecture (ISA) registers and main memory words, is widely used for studying the impact of transient and intermittent hardware faults in computer systems. This paper compares two techniques for ISA-level fault injection: inject-on-read and inject-on-write. The first technique injects bit-flips in a data-item (the content of a register or memory word) just before the data-item is read by a machine instruction, while the second one injects bit-flips in a data-item just after it has been updated by a machine instruction. In addition, the paper compares two variants of inject-on-read, one where all faults are given the same weight and one where weight factors are used to reflect the time a data-item spends in a register or memory word. The weighted inject-on-read variant aims to accurately model soft errors that occur when an ionizing particle perturbs a data-item while it resides in an ISA register or a memory word. This is in contrast to inject-on-write, which emulates errors that propagate into an ISA register or a memory word. Our experiments show significant differences in the results obtained with the three techniques.
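The difference between the two injection techniques can be sketched in a few lines. This is an illustrative model of ours, not the authors' tool; the register names and the single-shot fault policy are assumptions made for the example.

```python
# Illustrative model (ours, not the authors' tool) of single-shot ISA-level
# bit-flip injection on a register file. Inject-on-read flips a bit of the
# target register just before it is read; inject-on-write flips a bit just
# after the register has been written.
class RegisterFile:
    def __init__(self, mode: str, bit: int, target: str):
        assert mode in ("read", "write")
        self.regs = {}
        self.mode, self.bit, self.target = mode, bit, target
        self.armed = True  # inject exactly one fault

    def _maybe_flip(self, name: str, value: int) -> int:
        if self.armed and name == self.target:
            self.armed = False
            return value ^ (1 << self.bit)
        return value

    def write(self, name: str, value: int) -> None:
        if self.mode == "write":  # corrupt the value being stored
            value = self._maybe_flip(name, value)
        self.regs[name] = value

    def read(self, name: str) -> int:
        value = self.regs[name]
        if self.mode == "read":  # corrupt the value being consumed
            value = self._maybe_flip(name, value)
            self.regs[name] = value
        return value

rf = RegisterFile(mode="read", bit=0, target="r1")
rf.write("r1", 4)
print(rf.read("r1"))  # 5: bit 0 flipped just before the read
```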
"A Comparison of Inject-on-Read and Inject-on-Write in ISA-Level Fault Injection." DOI: 10.1109/EDCC.2015.24
L. J. Saiz, P. Gil, J. Gracia, D. Gil, J. Baraza-Calvo
Error correction codes (ECCs) are commonly used in computer systems to protect information from errors. For example, single error correction (SEC) codes are frequently used for memory protection. Due to continuous technology scaling, soft errors in registers have become a major concern, and ECCs are required to protect them. Nevertheless, using an ECC increases delay, area and power consumption. Accordingly, ECCs have traditionally been designed to minimize the number of redundant bits added. This is important in memories, as these bits are added to every word in the whole memory. It matters less in registers, however, where minimizing the encoding and decoding delay can be more valuable. This paper proposes a method to develop codes with 1-gate-delay encoders and 4-gate-delay decoders, independently of the word length. These codes have been designed to correct single errors only in data bits, to reduce the overhead.
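As a generic illustration of SEC encoding and decoding, here is a textbook Hamming(7,4) code; note that the paper's ultrafast codes are different constructions (1-gate-delay encoders that correct errors in data bits only), so this sketch only shows what "single error correction" means.

```python
# Textbook Hamming(7,4) single-error-correcting code, shown as a generic
# SEC illustration; the paper's ultrafast codes use shallower encoders and
# correct errors in data bits only.
def encode(d):                      # d: 4 data bits d[0]..d[3]
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]   # codeword positions 1..7

def decode(c):                      # c: 7-bit codeword with at most 1 flip
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]  # checks positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]  # checks positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]  # checks positions 4,5,6,7
    syndrome = s1 + 2 * s2 + 4 * s3 # 1-based position of the error, 0 if none
    if syndrome:
        c = c.copy()
        c[syndrome - 1] ^= 1        # correct the flipped bit
    return [c[2], c[4], c[5], c[6]] # recovered data bits

word = [1, 0, 1, 1]
cw = encode(word)
cw[4] ^= 1                          # inject a single bit-flip (a data bit)
print(decode(cw))  # [1, 0, 1, 1]
```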
"Ultrafast Single Error Correction Codes for Protecting Processor Registers." DOI: 10.1109/EDCC.2015.30
Horst Schirmeier, Martin Hoffmann, Christian J. Dietrich, M. Lenz, D. Lohmann, O. Spinczyk
Due to voltage and structure shrinking, the influence of radiation on a circuit's operation increases, resulting in future hardware designs exhibiting much higher rates of soft errors. Software developers have to cope with these effects to ensure functional safety. However, software-based hardware fault tolerance is a holistic property that is tricky to achieve in practice, potentially impaired by every single design decision. We present FAIL*, an open and versatile architecture-level fault-injection (FI) framework for the continuous assessment and quantification of fault tolerance in an iterative software development process. FAIL* supplies the developer with reusable and composable FI campaigns, advanced pre- and post-processing analyses to easily identify sensitive spots in the software, well-abstracted back-end implementations for several hardware and simulator platforms, and scalability of FI campaigns through massive parallelization. We describe FAIL*, its application to the development process of safety-critical software, and the lessons learned from a real-world example.
"FAIL*: An Open and Versatile Fault-Injection Framework for the Assessment of Software-Implemented Hardware Fault Tolerance." DOI: 10.1109/EDCC.2015.28
Software is continually evolving to fix bugs and add new features. Industry users, however, often value stability, and thus may not be able to update their code base to the latest versions. This raises the need to selectively backport new features to older software versions. Traditionally, backporting has been done by cluttering the backported code with preprocessor directives that replace behaviors unsupported in an earlier version with appropriate workarounds. This approach, however, involves writing a lot of error-prone backporting code, and results in implementations that are hard to read and maintain. We consider this issue in the context of the Linux kernel, for which older versions are in wide use. We present a new backporting strategy that relies on the use of a backporting compatibility library and on code that is automatically generated using the program transformation tool Coccinelle. This approach reduces the amount of code that must be manually written, and thus can help the Linux kernel backporting effort scale while maintaining the dependability of the backporting process.
"Increasing Automation in the Backporting of Linux Drivers Using Coccinelle," Luis R. Rodriguez, J. Lawall. DOI: 10.1109/EDCC.2015.23
Technology advances provide a myriad of advantages for VLSI systems, but they also increase the sensitivity of combinational logic to different fault profiles. Ever shorter faults that until now had been filtered out, termed fugacious faults, require new attention, as they are considered a plausible warning sign of potential failures. Despite their increasing impact on modern VLSI systems, such faults are largely not considered by the safety industry today. Their early detection is, however, critical to enable an early evaluation of potential risks for the system and the subsequent deployment of suitable failure-avoidance mechanisms. For instance, the early detection of fugacious faults provides the means to extend the mission time of a system through the temporal avoidance of aging effects. Because classical detection mechanisms are not suited to cope with such fugacious faults, this paper proposes a method specifically designed to detect and diagnose them. Reported experiments show the feasibility and value of the proposal.
"Increasing the Dependability of VLSI Systems through Early Detection of Fugacious Faults," Jaime Espinosa, D. Andrés, P. Gil. DOI: 10.1109/EDCC.2015.13
Peter Munk, Mohammad Shadi Al Hakeem, Raphael Lisicki, Helge Parzyjegla, Jan Richling, Hans-Ulrich Heiß
Commercial-off-the-shelf (COTS) many-core processors offer the performance needed for computationally intensive safety-critical real-time applications such as autonomous driving. However, these consumer-grade many-core processors are increasingly susceptible to faults because of their highly integrated design. In this paper, we present a fault-tolerance framework that eases the use of COTS many-core processors for safety-critical applications. Our framework employs an adaptable software-based fault-tolerance mechanism that combines N-modular redundancy (NMR) with a repair process and a rejuvenating round-robin voting scheme. A Stochastic Activity Network (SAN) model of the fault-tolerance mechanism allows the framework to adapt the parameters of the mechanism such that a specified target availability is achieved with minimum overhead. Experiments on a cycle-accurate simulator empirically confirm the correctness of the SAN model and evaluate the overhead of the framework.
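The voting mechanism described above can be sketched minimally; this is our illustration under stated assumptions, not the framework's implementation, and it abstracts replica state to a single value per replica.

```python
from collections import Counter

# Minimal sketch (ours, not the framework's implementation) of NMR with
# majority voting, a repair step, and a rotating voter: the replica that
# acts as voter changes round-robin each round, so no single replica stays
# permanently in charge, and outvoted replicas are repaired by overwriting
# them with the majority state.
def nmr_step(replicas, round_no):
    voter = round_no % len(replicas)       # round-robin voter rotation
    majority, _ = Counter(replicas).most_common(1)[0]
    repaired = [majority] * len(replicas)  # repair outvoted replicas
    return majority, voter, repaired

replicas = [42, 42, 7]                     # one corrupted replica (N = 3)
result, voter, replicas = nmr_step(replicas, round_no=0)
print(result, replicas)  # 42 [42, 42, 42]
```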
"Toward a Fault-Tolerance Framework for COTS Many-Core Systems." DOI: 10.1109/EDCC.2015.32
Previous approaches to fault and failure modeling are based on adding explicit models of faults/failures and of failure propagation to behavioral and architectural models. This adds considerable overhead (extra work) and is also a source of inconsistencies, especially mismatches between failures and violations of requirements or specifications. Instead of creating separate models for failures, the idea here is to exploit the fundamental definition of a failure as the violation of a requirement or specification. We assume that the system's functionality is specified using a set of requirements, in particular requirements structured according to contracts theory. Instead of creating separate models for failure propagation, we exploit the structuring of requirements obtained when the system is specified using contracts theory. The use of contracts theory establishes a formal framework for how traceability links between requirements themselves and to the architecture are specified. It is further explained how fault and failure propagation models in the form of Bayesian networks are obtained. One particular challenge is the modeling of faults/failures and their propagation when fault management mechanisms have been implemented; therefore, this area is covered in some extra depth.
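A failure-propagation Bayesian network of the kind mentioned above can be evaluated by plain enumeration. The two-component structure and all probability values below are assumed for illustration; the paper derives the actual network structure from the contract hierarchy.

```python
# Tiny hand-rolled sketch (illustrative only; the paper derives the network
# structure from the contract hierarchy) of failure propagation as a
# Bayesian network over two components: B violates its guarantee if its own
# internal fault occurs, or if a failure of A breaks the assumption of B's
# contract. All probabilities below are assumed values for illustration.
P_FAULT_A = 0.01      # prior probability of an internal fault in A
P_FAULT_B = 0.02      # prior probability of an internal fault in B
P_PROPAGATE = 0.5     # P(A's failure violates B's assumption)

def p_failure_b() -> float:
    """P(B fails), by enumerating the states of its parent A."""
    total = 0.0
    for a_fails in (False, True):
        p_a = P_FAULT_A if a_fails else 1 - P_FAULT_A
        p_b_given_a = P_FAULT_B + (1 - P_FAULT_B) * (P_PROPAGATE if a_fails else 0.0)
        total += p_a * p_b_given_a
    return total

print(round(p_failure_b(), 6))  # 0.0249
```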
"Failure Propagation Modeling Based on Contracts Theory," M. Nyberg, Jonas Westman. DOI: 10.1109/EDCC.2015.21
We study theoretical and practical aspects of five of the best-known self-stabilizing dining philosophers algorithms. We theoretically prove that three of them are incorrect. For practical evaluation, we simulate all algorithms and evaluate their fault tolerance, latency, and throughput of critical-section access. We present a new combined algorithm that achieves the better throughput of the two remaining correct algorithms by determining the system load and switching between these basic algorithms. We prove the combined algorithm correct, simulate it, and study its performance characteristics.
"Evaluating and Optimizing Stabilizing Dining Philosophers," Jordan Adamek, Mikhail Nesterenko, S. Tixeuil. DOI: 10.1109/EDCC.2015.11