2015 11th European Dependable Computing Conference (EDCC): Latest Publications

Exploiting Synergies between Static Analysis and Model-Based Testing
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.20
Sayali Salvi, Daniel Kästner, C. Ferdinand, Tom Bienmüller
In this article we present an approach to couple model-based testing with static analysis, based on a tool coupling between Astrée and EmbeddedTester. Astrée reports all potential run-time errors in C programs. This makes it possible to prove the absence of run-time errors, but users may have to deal with false alarms, i.e. spurious notifications about potential run-time errors. Investigating alarms to find out whether they are true errors that have to be fixed or false alarms can require significant effort. The key idea of this work is to apply model-based testing to automatically find test vectors for alarms reported by the static analyzer. When a test vector reproducing the error is found, the alarm is proven to be a true error; when no error is found with EmbeddedTester's model checking-based CV engine, the alarm is proven to be a false alarm. This can significantly reduce the alarm analysis effort and the level of expertise needed to perform code-level software verification.
Citations: 1
On the Probability of Unsafe Disagreement in Group Formation Algorithms for Vehicular Ad Hoc Networks
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.29
Negin Fathollahnejad, R. Pathan, J. Karlsson
We address the problem of group formation in automotive cooperative applications using wireless vehicle-to-vehicle communication. Group formation (GF) is an essential step in bootstrapping self-organizing distributed applications such as virtual traffic lights. We propose a synchronous GF algorithm and investigate its behaviour in the presence of an unbounded number of asymmetric communication failures (receive omissions). Given that GF is an agreement problem, we know from previous research that it is impossible to design a GF algorithm that can guarantee agreement on the group membership in the presence of an unbounded number of message losses. Thus, under this assumption, disagreement is an unavoidable outcome of a GF algorithm. We consider two types of disagreement (failure modes): safe and unsafe disagreement. To reduce the probability of unsafe disagreement, our algorithm uses a local oracle to estimate the number of nodes that are attempting to participate in the GF process. (Such estimates can be provided by roadside sensors or local sensors in a vehicle such as cameras.) For the proposed algorithm, we show how the probability of unsafe and safe disagreement varies for different system settings as a function of the probability of message loss. We also show how these probabilities vary depending on the correctness of the local oracles. More specifically, our results show that unsafe disagreement occurs only if the local oracles underestimate the number of participating nodes.
Citations: 5
A Comparison of Inject-on-Read and Inject-on-Write in ISA-Level Fault Injection
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.24
B. Sangchoolie, Fatemeh Ayatolahi, R. Johansson, J. Karlsson
ISA-level fault injection, i.e. the injection of bit-flip faults in Instruction Set Architecture (ISA) registers and main memory words, is widely used for studying the impact of transient and intermittent hardware faults in computer systems. This paper compares two techniques for ISA-level fault injection: inject-on-read and inject-on-write. The first technique injects bit-flips in a data-item (the content of a register or memory word) just before the data-item is read by a machine instruction, while the second one injects bit-flips in a data-item just after it has been updated by a machine instruction. In addition, the paper compares two variants of inject-on-read, one where all faults are given the same weight and one where weight factors are used to reflect the time a data-item spends in a register or memory word. The weighted inject-on-read variant aims to accurately model soft errors that occur when an ionizing particle perturbs a data-item while it resides in an ISA register or a memory word. This is in contrast to inject-on-write, which emulates errors that propagate into an ISA register or a memory word. Our experiments show significant differences in the results obtained with the three techniques.
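The difference between the two injection points can be illustrated with a toy model. The following is a minimal sketch, not the authors' tool: the `Register` class, the `pending_*` fields, and the `read_weights` helper are hypothetical names introduced here, and the weighting rule (cycles elapsed since the previous access to the same location) is an assumption about how residency time could be measured.

```python
def flip_bit(value: int, bit: int, width: int = 32) -> int:
    """Return value with one bit inverted, truncated to width bits."""
    mask = (1 << width) - 1
    return (value ^ (1 << bit)) & mask


class Register:
    """Toy register that can corrupt its content on one read or one write."""

    def __init__(self, width: int = 32):
        self.width = width
        self.value = 0
        self.pending_read_fault = None   # bit to flip just before the next read
        self.pending_write_fault = None  # bit to flip just after the next write

    def write(self, value: int) -> None:
        self.value = value & ((1 << self.width) - 1)
        if self.pending_write_fault is not None:
            # inject-on-write: corrupt the data item right after it is updated
            self.value = flip_bit(self.value, self.pending_write_fault, self.width)
            self.pending_write_fault = None

    def read(self) -> int:
        if self.pending_read_fault is not None:
            # inject-on-read: corrupt the data item right before it is consumed
            self.value = flip_bit(self.value, self.pending_read_fault, self.width)
            self.pending_read_fault = None
        return self.value


def read_weights(trace):
    """Weight each read by the cycles since the previous access to that location.

    trace is a list of (cycle, op, location) tuples with op in {"r", "w"}.
    Assumed weighting rule for illustration only.
    """
    last_access = {}
    weights = []
    for cycle, op, loc in trace:
        if op == "r" and loc in last_access:
            weights.append((cycle, loc, cycle - last_access[loc]))
        last_access[loc] = cycle
    return weights
```

In this sketch an unweighted campaign would treat every read in the trace as equally likely to be hit, whereas the weighted variant would sample reads in proportion to the third element of each tuple returned by `read_weights`.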
Citations: 10
Ultrafast Single Error Correction Codes for Protecting Processor Registers
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.30
L. J. Saiz, P. Gil, J. Gracia, D. Gil, J. Baraza-Calvo
Error correction codes (ECCs) are commonly used in computer systems to protect information from errors. For example, single error correction (SEC) codes are frequently used for memory protection. Due to continuous technology scaling, soft errors on registers have become a major concern, and ECCs are required to protect them. Nevertheless, using an ECC increases delay, area and power consumption. ECCs are therefore traditionally designed to minimize the number of redundant bits added. This is important in memories, as these bits are added to each word in the whole memory. However, it is less important in registers, where minimizing the encoding and decoding delay can be more valuable. This paper proposes a method to develop codes with 1-gate delay encoders and 4-gate delay decoders, independently of the word length. These codes have been designed to correct single errors only in data bits to reduce the overhead.
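For readers unfamiliar with SEC codes, the following sketch shows the classic textbook Hamming(7,4) code. It is not the ultrafast code proposed in the paper; it is included only as an example of a conventional design that minimizes redundant bits (3 parity bits for 4 data bits). The function names are chosen here for illustration.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword.

    Codeword positions 1..7 hold: p1 p2 d1 p3 d2 d3 d4.
    """
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct up to one flipped bit and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3  # 1-based position of the error, 0 if none
    if syndrome:
        c[syndrome - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]
```

Each parity bit here is the XOR of three data bits, and the decoder must compute a syndrome before it can correct anything, so both paths pass through more than one level of logic. That is the kind of delay the abstract contrasts with its 1-gate encoder and 4-gate decoder.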
Citations: 3
FAIL*: An Open and Versatile Fault-Injection Framework for the Assessment of Software-Implemented Hardware Fault Tolerance
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.28
Horst Schirmeier, Martin Hoffmann, Christian J. Dietrich, M. Lenz, D. Lohmann, O. Spinczyk
Due to voltage and structure shrinking, the influence of radiation on a circuit's operation increases, resulting in future hardware designs exhibiting much higher rates of soft errors. Software developers have to cope with these effects to ensure functional safety. However, software-based hardware fault tolerance is a holistic property that is tricky to achieve in practice, potentially impaired by every single design decision. We present FAIL*, an open and versatile architecture-level fault-injection (FI) framework for the continuous assessment and quantification of fault tolerance in an iterative software development process. FAIL* supplies the developer with reusable and composable FI campaigns, advanced pre- and post-processing analyses to easily identify sensitive spots in the software, well-abstracted back-end implementations for several hardware and simulator platforms, and scalability of FI campaigns by providing massive parallelization. We describe FAIL*, its application to the development process of safety-critical software, and the lessons learned from a real-world example.
Citations: 54
Increasing Automation in the Backporting of Linux Drivers Using Coccinelle
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.23
Luis R. Rodriguez, J. Lawall
Software is continually evolving, to fix bugs and add new features. Industry users, however, often value stability, and thus may not be able to update their code base to the latest versions. This raises the need to selectively backport new features to older software versions. Traditionally, backporting has been done by cluttering the backported code with preprocessor directives, to replace behaviors that are unsupported in an earlier version by appropriate workarounds. This approach, however, involves writing a lot of error-prone backporting code, and results in implementations that are hard to read and maintain. We consider this issue in the context of the Linux kernel, for which older versions are in wide use. We present a new backporting strategy that relies on the use of a backporting compatibility library and on code that is automatically generated using the program transformation tool Coccinelle. This approach reduces the amount of code that must be manually written, and thus can help the Linux kernel backporting effort scale while maintaining the dependability of the backporting process.
Citations: 15
Increasing the Dependability of VLSI Systems through Early Detection of Fugacious Faults
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.13
Jaime Espinosa, D. Andrés, P. Gil
Technology advances provide a myriad of advantages for VLSI systems, but also increase the sensitivity of the combinational logic to different fault profiles. Ever shorter faults that until now had been filtered out, termed fugacious faults, require new attention, as they are considered a plausible warning sign of potential failures. Despite their increasing impact on modern VLSI systems, such faults are largely not considered today by the safety industry. Their early detection is, however, critical to enable an early evaluation of potential risks for the system and the subsequent deployment of suitable failure avoidance mechanisms. For instance, the early detection of fugacious faults will provide the necessary means to extend the mission time of a system thanks to the temporal avoidance of aging effects. Because classical detection mechanisms are not suited to cope with such fugacious faults, this paper proposes a method specifically designed to detect and diagnose them. Reported experiments show the feasibility and interest of the proposal.
Citations: 2
Toward a Fault-Tolerance Framework for COTS Many-Core Systems
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.32
Peter Munk, Mohammad Shadi Al Hakeem, Raphael Lisicki, Helge Parzyjegla, Jan Richling, Hans-Ulrich Heiß
Commercial-off-the-shelf (COTS) many-core processors offer the performance needed for computational-intensive safety-critical real-time applications such as autonomous driving. However, these consumer-grade many-core processors are increasingly susceptible to faults because of their highly integrated design. In this paper, we present a fault-tolerance framework that eases the usage of COTS many-core processors for safety-critical applications. Our framework employs an adaptable software-based fault-tolerance mechanism that combines N Modular Redundancy (NMR) with a repair process and a rejuvenating round robin voting scheme. A Stochastic Activity Network (SAN) model of the fault-tolerance mechanism allows the framework to adapt the parameters of the mechanism such that a specified target availability is achieved with minimum overhead. Experiments on a cycle-accurate simulator empirically prove the correctness of the SAN model and evaluate the overhead of the framework.
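The voting step at the heart of N Modular Redundancy can be sketched as follows. This is a plain majority voter under the assumption that replica outputs are directly comparable values; it does not model the repair process or the rejuvenating round robin voting scheme described in the abstract, and `majority_vote` is a name chosen here for illustration.

```python
from collections import Counter


def majority_vote(outputs):
    """Return (agreed_value, indices_of_disagreeing_replicas).

    Raises ValueError when no value is produced by a strict majority.
    """
    if not outputs:
        raise ValueError("no replica outputs to vote on")
    value, votes = Counter(outputs).most_common(1)[0]
    if votes <= len(outputs) // 2:
        raise ValueError("no majority among replica outputs")
    # Replicas that disagree with the majority are candidates for repair.
    faulty = [i for i, out in enumerate(outputs) if out != value]
    return value, faulty
```

With three replicas, `majority_vote([42, 42, 7])` masks the single faulty output and reports replica 2 as disagreeing, which a framework could use to trigger a repair of that replica.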
Citations: 9
Failure Propagation Modeling Based on Contracts Theory
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.21
M. Nyberg, Jonas Westman
Previous approaches to fault and failure modeling are based on adding explicit models of faults/failures and failure propagation to behavioral and architectural models. This adds a lot of overhead (extra work) and is also a source of inconsistencies, especially mismatches between failures and violations of requirements or specifications. Instead of creating separate models for failures, the idea here is to exploit the fundamental definition of a failure as a violation of a requirement or specification. We assume that the system's functionality is specified using a set of requirements, and in particular, requirements structured according to contracts theory. Instead of creating separate models for failure propagation, we exploit the structuring of requirements obtained when the system is specified using contracts theory. The use of contracts theory establishes a formal framework for how traceability links between requirements themselves and to the architecture are specified. It is further explained how fault and failure propagation models in the form of Bayesian Networks are obtained. One particular challenge is the modeling of faults/failures and their propagation when fault management mechanisms have been implemented. Therefore this area is covered in some extra depth.
Citations: 6
Evaluating and Optimizing Stabilizing Dining Philosophers
Pub Date : 2015-09-07 DOI: 10.1109/EDCC.2015.11
Jordan Adamek, Mikhail Nesterenko, S. Tixeuil
We study theoretical and practical aspects of five of the most well-known self-stabilizing dining philosophers algorithms. We theoretically prove that three of them are incorrect. For practical evaluation, we simulate all algorithms and evaluate their fault-tolerance, latency and throughput of critical section access. We present a new combined algorithm that achieves the best throughput of the two remaining correct algorithms by determining the system load and switching between these basic algorithms. We prove the combined algorithm correct, simulate it and study its performance characteristics.
Citations: 7