
Twenty-Fifth International Symposium on Fault-Tolerant Computing. Digest of Papers: Latest Papers

On the development of fault-tolerant on-board control software and its evaluation by fault injection
T. Vardanega, P. David, J.-F. Chane, Wolfgang R. Mader, R. Messaros, J. Arlat
As commercial drivers promote the integration of functions of different criticality into a limited set of processing elements, software plays an increasingly important role on board today's satellites. This trend questions the adequacy of the traditional development process and calls for a design and validation approach capable of achieving the required dependability without inflating development costs. This paper reports on the most innovative features of an integrated project aimed at designing a software-intensive fault tolerance approach suitable for embedded flight control systems, and at assessing its efficiency by means of a non-intrusive software-implemented fault injection prototype tool.
DOI: 10.1109/FTCS.1995.466947 (https://doi.org/10.1109/FTCS.1995.466947). Published: 1995-06-27.
Citations: 15
Feasibility and effectiveness of the algorithm for overhead reduction in analog checkers
Yingquan Zhou, M. Wong, Y. Min
Self-checking in analog circuits is more difficult than in digital circuits. The technique proposed by A. Chatterjee (1993) can address concurrent error detection and correction in linear analog circuits and hence the reliability of the original circuit is greatly improved. However, hardware overhead is an important issue in this technique, which has never been addressed before. This paper proposes an algorithm for reduction of hardware overhead in the analog checker, and also presents a series of theoretical results, including the concept of all-non-zero solutions and several existence conditions of such solutions. As the basis of the algorithm, these results are new in the mathematical world and can be used to verify the feasibility and effectiveness of the algorithm. Without changing the original circuit, the proposed algorithm can not only reduce the number of passive elements, but also the number of analog operators so that the error detection circuitry in the checker has optimal hardware overhead.
DOI: 10.1109/FTCS.1995.466974 (https://doi.org/10.1109/FTCS.1995.466974). Published: 1995-06-27.
Citations: 2
OBDD-based optimization of input probabilities for weighted random pattern generation
Rolf Krieger, B. Becker, Can Ökmen
Numerous methods have been devised to compute and to optimize fault detection probabilities for combinational circuits. The methods range from topological to algebraic. In combination with OBDDs, algebraic methods have received more and more attention. Recently, an OBDD-based method has been presented which allows the computation of exact fault detection probabilities for many combinational circuits. We combine this method with strategies making use of necessary assignments (computed by an implication procedure). The experimental results show that the resulting method reduces the time and space requirements for computing fault detection probabilities of the hard faults by a factor of 4 on average compared to the original algorithm. By this means it is now possible to use the OBDD-based approach efficiently for the optimization of input probabilities for weighted random pattern testing as well. Since, in contrast to other optimization procedures, this method is based on the exact fault detection probabilities, we succeed in determining weight sets of superior quality, i.e. the test application time (number of random patterns) is considerably reduced compared to previous approaches.
DOI: 10.1109/FTCS.1995.466991 (https://doi.org/10.1109/FTCS.1995.466991). Published: 1995-06-27.
Citations: 7
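To make the idea of optimizing input probabilities in the entry above concrete, here is a minimal sketch that computes the exact detection probability of one fault for a given weight set. It uses brute-force enumeration over all input patterns rather than OBDDs, so it only works for tiny circuits and is not the authors' method. The function names and the toy circuit (a 4-input AND gate with its output stuck at 0) are assumptions made for illustration.

```python
from itertools import product

def detection_probability(detects, weights):
    """Exact probability that a single random pattern detects the fault.

    detects: function mapping a tuple of 0/1 input values to True when
             that pattern detects the fault (hypothetical stand-in for
             the fault's detection function).
    weights: weights[i] is the probability that input i is set to 1.
    """
    total = 0.0
    for pattern in product((0, 1), repeat=len(weights)):
        if detects(pattern):
            p = 1.0
            for bit, w in zip(pattern, weights):
                p *= w if bit else 1.0 - w
            total += p
    return total

# Toy circuit (assumption): 4-input AND gate, output stuck-at-0.
# Only the all-ones pattern exposes this fault.
def and4_output_sa0(pattern):
    return all(pattern)

uniform = detection_probability(and4_output_sa0, [0.5] * 4)   # 0.5**4
weighted = detection_probability(and4_output_sa0, [0.9] * 4)  # 0.9**4
print(uniform, weighted)
```

Because the expected number of random patterns until the first detection is the reciprocal of the detection probability, raising every weight from 0.5 to 0.9 in this toy case cuts the expected test length from 16 patterns to roughly 1.5.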
Synthesizing finite state machines for minimum length synchronizing sequence using partial scan
N. Jiang, Richard M. Chou, K. Saluja
The goal is to synthesize an FSM with the objective to minimize the number of scanned flip-flops while requiring a minimum number of system clocks to reach the synchronizable state. An algorithm for selecting state variables for scanning while minimizing the length of the synchronizing sequence based on the reverse-order-search technique is presented. Extra transitions may be required to avoid possible lock-in conditions if the initial state is an invalid state for the machines where the number of states is not a power of 2. Experimental results show that the proposed method guarantees synchronizability and testability through the proper state assignment with reasonable hardware overhead for the benchmark circuits.
DOI: 10.1109/FTCS.1995.466980 (https://doi.org/10.1109/FTCS.1995.466980). Published: 1995-06-27.
Citations: 8
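For readers unfamiliar with the term used in the entry above, a synchronizing sequence is an input sequence that drives a machine to one known state regardless of where it started. The sketch below finds a shortest one for a given machine by breadth-first search over sets of possible states. It does not perform the synthesis, scan selection, or reverse-order search described in the abstract; the function names and the three-state example machine are assumptions for illustration.

```python
from collections import deque

def shortest_synchronizing_sequence(states, inputs, delta):
    """Return a shortest input sequence that drives every state to one
    common state, or None if the machine is not synchronizable.

    delta: dict mapping (state, input) -> next state.
    """
    start = frozenset(states)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        current, seq = queue.popleft()
        if len(current) == 1:
            return seq
        for a in inputs:
            # Set of states the machine could be in after applying input a.
            nxt = frozenset(delta[(s, a)] for s in current)
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, seq + [a]))
    return None

def run(state, seq, delta):
    """Apply an input sequence starting from one state."""
    for a in seq:
        state = delta[(state, a)]
    return state

# Hypothetical three-state machine: 'a' rotates the states,
# 'b' maps state 2 to state 0 and leaves the others unchanged.
delta = {
    (0, 'a'): 1, (1, 'a'): 2, (2, 'a'): 0,
    (0, 'b'): 0, (1, 'b'): 1, (2, 'b'): 0,
}
seq = shortest_synchronizing_sequence([0, 1, 2], ['a', 'b'], delta)
print(seq)  # ['b', 'a', 'a', 'b']
```

The search space is the set of subsets of states, so this approach is exponential in the worst case and is only meant for small machines.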
Dependability assessment using binary decision diagrams (BDDs)
S. A. Doyle, J. Dugan
Presents the DREDD (Dependability and Risk Evaluation using Decision Diagrams) algorithm which incorporates coverage modeling into a BDD solution of a combinatorial model. BDDs, which do not use cutsets to generate system unreliability, can be used to find exact solutions for extremely large systems. The DREDD algorithm takes advantage of the efficiency of the BDD solution approach and increases the accuracy of a combinatorial model by including consideration of imperfect coverage. The usefulness of combinatorial models, long appreciated for their logical structure and concise representational form, is extended to include many fault-tolerant systems previously thought to require more complicated analysis techniques in order to include coverage modeling. In this paper, the DREDD approach is presented and applied to the analysis of two sample systems, the F18 flight control system and a fault-tolerant multistage interconnection network.
DOI: 10.1109/FTCS.1995.466973 (https://doi.org/10.1109/FTCS.1995.466973). Published: 1995-06-27.
Citations: 64
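The entry above notes that BDDs yield exact unreliability without cutsets. The principle behind that is Shannon decomposition on one component at a time, which the following sketch applies directly. It is not the DREDD algorithm: it expands the full decision tree instead of sharing subgraphs as a BDD would, it assumes independent components, and it omits coverage modeling entirely. The function names and the 2-out-of-3 example are assumptions.

```python
def system_unreliability(fails, q):
    """Exact probability that the system fails.

    fails: function taking a tuple of booleans (True = component failed)
           and returning True when the system as a whole has failed.
    q:     q[i] is the failure probability of component i; components
           are assumed to fail independently.
    """
    n = len(q)

    def expand(prefix):
        i = len(prefix)
        if i == n:
            return 1.0 if fails(prefix) else 0.0
        # Shannon decomposition on component i:
        # P(F) = q_i * P(F | i failed) + (1 - q_i) * P(F | i working)
        return (q[i] * expand(prefix + (True,))
                + (1.0 - q[i]) * expand(prefix + (False,)))

    return expand(())

# Hypothetical 2-out-of-3 system: it fails once two components have failed.
def two_of_three_failed(state):
    return sum(state) >= 2

unreliability = system_unreliability(two_of_three_failed, [0.1, 0.1, 0.1])
print(unreliability)  # 3 * 0.1**2 * 0.9 + 0.1**3, about 0.028
```

A real BDD package would merge identical sub-functions, which is what makes the approach practical for large systems; this sketch only shows the probability recursion.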
VAX/VMS event monitoring and analysis
Michael F. Buckley, D. Siewiorek
Event logs can be used effectively to improve computer system availability. Uses include retrospective and predictive diagnosis; fault management; failure rate estimation; and trend analysis. Unfortunately, much of the research to date has been hampered by the lack of suitable event data, and occasionally by the incorrect interpretation of the available data. This research uses one of the largest sets of data, and the most intensive investigation of the monitoring process conducted to date, to examine event monitoring and analysis. 2.35 million events from 193 VAX/VMS systems covering 335 machine years were used. Examples are presented which show that monitoring deficiencies complicate the analyses, consume additional time, and make incorrect conclusions more likely. For example, incorrect handling of bogus timestamps changes the mean time between groups of events by an order of magnitude. An analysis procedure to identify defects is provided, along with design rules to create better quality logs.
DOI: 10.1109/FTCS.1995.466958 (https://doi.org/10.1109/FTCS.1995.466958). Published: 1995-06-27.
Citations: 44
Software rejuvenation: analysis, module and applications
Yennun Huang, C. Kintala, N. Kolettis, N. D. Fulton
Software rejuvenation is the concept of gracefully terminating an application and immediately restarting it at a clean internal state. In a client-server type of application where the server is intended to run perpetually for providing a service to its clients, rejuvenating the server process periodically during the most idle time of the server increases the availability of that service. In a long-running computation-intensive application, rejuvenating the application periodically and restarting it at a previous checkpoint increases the likelihood of successfully completing the application execution. We present a model for analyzing software rejuvenation in such continuously-running applications and express downtime and costs due to downtime during rejuvenation in terms of the parameters in that model. Threshold conditions for rejuvenation to be beneficial are also derived. We implemented a reusable module to perform software rejuvenation. That module can be embedded in any existing application on a UNIX platform with minimal effort. Experiences with software rejuvenation in a billing data collection subsystem of a telecommunications operations system and other continuously-running systems and scientific applications in AT&T are described.
DOI: 10.1109/FTCS.1995.466961 (https://doi.org/10.1109/FTCS.1995.466961). Published: 1995-06-27.
Citations: 964
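The core operation described in the entry above, gracefully terminating a process and immediately restarting it in a clean state, can be sketched in a few lines. This is only an illustration and not the reusable module the authors built: the class name `Rejuvenator`, the `grace_seconds` parameter, and the placeholder child command are assumptions, and the scheduling of rejuvenation during idle periods as well as checkpoint restore are left out.

```python
import subprocess
import sys

class Rejuvenator:
    """Runs a command and can restart it from a clean state on demand."""

    def __init__(self, command, grace_seconds=5.0):
        self.command = command
        self.grace_seconds = grace_seconds
        self.process = None

    def start(self):
        self.process = subprocess.Popen(self.command)

    def stop(self):
        if self.process is None or self.process.poll() is not None:
            return  # nothing running
        self.process.terminate()  # polite request (SIGTERM on POSIX)
        try:
            self.process.wait(timeout=self.grace_seconds)
        except subprocess.TimeoutExpired:
            self.process.kill()   # escalate if the request was ignored
            self.process.wait()

    def rejuvenate(self):
        """Gracefully stop the current process, then start a fresh one."""
        self.stop()
        self.start()

if __name__ == "__main__":
    # Placeholder long-running "server": a Python child that just sleeps.
    server = Rejuvenator([sys.executable, "-c", "import time; time.sleep(60)"])
    server.start()
    server.rejuvenate()  # in practice, triggered periodically at idle time
    server.stop()
```

A caller would typically invoke `rejuvenate()` from a timer or scheduler; when and how often to do so is exactly what the paper's model and threshold conditions address.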
Optimal resiliency against mobile faults
H. Buhrman, J. Garay, J. Hoepman
We consider a model where malicious agents can corrupt hosts and move around in a network of processors. We consider a family of mobile fault models MF(t/n-1, ρ). In MF(t/n-1, ρ) there are a total of n processors, the maximum number of mobile faults is t, and their roaming pace is ρ (for example, ρ=3 means that it takes an agent at least 3 rounds to "hop" to the next host). We study in these models the classical testbed problem for fault tolerant distributed computing: Byzantine agreement. It has been shown that if ρ=1, then agreement cannot be reached in the presence of even one fault, unless one of the processors remains uncorrupted for a certain amount of time. Subject to this proviso, we present a protocol for MF(1/3, 1), which is optimal. The running time of the protocol is O(n) rounds, also optimal for these models.
DOI: 10.1109/FTCS.1995.466995 (https://doi.org/10.1109/FTCS.1995.466995). Published: 1995-06-27.
Citations: 31
Availability and performance evaluation of database systems under periodic checkpoints
Reinaldo Vallejos Campos, E. D. S. E. Silva
Checkpointing with rollback and recovery is a common technique to ensure data integrity, to increase availability and to improve the performance of transaction-oriented database systems. Parameters such as the checkpointing frequency and the system load have an impact on the overall performance, and it is important to develop accurate models of the system under study. We find expressions for the system availability and for the expected response time of the transactions from a model that, unlike previous analytical work, takes into account the dependency among the recovery times between two checkpoints. Furthermore, our model can incorporate details concerning the contention for the system resources.
DOI: 10.1109/FTCS.1995.466983 (https://doi.org/10.1109/FTCS.1995.466983). Published: 1995-06-27.
Citations: 5
Systematic validation of pipeline interlock for superscalar microarchitectures
T. Diep, John Paul Shen
The paper presents a new approach to microarchitecture validation that adopts a paradigm analogous to that of automatic test pattern generation (ATPG) for digital logic testing. In this approach, the microarchitecture is rigorously specified in a set of machine description files. Based on these files, all possible pipeline hazards can be systematically identified. Using this hazard list (analogous to a fault list for ATPG), specific sequences of instructions (analogous to test patterns) are automatically generated and constitute the test program. The execution of this test program validates the correct detection and resolution of all inter-instruction dependences by the microarchitecture's pipeline interlock mechanism. Actual software tools have been developed for the automatic construction of the hazard list and the automatic generation of the test sequences. These explicitly generated sequences can achieve higher coverage in fewer cycles than ad hoc approaches. 100% coverage of the hazard list can be ensured. These tools have been applied to four contemporary superscalar processors, namely the Alpha AXP 21064 and 21164 microprocessors, and the PowerPC 601 and 620 microprocessors.
DOI: 10.1109/FTCS.1995.466993 (https://doi.org/10.1109/FTCS.1995.466993). Published: 1995-06-27.
Citations: 17