[1992] Digest of Papers. FTCS-22: The Twenty-Second International Symposium on Fault-Tolerant Computing (latest papers)

A posteriori agreement for fault-tolerant clock synchronization on broadcast networks

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243580

P. Veríssimo, Luís E. T. Rodrigues

The authors present a clock synchronization algorithm, a posteriori agreement, based on a new variant of the well-known convergence nonaveraging technique. By exploiting a characteristic of broadcast networks, the algorithm largely reduces the effect of message delivery delay variance. As a consequence, the precision achieved by the algorithm is drastically improved. Accuracy preservation is close to optimal. The solution does not require the use of dedicated hardware.

Citations: 73

Fault tolerant neural networks in optimization problems

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243594

Yoichi Koyanagi, Y. Tohma

The authors discuss the influence of stuck-at faults in neural networks for solving optimization problems. They use a Hopfield model of a neural network, applying it to the traveling salesman problem of five cities. The asymmetric nature of fault tolerance of the network against stuck-at-zero and stuck-at-one faults is revealed. A method to alleviate this asymmetry and enhance the fault tolerance greatly is proposed.

Citations: 8

Free dimensions-an effective approach to achieving fault tolerance in hypercube

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243603

C. Raghavendra, P. Yang, S. Tien

In the n-dimensional hypercube, Q_n, for large n, faults can occur with relatively high probability. How to use the inherent redundancy present in the hypercube to obtain fault tolerance is discussed, along with computing in faulty hypercubes. The authors study the fault tolerance independently present in hypercubes by defining and using the concept of free dimensions. Briefly, in Q_n, a dimension is said to be free if no pair of nodes across the dimension link are both faulty. Efficient algorithms are presented for finding free dimensions, given a set of faulty nodes, and it is shown that at least n-f+1 free dimensions exist with f>

Citations: 51
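
The free-dimension definition in the abstract above can be illustrated with a minimal Python sketch. It assumes nodes are labeled with n-bit integers, so that the neighbor of node v across dimension d is v XOR 2^d. The function name `free_dimensions` and the brute-force check are illustrative assumptions, not the efficient algorithms the paper presents.

```python
def free_dimensions(n, faulty):
    """Return the free dimensions of Q_n for a given set of faulty nodes.

    Assumption: nodes are integers in [0, 2**n), and two nodes are linked
    across dimension d when their labels differ only in bit d.
    """
    faulty = set(faulty)
    free = []
    for d in range(n):
        mask = 1 << d
        # Dimension d is free if no faulty node has a faulty neighbor across d.
        if all((v ^ mask) not in faulty for v in faulty):
            free.append(d)
    return free


if __name__ == "__main__":
    # In Q_3, nodes 000 and 001 are neighbors across dimension 0,
    # so dimension 0 is not free while dimensions 1 and 2 are.
    print(free_dimensions(3, {0b000, 0b001}))
```

This check costs O(n * f) set lookups for f faulty nodes, which is adequate for small examples.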

Transis: a communication subsystem for high availability

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243613

Y. Amir, D. Dolev, S. Kramer, D. Malkhi

The authors describe Transis, a communication subsystem for high availability. Transis is a transport layer that supports reliable multicast services. The main novelty is in the efficient implementation using broadcast. The basis of Transis is automatic maintenance of dynamic membership. The membership algorithm is symmetrical, operates within the regular flow of messages, and overcomes partitions and remerging. The higher layer provides various multicast services for sets of processes.

Citations: 239

A comparison of software defects in database management systems and operating systems

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243586

M. Sullivan, R. Chillarege

An analysis of software defects reported at customer sites in two large IBM database management products, DB2 and IMS, is presented. The analysis considers several different error classification systems and compares the results to those of an earlier study of field defects in IBM's MVS operating system. The authors compare the error type, defect type, and error trigger distributions of the DB2, IMS, and MVS products; show that there may exist an asymptotic behavior in the error type distribution as a function of a defect type; and discuss the undefined state errors that dominate the error type distribution.

Citations: 142

Routing in modular fault tolerant multiprocessor systems

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243601

M. S. Alam, R. Melhem

The authors consider a class of modular multiprocessor architectures in which spares are added to each module to cover for faulty nodes within that module, thus forming a fault tolerant basic block (FTBB). The goal is to preserve the logical adjacency between active nodes by means of a routing algorithm which delivers messages successfully to their destinations. Two-phase routing strategies are introduced that route messages first to their destination FTBB, and then to the destination nodes within the destination FTBB. This strategy may be applied to a variety of architectures including binary hypercubes and 3-D tori. In the presence of f faults in these systems, it is shown that the worst case length of the message route is max(σ + f, (K+1)σ) + M, where σ is the shortest path in the absence of faults, and M and K are the numbers of primary nodes and spare nodes in an FTBB, respectively. The average routing overhead is much lower than the worst case overhead.

Citations: 6
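
The worst-case route length quoted in the abstract above is a closed-form expression, so it can be evaluated directly. The sketch below is only an arithmetic illustration: the function name and the sample parameter values are assumptions chosen for the example, not figures from the paper.

```python
def worst_case_route_length(sigma, f, K, M):
    """Evaluate max(sigma + f, (K + 1) * sigma) + M.

    sigma: shortest path length in the absence of faults
    f:     number of faults in the system
    K:     number of spare nodes per FTBB
    M:     number of primary nodes per FTBB
    """
    return max(sigma + f, (K + 1) * sigma) + M


if __name__ == "__main__":
    # Hypothetical values: fault-free path of 4 hops, 3 faults,
    # 1 spare and 4 primary nodes per block.
    print(worst_case_route_length(sigma=4, f=3, K=1, M=4))  # max(7, 8) + 4 = 12
```

With these sample values the (K+1)σ term dominates; with more faults than Kσ, the σ + f term takes over.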

Wafer testing with pairwise comparisons

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243563

K. Huang, V. Agarwal, L. LaForge

A novel diagnosis scheme is proposed for wafer testing, in which the test access port of each die is utilized to perform comparison tests on its neighbors. A probabilistic diagnosis algorithm is presented, which correctly identifies almost all dies, even when the probability of failure of a die is larger than 0.5. The algorithm is shown to be particularly suitable for constant degree structures, such as rectangular and octagonal grids. The algorithm is designed for wafer scale structures, where the boundary dies do not have a complete regular structure. The algorithm also allows for the fault coverage of the tests to be imperfect. In addition, diagnosis is done locally. Both the test time and the diagnosis time are invariant with respect to the number of dies on the wafer. The algorithm can also tolerate some systematic errors. The dies are tested in parallel with this approach.

Citations: 1

Dynamic reconfiguration of CSP programs for fault tolerance

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243616

P. Jalote

In a distributed computation being performed by a network of communicating processes, failure of a process due to the failure of its host node can cause the entire computation to be aborted. The author proposes a scheme to make a distributed program resilient to the failure of one of its constituent processes. The distributed computation is completed despite the failure of a process. The scheme is for CSP programs and allows nondeterminism within a process. In CSP, the process name is used in input/output commands. Since synchronous communication is used, if a process specified in the input/output command of a process P does not execute a matching output/input command, P might get blocked. In the proposed scheme, if a process fails, another process starts executing on a backup node from the last checkpoint (CP) of the failed process. Programmed exception handling is used to ensure proper recovery and fault tolerance.

Citations: 2

Failure mode assumptions and assumption coverage

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243562

D. Powell

A method is proposed for the formal analysis of failure mode assumptions and for the evaluation of the dependability of systems whose design correctness is conditioned on the validity of such assumptions. Formal definitions are given for the types of errors that can affect items of service delivered by a system or component. Failure mode assumptions are then formalized as assertions on the types of errors that a component may induce in its enclosing system. The concept of assumption coverage is introduced to relate the notion of partially-ordered assumption assertions to the quantification of system dependability. Assumption coverage is shown to be extremely important in systems requiring very high dependability. It is also shown that the need to increase system redundancy to accommodate more severe modes of component failure can sometimes result in a decrease in dependability.

Citations: 349

Branch recovery with compiler-assisted multiple instruction retry

Pub Date : 1992-07-08 DOI: 10.1109/FTCS.1992.243614

N. J. Alewine, Shyh-Kwei Chen, C. Li, W. Fuchs, Wen-mei W. Hwu

A compiler-assisted approach to implementing multiple instruction retry has recently been developed by C.-C. J. Li et al. (1991). The authors extend compiler-assisted multiple instruction retry to include a broad class of code execution failures. Five benchmarks were used to measure the performance penalty of hazard resolution. Results indicate that the enhanced pure software approach can produce performance penalties consistent with existing hardware techniques. A combined compiler/hardware resolution strategy is also described and evaluated. Experimental results indicate a lower performance penalty than with either a totally hardware or totally software approach.

Citations: 30
