Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105571
S. Al-Bassam, B. Bose
A class of balanced (or DC-free) codes that are useful for unidirectional and asymmetric error detection is defined. Optimal (serial and parallel) balanced codes are constructed.
Title: Design of efficient balanced codes. Published in: [1989] The Nineteenth International Symposium on Fault-Tolerant Computing. Digest of Papers.
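To illustrate why balanced codes suit unidirectional error detection, the following minimal sketch checks the property by brute force for short words. It is not the construction from the paper: the function names are hypothetical, and the enumeration is only practical for small lengths. The idea is that a balanced word has equally many 0s and 1s, so any error that flips bits in a single direction changes the weight and produces an unbalanced word.

```python
from itertools import combinations, product


def is_balanced(word):
    """True if the word contains as many 1s as 0s."""
    return 2 * sum(word) == len(word)


def balanced_words(n):
    """All balanced binary words of (even) length n."""
    return [w for w in product((0, 1), repeat=n) if is_balanced(w)]


def unidirectional_errors(word, direction):
    """Yield every corruption of `word` that flips one or more bits,
    all in the same direction (direction=1: 0->1, direction=0: 1->0)."""
    positions = [i for i, bit in enumerate(word) if bit != direction]
    for k in range(1, len(positions) + 1):
        for chosen in combinations(positions, k):
            corrupted = list(word)
            for i in chosen:
                corrupted[i] = direction
            yield tuple(corrupted)


def all_unidirectional_errors_detected(n):
    """Check that no unidirectional error maps a balanced word of
    length n onto another balanced word."""
    for word in balanced_words(n):
        for direction in (0, 1):
            for bad in unidirectional_errors(word, direction):
                if is_balanced(bad):
                    return False
    return True
```

Calling `all_unidirectional_errors_detected(6)` exercises every balanced word of length six against every possible unidirectional error pattern.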
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105602
M. Nicolaidis, S. Noraz, B. Courtois
The authors generalize the concept of fail-safe systems and introduce the concept of strongly fail-safe systems. As an application, they present an interface that can be implemented in MOS technologies. It transforms the outputs of self-checking systems into signals adequate to drive electromechanical actuators, in such a way that the whole system (self-checking circuit and interface) is strongly fail-safe.
Title: A generalized theory of fail-safe systems
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105576
R. Horst
The design of the cache and control store memories of the Tandem NonStop VLX processor is discussed. Service costs are reduced by using hot-standby sparing to improve the reliability of the large static RAM arrays. Detection, isolation, and spare substitution of failed RAMs are performed automatically without disrupting normal processing. A control store design with sparing is described. A mathematical model is used to predict reliability improvements for the multiple arrays on each processor board. The model takes into account the selected repair policy, which calls for replacing a board only on spare exhaustion or on the failure of nonspared logic. The success of the chosen approach is illustrated through model predictions as well as through field failure data.
Title: Reliable design of high-speed cache and control store memories
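The abstract above refers to a mathematical reliability model for spared RAM arrays but does not state it. As a rough illustration only, the sketch below uses a textbook k-out-of-n model, which is an assumption and not the model from the paper: chips fail independently with a constant failure rate, hot spares are powered and fail at the same rate as active chips, and fault detection and spare substitution are perfect. Under those assumptions an array with `n_active` chips and `n_spares` spares survives as long as no more than `n_spares` chips have failed. All function and parameter names are hypothetical.

```python
from math import comb, exp


def chip_reliability(failure_rate, hours):
    """Probability that one chip survives `hours`, assuming a constant
    failure rate (exponential lifetime)."""
    return exp(-failure_rate * hours)


def array_reliability(n_active, n_spares, r):
    """Probability that a hot-spared array still has n_active working
    chips, given per-chip survival probability r.

    Assumes independent failures, spares that age like active chips,
    and perfect detection and switch-over."""
    total = n_active + n_spares
    return sum(
        comb(total, failed) * (1 - r) ** failed * r ** (total - failed)
        for failed in range(n_spares + 1)
    )


if __name__ == "__main__":
    # Illustrative numbers only, not taken from the paper.
    r = chip_reliability(failure_rate=1e-6, hours=5 * 8760)
    for spares in (0, 1, 2):
        print(spares, array_reliability(16, spares, r))
```

A board-level figure would multiply the reliabilities of its arrays and of the nonspared logic, which matches the repair policy described in the abstract only under the same independence assumption.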
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105569
L. Dunning, G. Dial, M. Varanasi
Codes are developed for detecting unidirectional errors in t bytes simultaneously (t-UBED). Some of the codes constructed also provide all unidirectional error detection (AUED). These classes of codes are different from AUED codes in that the errors in one byte may be of the form 1 to 0 while in another byte they may be of the form 0 to 1. The codes developed are for bytes of length nine consisting of eight data bits and one parity bit and utilize one byte for parity check information in addition to the parity check bits. Under various assumptions, codes varying in protection from 2-UBED to 4-UBED and 2-UBED+AUED to 3-UBED+AUED are constructed.
Title: Unidirectional 9-bit byte error detecting codes for computer memory systems
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105556
T. Yoneda, Kazutoshi Nakade, Y. Tohma
A novel timing verification method is presented. A system is divided into units, and the behavior of each unit is described by its internal state transitions and the occurrence of events. The analysis method reveals all possible system behavior, ignoring the timing relations between events that occur at different units. The results of an example (the bus access protocol for the PROWAY system) show that, for the timing verification of larger systems, the method is much faster and needs much less memory than the method based on timed Petri nets.
Title: A fast timing verification method based on the independence of units
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105568
Jehoshua Bruck, M. Blaum
A novel construction that differs from the traditional way of constructing systematic EC/AUED (error-correcting/all unidirectional error-detecting) codes is presented. The usual method is to take a systematic t-error-correcting code and then append a tail so that the code can detect more than t errors when they are unidirectional. In the authors' construction, the t-error-correcting code is modified in such a way that the weight distribution of the original code is reduced. The authors then have to add a smaller tail. The resulting codes frequently have less redundancy than the best available systematic t-EC/AUED codes.
Title: Some new EC/AUED codes
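The "append a tail" idea mentioned in the abstract above can be illustrated with the classic Berger code, which is a systematic AUED code with no error correction. This is a minimal sketch for illustration and is not the construction from the paper. The tail stores the number of 0s in the data: errors of type 1 to 0 can only raise the zero count of the data and lower the stored count, and errors of type 0 to 1 do the opposite, so the two can never agree after a unidirectional error. The function names are hypothetical, and the data is assumed to contain at least one bit.

```python
def berger_encode(data_bits):
    """Append a tail holding the number of 0s in data_bits (MSB first).

    Assumes len(data_bits) >= 1."""
    k = len(data_bits)
    tail_len = k.bit_length()  # enough bits to store any count in 0..k
    zeros = list(data_bits).count(0)
    tail = [(zeros >> i) & 1 for i in reversed(range(tail_len))]
    return list(data_bits) + tail


def berger_check(word, k):
    """Return True if the tail matches the zero count of the first k bits."""
    data, tail = list(word[:k]), word[k:]
    stored = int("".join(str(bit) for bit in tail), 2)
    return data.count(0) == stored
```

For example, `berger_encode([1, 0, 1, 1, 0, 0, 1, 0])` appends the four-bit tail for the count 4, and `berger_check` rejects the word if every 1 is then forced to 0.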
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105566
A. Xu, B. Liskov
A distributed implementation of a parallel system is of interest because it can provide an economical source of concurrency, can be scaled easily to match the needs of particular computations, and can be fault-tolerant. A design is described for such an implementation for the Linda parallel programming system, in which processes share a memory called the tuple space. Fault tolerance is achieved by replication: by having more than one copy of the tuple space, some replicas can provide information when others are not accessible due to failures. The replication technique takes advantage of the semantics of Linda so that processes encounter little delay in accessing the tuple space. In addition to providing an efficient implementation for Linda, the study extends work on replication techniques by showing what can be done when semantics are taken into account.
Title: A design for a fault-tolerant, distributed implementation of Linda
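To make the notions of a tuple space and its replication in the abstract above concrete, here is a minimal single-process sketch. It is an assumption-laden toy and not the protocol from the paper: it uses a naive write-all, read-any scheme, ignores concurrency and the recovery of failed replicas, and does not exploit Linda's semantics for speed. It implements only `out` and the non-blocking Linda operations `rdp` and `inp`; `None` in a pattern is used here as a wildcard, and the class names and the `up` flags are hypothetical.

```python
class TupleSpace:
    """One in-memory copy of a tuple space."""

    def __init__(self):
        self._tuples = []

    def out(self, tup):
        """Deposit a tuple."""
        self._tuples.append(tuple(tup))

    def _find(self, pattern):
        """Index of the first tuple matching pattern, or None."""
        for i, tup in enumerate(self._tuples):
            if len(tup) == len(pattern) and all(
                p is None or p == field for p, field in zip(pattern, tup)
            ):
                return i
        return None

    def rdp(self, pattern):
        """Non-blocking read: return a matching tuple without removing it."""
        i = self._find(pattern)
        return None if i is None else self._tuples[i]

    def inp(self, pattern):
        """Non-blocking take: remove and return a matching tuple."""
        i = self._find(pattern)
        return None if i is None else self._tuples.pop(i)


class ReplicatedTupleSpace:
    """Naive replication: updates go to every live replica, reads to any one."""

    def __init__(self, n_replicas):
        self.replicas = [TupleSpace() for _ in range(n_replicas)]
        self.up = [True] * n_replicas  # simulated availability flags

    def _live(self):
        live = [r for r, ok in zip(self.replicas, self.up) if ok]
        if not live:
            raise RuntimeError("no replica available")
        return live

    def out(self, tup):
        for replica in self._live():
            replica.out(tup)

    def rdp(self, pattern):
        return self._live()[0].rdp(pattern)

    def inp(self, pattern):
        # Replicas hold identical contents in identical order, so each
        # removes the same tuple.
        results = [replica.inp(pattern) for replica in self._live()]
        return results[0]
```

With three replicas, marking one as down via `up[0] = False` still lets `rdp` and `inp` succeed from the remaining copies, which is the availability benefit the abstract describes.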
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105620
A. Gheith, K. Schwan
CHAOS^art is an object-based, real-time operating system kernel that provides an extended notion of atomic transactions as the basic mechanism for programming real-time, embedded applications. These transactions are expressed as object invocations with guaranteed timing, consistency, and recovery attributes. The mechanisms implemented by the CHAOS^art kernel provide a predictable, accountable, and efficient basis for programming with real-time transactions. These mechanisms are predictable because they have well-defined upper bounds on their execution times that can be determined before their execution. They are accountable because their decisions are guaranteed to be honored as long as the system is in an application-specific safe state.
Title: CHAOS^art: support for real-time atomic transactions
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105575
M. Morganti
Public telecommunications networks make extensive use of fault-tolerant techniques in order to achieve high dependability. Part of this fault tolerance is embedded in the network architecture itself, or in the nature of the offered services, while another part is explicitly added to increase the reliability and availability of specific network elements. The evolution toward integrated services digital networks (ISDN) and integrated broadband communications networks (IBCN) is now posing serious challenges and offering important opportunities to further increase this essential aspect of telecommunications. A brief review is presented of the basic characteristics of existing networks, and some of the major problems currently being addressed in relation to their expected evolution over the next 20 years are examined.
Title: F-T in telecommunications networks: state, perspectives, trends
Pub Date: 1989-06-21, DOI: 10.1109/FTCS.1989.105537
R. Harper, Gail Nagle, Martin A. Serrano
In a fault-tolerant parallel computer, a functional programming model can facilitate distributed checkpointing, error recovery, load balancing, and graceful degradation. Such a model has been implemented on the Draper fault-tolerant parallel processor (FTPP). When used in conjunction with the FTPP's fault-detection and masking capabilities, this implementation results in a graceful degradation of system performance after faults. Three graceful degradation algorithms are presented. A user interface has been implemented that requires minimal cognitive overhead from the application programmer, masking such complexities as the system's redundancy, distributed nature, variable complement of processing resources, load balancing, fault occurrence, and recovery. This user interface is described and its use demonstrated.
Title: Use of a functional programming model in fault tolerant parallel processing