High resource availability in computer networks has become a critical issue, as it can be degraded by the growing number of network users. This paper proposes a methodology that classifies Website visitors into priority groups. High availability is assured for the priority classes by reserving Website resources that only these groups can access, while as many resources as possible remain available to lower-priority visitors. A birth-death process models the arrival and service of Website visitors. An optimization problem is solved to determine the optimal trade-off between resource availability for high-priority visitors and free resource access for lower-priority visitors. The major contributions of this paper are formulas for the probability that a Website visitor is denied further access to resources, and the optimal amount of reserved resources that achieves the above trade-off.
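The reservation idea can be illustrated with a small birth-death chain. The sketch below is not the paper's model, whose details the abstract does not give. It assumes Poisson arrivals with hypothetical rates lam_hi and lam_lo, exponential service at rate mu per visitor, n_total identical resources, and a cutoff rule under which low-priority visitors are admitted only while fewer than n_total - n_reserved resources are busy. All function and parameter names are illustrative.

```python
def stationary_distribution(n_total, n_reserved, lam_hi, lam_lo, mu):
    """Stationary probabilities pi[n] of n busy resources (assumed cutoff model)."""
    threshold = n_total - n_reserved  # low priority admitted only below this level
    weights = [1.0]
    for n in range(n_total):
        birth = lam_hi + lam_lo if n < threshold else lam_hi
        death = (n + 1) * mu  # each busy resource completes at rate mu
        weights.append(weights[-1] * birth / death)
    total = sum(weights)
    return [w / total for w in weights]


def blocking_probabilities(n_total, n_reserved, lam_hi, lam_lo, mu):
    """Return (high-priority, low-priority) probabilities of being denied access."""
    pi = stationary_distribution(n_total, n_reserved, lam_hi, lam_lo, mu)
    threshold = n_total - n_reserved
    return pi[n_total], sum(pi[threshold:])


def smallest_reservation(n_total, lam_hi, lam_lo, mu, target_hi):
    """Smallest reservation meeting the high-priority target, or None if none does.

    Choosing the smallest feasible value leaves the most resources open
    to lower-priority visitors.
    """
    for n_reserved in range(n_total + 1):
        p_hi, p_lo = blocking_probabilities(n_total, n_reserved, lam_hi, lam_lo, mu)
        if p_hi <= target_hi:
            return n_reserved, p_hi, p_lo
    return None
```

For example, smallest_reservation(10, 2.0, 6.0, 1.0, 0.001) scans reservations from 0 upward and reports the first one whose high-priority blocking probability meets the target, together with the resulting low-priority blocking probability.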
{"title":"Resource Availability Optimization for Priority Classes in a Website","authors":"V. Koutras, A. Platis","doi":"10.1109/PRDC.2006.54","DOIUrl":"https://doi.org/10.1109/PRDC.2006.54","url":null,"abstract":"High resource availability in computer networks has become a critical issue as it can be affected by the increasing number of network users. A methodology based on classifying Website visitors into priority groups is proposed, assuring high availability for priority classes, based on reserved Website resources that can be accessed only by these groups and simultaneously providing as many resources as possible to lower priority visitors. A birth-death process is proposed to model Website visitors' arrival and service. An optimization problem is solved to determine the optimal trade off between resource availability for high priority visitors and free resource access to lower priority visitors. The major contribution of this paper consists in deriving formulas for the probability that a Website visitor has no further access to resources and in determining the optimal reserved resources assuring the above trade off","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"88 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125076122","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper presents a processor group membership protocol for fault-tolerant distributed real-time systems that use periodic, time-triggered scheduling to send messages over the system's communication network. The protocol allows fault-free nodes to reach agreement on the operational state of all nodes in the presence of fail-silent or fail-reporting node failures, as well as network failures (lost or corrupted messages). The protocol is based on the principle that each message sent by a node in the membership is acknowledged by k other nodes in a system of n nodes, where k can be set to any number between 2 and n - 1. Agreement on node failure (membership departure) and agreement on node recovery (membership reintegration) are handled by two different mechanisms. Agreement on departure is guaranteed if no more than f = k - 1 failures occur in the same communication round, while at most one node can be reintegrated into the membership per communication round.
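The relation between the acknowledgment count k and the tolerated failures f can be tabulated directly. The helper below encodes only the two constraints stated in the abstract (k between 2 and n - 1, f = k - 1) and assumes the range is inclusive. The function names are hypothetical and nothing here models the protocol's messages.

```python
def departure_fault_tolerance(n_nodes, k_acks):
    """Failures per round for which departure agreement is guaranteed (f = k - 1)."""
    if n_nodes < 3:
        raise ValueError("need at least 3 nodes so that some k in 2..n-1 exists")
    if not 2 <= k_acks <= n_nodes - 1:
        raise ValueError("k must lie between 2 and n - 1 (assumed inclusive)")
    return k_acks - 1


def tradeoff_table(n_nodes):
    """All admissible (k, f) pairs for a system of n_nodes nodes."""
    return [(k, departure_fault_tolerance(n_nodes, k)) for k in range(2, n_nodes)]
```

For a five-node system, tradeoff_table(5) lists the pairs (2, 1), (3, 2) and (4, 3): each additional acknowledger buys tolerance of one more failure per round at the cost of more acknowledgment traffic.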
{"title":"Flexible, Cost-Effective Membership Agreement in Synchronous Systems","authors":"R. Barbosa, J. Karlsson","doi":"10.1109/PRDC.2006.36","DOIUrl":"https://doi.org/10.1109/PRDC.2006.36","url":null,"abstract":"This paper presents a processor group membership protocol for fault-tolerant distributed real-time systems that utilize periodic, time-triggered scheduling for sending messages over the system's communication network. The protocol allows fault-free nodes to reach agreement on the operational state of all nodes in the presence of fail-silent or fail-reporting node failures as well as network failures (lost or corrupted messages). The protocol is based on the principle that each message sent by a node in the membership is acknowledged by k other nodes in a system of n nodes, where k can be set to any number between 2 and n - 1. Agreement on node failure (membership departure) and agreement on node recovery (membership reintegration) are handled by two different mechanisms. Agreement on departure is guaranteed if no more than f = k - 1 failures occur in the same communication round, while at most one node can be reintegrated into the membership per communication round","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"127393462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Owing to the dynamic network topology and resource constraints, designing an efficient routing scheme for mobile ad hoc networks (MANETs) is challenging. To tolerate communication faults, this study exploits network redundancy through multipath routing. The proposed on-demand hybrid multipath routing (OHMR) scheme has two novel characteristics: it establishes multiple node-disjoint and braided routing paths between a source-destination pair, and it maintains end-to-end transmission for a longer period than other multipath routing schemes. Simulation results show that OHMR reduces the frequency of route discoveries and achieves a higher packet delivery ratio.
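To make the notion of node-disjoint paths concrete, the sketch below extracts such paths from a static adjacency map by repeated breadth-first search. It is a simple greedy heuristic chosen for illustration and is not the OHMR route discovery procedure, which the abstract does not describe. It does not build braided paths and may find fewer disjoint paths than a maximum-flow method would. All names are hypothetical.

```python
from collections import deque


def shortest_path(adj, src, dst, banned, allow_direct):
    """Breadth-first search from src to dst that avoids every node in banned."""
    parent = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = parent[u]
            return path[::-1]
        for v in adj.get(u, ()):
            if u == src and v == dst and not allow_direct:
                continue  # the direct link has already been used once
            if v not in parent and v not in banned:
                parent[v] = u
                queue.append(v)
    return None


def greedy_node_disjoint_paths(adj, src, dst):
    """Collect paths that share no intermediate node (greedy, not guaranteed maximal)."""
    banned, paths, allow_direct = set(), [], True
    while True:
        path = shortest_path(adj, src, dst, banned, allow_direct)
        if path is None:
            return paths
        paths.append(path)
        if len(path) == 2:
            allow_direct = False  # a one-hop path has no intermediates to ban
        banned.update(path[1:-1])  # intermediate nodes may not be reused
```

With adj = {"S": ["A", "B"], "A": ["S", "D"], "B": ["S", "D"], "D": ["A", "B"]}, the call greedy_node_disjoint_paths(adj, "S", "D") returns two paths, one through A and one through B, so the failure of either relay leaves the other path usable.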
{"title":"A Hybrid Multipath Routing in Mobile ad hoc Networks","authors":"C. Sue, Ren-Jie Chiou","doi":"10.1109/PRDC.2006.9","DOIUrl":"https://doi.org/10.1109/PRDC.2006.9","url":null,"abstract":"Due to the dynamic nature of the network topology and resource constraints, designing an efficient routing in MANETs is challenging. To tolerate communication faults, this study explores the network redundancy through multipath routing. The designated on-demand hybrid multipath routing (OHMR) features two novel characteristics; it establishes multiple node-disjoint and braided routing paths between a source-destination pair and it maintains an end-to-end transmission for a longer period than other multipath routing schemes. Through simulation results, we show OHMR can reduce the frequency of route discoveries and achieve a higher packet delivery ratio","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130452249","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Current semiconductor technologies have become susceptible to high-energy neutrons from space. As transistors shrink, supply voltages drop, and clock frequencies rise, microprocessors become increasingly susceptible to soft errors, which constitute the vast majority of hardware failures. Given these trends, reliability is expected to become as important a quality measure for microprocessors as performance. Consequently, many fault-tolerant microarchitectures have recently been proposed. These studies focus mainly on detecting transient faults, and hence almost every previous study evaluated processor performance in the absence of faults. Such an analysis captures only the performance impact of the constraints introduced by the fault detection mechanism. One reason this evaluation methodology is so widely used is that faults are expected to be rare enough that overall performance is determined by fault-free behavior. However, evaluating the recovery cost of fault-tolerant execution is also important, because transient hardware faults are predicted to occur more frequently as semiconductor technology advances. Therefore, this paper focuses on recovery from faults.
{"title":"Evaluating the Impact of Fault Recovery on Superscalar Processor Performance","authors":"Toshinori Sato, A. Chiyonobu","doi":"10.1109/PRDC.2006.33","DOIUrl":"https://doi.org/10.1109/PRDC.2006.33","url":null,"abstract":"Current semiconductor technologies have become susceptible to high-energy neutrons from space. Following the trends in smaller transistors, lower supply voltage, and higher clock frequency, current microprocessors are susceptible to soft errors, which constitute the vast majority of hardware failures. Based on these trends, it is expected that the quality with respect to reliability becomes important as well as performance for microprocessors. In light of this, a lot of fault-tolerance microarchitectures are recently proposed. These studies mainly focus on detecting transient faults, and hence almost every previous study evaluated processor performance in the absence of faults. This analysis only presents the performance impact of constraints introduced by fault detection mechanism. One of the reasons why this evaluation methodology is widely selected is that faults are expected to be rare enough that the overall performance is determined by fault-free behavior. However, evaluating recovery cost of fault tolerant execution is also important, because it is predicted that transient hardware faults occur more frequently as semiconductor technology is improved. Therefore, this paper focuses on recovery from faults","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"285 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115291597","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
The general approach to fault tolerance in uniprocessor systems is to build time redundancy into the schedule so that any task instance can be re-executed if a fault occurs during its execution. This paper presents a scheme that adds sufficient, efficiently used time redundancy to the rate-monotonic (RM) scheduling policy for periodic real-time tasks. The scheme can be used to tolerate transient faults during task execution. A tool has been developed to evaluate the performance of this idea.
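One way to see how much time redundancy re-execution requires is a response-time test. The sketch below is a textbook-style analysis and not the scheme proposed in the paper. It assumes deadlines equal to periods, integer execution times and periods, at most one transient fault within any task's response window, and recovery by re-running the affected job in full. The function name and the task tuple layout are hypothetical.

```python
def rm_response_times(tasks, tolerate_one_fault=True):
    """Worst-case response times under rate-monotonic priorities.

    tasks is a list of (wcet, period) pairs with integer values. The result
    follows rate-monotonic order (shortest period first); an entry is None
    when that task would miss its deadline (deadline = period).
    """
    ordered = sorted(tasks, key=lambda task: task[1])
    results = []
    for i, (c_i, t_i) in enumerate(ordered):
        # Reserve time to re-run the longest job at this priority or higher.
        recovery = max(c for c, _ in ordered[: i + 1]) if tolerate_one_fault else 0
        r = c_i + recovery
        while True:
            # Integer ceiling of r / t_j for every higher-priority task.
            interference = sum(-(-r // t_j) * c_j for c_j, t_j in ordered[:i])
            r_next = c_i + recovery + interference
            if r_next > t_i:
                results.append(None)
                break
            if r_next == r:
                results.append(r)
                break
            r = r_next
    return results
```

For the task set [(1, 4), (2, 8)], the fault-free response times are 1 and 3, and reserving recovery time raises them to 2 and 6, which still fit within the periods. The difference is the redundancy consumed in the worst case.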
{"title":"Fault-Tolerant Rate-Monotonic Scheduling Algorithm in Uniprocessor Embedded Systems","authors":"H. Beitollahi, Geert Deconinck","doi":"10.1109/PRDC.2006.35","DOIUrl":"https://doi.org/10.1109/PRDC.2006.35","url":null,"abstract":"The general approach to fault tolerance in uniprocessor systems is to use time redundancy in the schedule so that any task instance can be re-executed in presence of faults during the execution. In this paper a scheme is presented to add enough and efficient time redundancy to the rate-monotonic (RM) scheduling policy for periodic real-time tasks. This scheme can be used to tolerate transient faults during the execution of tasks. For performance evaluation of this idea a tool is developed","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130726647","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Kishor S. Trivedi, Ranjith Vasireddy, David Trindale, S. Nathan, R. Castro
Carrier-grade high availability platforms are designed to enable the development and deployment of highly available services in the telecommunications industry. To build in high availability and to compare, during the design phase, availabilities that differ in the sixth decimal place, fairly detailed stochastic models are needed to evaluate the design and perform design trade-offs. This paper describes an availability model for a high availability platform that uses a three-level hierarchical decomposition mixing reliability block diagrams and Markov chains. The model is built and evaluated using the SHARPE software package. Sensitivity analysis is performed to identify the effects of critical parameters.
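The hierarchical idea of solving small Markov chains for components and then combining their results in a reliability block diagram can be sketched in a few lines. This is an illustration only and not the paper's model or the SHARPE tool. It assumes each component is a two-state (up/down) Markov chain with constant failure and repair rates and that components fail independently. All names and rates are hypothetical.

```python
import math


def two_state_availability(failure_rate, repair_rate):
    """Steady-state availability of an up/down Markov chain."""
    return repair_rate / (failure_rate + repair_rate)


def series(*availabilities):
    """Series block: every part must be up."""
    return math.prod(availabilities)


def parallel(*availabilities):
    """Parallel (redundant) block: at least one part must be up."""
    return 1.0 - math.prod(1.0 - a for a in availabilities)


def downtime_minutes_per_year(availability):
    """Expected yearly downtime for a given steady-state availability."""
    return (1.0 - availability) * 365 * 24 * 60
```

As a usage example, a redundant pair of servers feeding a single switch would be evaluated as series(parallel(a_server, a_server), a_switch), where each a_* comes from two_state_availability. Converting the result with downtime_minutes_per_year makes differences in the sixth decimal place easier to compare, since an unavailability of 1e-6 corresponds to about half a minute per year.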
{"title":"Modeling High Availability","authors":"Kishor S. Trivedi, Ranjith Vasireddy, David Trindale, S. Nathan, R. Castro","doi":"10.1109/PRDC.2006.45","DOIUrl":"https://doi.org/10.1109/PRDC.2006.45","url":null,"abstract":"Carrier grade high availability platforms are designed to enable the development and deployment of highly available services in the telecommunications industry. In order to build-in high availability and compare availabilities that differ in the sixth decimal place during the design phase, fairly detailed stochastic models are needed to evaluate the design and perform design tradeoffs. This paper describes an availability model for a high availability platform using three-level hierarchical decomposition that mixes reliability block diagrams and Markov chains. The model is built and evaluated using the SHARPS software package. Sensitivity analysis is performed to identify the effects of critical parameters","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"44 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115608081","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Modern processors face growing verification and reliability challenges posed by increasing micro-architecture complexity and aggressive technology scaling. While viable approaches have been proposed to address these challenges for uniprocessors, little work has been done for emerging multithreaded processors. Multithreading raises new validation issues because of inter-thread interactions and the inherent complexity of the underlying hardware. We propose an extension of the DIVA approach, which employs a simple checker processor to validate a complex superscalar processor, to perform instruction-level runtime validation of both intra-thread and inter-thread correctness properties for multithreaded execution. We present the validation methodology using a representative simultaneous-multithreaded (SMT) architecture, and briefly discuss its general applicability to other forms of multithreading. Detailed timing simulation shows that this solution incurs a low performance penalty while providing general robustness against both operational and functional errors with relatively small hardware overhead.
{"title":"Dependable Multithreaded Processing Using Runtime Validation","authors":"Kaiyu Chen, S. Malik","doi":"10.1109/PRDC.2006.24","DOIUrl":"https://doi.org/10.1109/PRDC.2006.24","url":null,"abstract":"Modern processors face growing verification and reliability challenges posed by increasing micro-architecture complexity and aggressive technology scaling. While viable approaches have been proposed to address these challenges in the context of uniprocessors, little work has been done for emerging multithreaded processors. Multithreading raises new issues for validation due to inter-thread interactions and inherent complexity of the underlying hardware. We propose an extension of the DIVA approach, which employs a simple checker processor to effectively validate the complex superscalar processor, to perform instruction-level runtime validation for both intra-thread and inter-thread correctness properties for multithreaded execution. We present the validation methodology using a representative simultaneous-multithreaded (SMT) architecture, and briefly discuss its general applicability to other forms of multithreading. Detailed timing simulation shows this solution has low performance penalty, while providing general robustness against both operational and functional errors with relatively small hardware overhead","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"296 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122695858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Over the past 30 years, many software reliability growth models (SRGMs) have been proposed for estimating the reliability growth of software. In practice, effective debugging is not easy because a fault may not be immediately obvious. Some researchers have used an infinite server queueing (ISQ) model to describe software debugging behavior. An infinite-server queueing model is considered in which customers' access to service is controlled by a gate, and the gate is open only if all servers are free. The finite server queueing (FSQ) model, however, can first be advantageously modeled as an infinite-server system. Thus, in this paper, we show how to incorporate both FSQ and ISQ models into software reliability estimation and prediction. In addition, we consider the factor of perfect/imperfect debugging. Experimental results show that the proposed framework, which incorporates both fault detection and correction processes for SRGMs, has a fairly accurate prediction capability.
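The infinite-server view treats each detected fault as a customer that immediately enters "service" (correction), so the expected number of corrected faults lags the expected number of detected faults. The sketch below works this out for one common special case and is not the paper's model. It assumes Goel-Okumoto detection with mean a(1 - exp(-bt)) and exponentially distributed correction times with rate mu, where mu differs from b. The function names and parameters are hypothetical.

```python
import math


def mean_detected(t, a, b):
    """Expected faults detected by time t (assumed Goel-Okumoto form)."""
    return a * (1.0 - math.exp(-b * t))


def mean_corrected(t, a, b, mu):
    """Expected faults corrected by time t under the infinite-server assumption.

    Obtained by integrating the detection intensity a*b*exp(-b*x) against the
    correction-time distribution 1 - exp(-mu*(t - x)) over x in [0, t].
    """
    if math.isclose(mu, b):
        raise ValueError("this closed form assumes mu != b")
    lag = a * b * (math.exp(-b * t) - math.exp(-mu * t)) / (mu - b)
    return mean_detected(t, a, b) - lag
```

The gap mean_detected(t, a, b) - mean_corrected(t, a, b, mu) is the expected number of faults that have been found but not yet fixed at time t. It shrinks as the correction rate mu grows.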
{"title":"Software Reliability Prediction and Assessment Using both Finite and Infinite Server Queueing Approaches","authors":"Wei-Chih Huang, Chin-Yu Huang, C. Sue","doi":"10.1109/PRDC.2006.57","DOIUrl":"https://doi.org/10.1109/PRDC.2006.57","url":null,"abstract":"Over the past 30 years, many software reliability growth models (SRGMs) have been proposed for estimation of reliability growth of software. In fact, effective debugging is not easy because the fault may not be immediately obvious. In the past, some researchers ever used an infinite server queueing (ISO) model to describe the software debugging behavior. An infinite-server queueing model is considered where access of customers to service is controlled by a gate and the gate is open only if all servers are free. However, the finite server queueing (FSQ) model is first advantageously modeled as an infinite-server system. Thus, in this paper, we show how to incorporate both FSQ and ISQ models into software reliability estimation and prediction. In addition, we also consider the factor of perfect/imperfect debugging. Experimental results show that the proposed framework to incorporate both fault detection and correction processes for SRGM has a fairly accurate prediction capability","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"71 5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124843154","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
This paper addresses the conditional probability of undetectable burst errors when two CRC (cyclic redundancy check) codes are used in tandem, given that burst errors outside the guaranteed-detectable range have occurred. Three distinct cases are discussed based on the error positions, and the conditional probability of undetectable errors is derived for each case. Because this probability depends on the exact CRC polynomials selected, an upper-bound assessment is provided to address general cases.
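The dependence on burst length and on the chosen polynomials can be checked by brute force for small generators. The sketch below is not the paper's derivation. It enumerates every burst pattern of a given length (first and last bits set) and counts those that leave a zero remainder for every generator supplied. It makes the simplifying assumption that all CRCs see the same error pattern, which ignores the position-dependent cases the paper distinguishes, and that every generator has a nonzero constant term. The names and the example polynomials are hypothetical.

```python
def poly_mod(value, generator):
    """Remainder of GF(2) polynomial division; integers encode coefficient bits."""
    gen_len = generator.bit_length()
    while value.bit_length() >= gen_len:
        value ^= generator << (value.bit_length() - gen_len)
    return value


def undetected_burst_fraction(generators, burst_len):
    """Fraction of bursts of exactly burst_len bits missed by all generators."""
    if burst_len < 2:
        raise ValueError("a burst needs distinct first and last error bits")
    total = 1 << (burst_len - 2)  # the interior bits are free
    undetected = 0
    for interior in range(total):
        pattern = (1 << (burst_len - 1)) | (interior << 1) | 1
        if all(poly_mod(pattern, g) == 0 for g in generators):
            undetected += 1
    return undetected / total
```

With the degree-3 generator 0b1011 (x^3 + x + 1), bursts of length 3 are always caught, bursts of length 4 are missed with fraction 1/4, and bursts of length 5 with fraction 1/8. Adding a second generator 0b1101 pushes the shortest undetected burst out to length 7, where exactly one of the 32 patterns is a multiple of both polynomials.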
{"title":"Assessment on Undetectable Burst Errors in Tandem CRCs","authors":"Meng-Lai Yin, B. Orenstein","doi":"10.1109/PRDC.2006.20","DOIUrl":"https://doi.org/10.1109/PRDC.2006.20","url":null,"abstract":"This paper addresses the conditional probability for undetectable burst errors when using two CRC (cyclic redundancy check) codes in tandem, given that non-guaranteed-detectable burst errors have occurred. Three distinctive cases are discussed based on the error positions and the conditional probability of undetectable errors under each case is derived. Because this probability depends on the exact CRC polynomials selected, an upper-bound assessment is provided to address general cases","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131588889","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2006-12-18 DOI: 10.1007/978-3-540-71703-4_24
M. Vieira, António Casimiro Costa, H. Madeira
{"title":"Towards Timely ACID Transactions in DBMS","authors":"M. Vieira, António Casimiro Costa, H. Madeira","doi":"10.1007/978-3-540-71703-4_24","DOIUrl":"https://doi.org/10.1007/978-3-540-71703-4_24","url":null,"abstract":"","PeriodicalId":314915,"journal":{"name":"2006 12th Pacific Rim International Symposium on Dependable Computing (PRDC'06)","volume":"96 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2006-12-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125305636","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}