Self-adaptive systems are widely recognized as the future of computer systems. Due to their dynamic and evolving nature, the characterization of self-adaptation and resilience attributes is of utmost importance. The problem is that there is currently no practical way to characterize self-adaptation capabilities or to compare alternative solutions with respect to resilience. In this paper we discuss the problem of resilience benchmarking of self-adaptive systems. We start by identifying a set of key challenges and then propose a research roadmap to tackle those challenges.
{"title":"Benchmarking the Resilience of Self-Adaptive Systems: A New Research Challenge","authors":"Raquel Almeida, H. Madeira, M. Vieira","doi":"10.1109/SRDS.2010.50","DOIUrl":"https://doi.org/10.1109/SRDS.2010.50","url":null,"abstract":"Self-adaptive systems are widely recognized as the future of computer systems. Due to their dynamic and evolving nature, the characterization of self-adaptation and resilience attributes is of utmost importance. The problem is that nowadays there is no practical way to characterize self-adaptation capabilities or to compare alternative solutions concerning resilience. In this paper we discuss the problem of resilience benchmarking of self-adaptive systems. We start by identifying a set of key challenges and then propose a research roadmap to tackle those challenges.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"16 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125254778","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Service Oriented Architecture (SOA) is an architectural pattern providing agility to align technical solutions to modular business services that are decoupled from service consumers. Service capabilities such as interface options, quality of service (QoS), throughput, security and other constraints are described in the Service Level Agreement (SLA), which would typically be published in the service registry (UDDI) for use by consumers and/or mediation mechanisms. For mobile data streaming applications, problems arise when a service provider's SLA attributes cannot be mapped one-to-one to the service consumers (e.g., a 150 MB/sec video stream provider serving a 5 MB/sec data consumer). In this paper we present a generic framework prototype for managing and disseminating streaming data within an SOA environment as an alternative to custom service implementations based upon specific consumers or data types. Based on this framework, we implemented a set of services: Stream Discovery Service, Stream Multiplexor/Demultiplexor (routing) Service, Stream Brokering Service, Stream Repository Service and Stream Filtering Service to demonstrate the flexibility of such a streaming data framework within an SOA environment.
{"title":"Towards Mobile Data Streaming in Service Oriented Architecture","authors":"Norman Ahmed, M. Linderman, Jason Bryant","doi":"10.1109/SRDS.2010.45","DOIUrl":"https://doi.org/10.1109/SRDS.2010.45","url":null,"abstract":"Service Oriented Architecture (SOA) is an architectural pattern providing agility to align technical solutions to modular business services that are decoupled from service consumers. Service capabilities such as interface options, quality of service (QoS), throughput, security and other constraints are described in the Service Level Agreement (SLA) that would typically be published in the service registry (UDDI) for use by consumers and/or mediation mechanisms. For mobile data streaming applications, problems arise when a service provider’s SLA attributes cannot be mapped one-to-one to the service consumers (i.e. 150MB/sec video stream service provider to 5MB/sec data consumer). In this paper we present a generic framework prototype for managing and disseminating streaming data within a SOA environment as an alternative to custom service implementations based upon specific consumers or data types. 
Based on this framework, we implemented a set of services: Stream Discovery Service, Stream Multiplexor / Demultiplexor(routing) Service, Stream Brokering Service, Stream Repository Service and Stream Filtering Service to demonstrate the flexibility of such a streaming data framework within SOA environment.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"410 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125402049","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
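The SLA rate mismatch the abstract highlights (a provider emitting 150 MB/sec to a consumer limited to 5 MB/sec) boils down to rate adaptation inside the mediation layer. The following credit-based adapter is a minimal, hypothetical sketch of that idea, not the paper's actual Stream Brokering Service:

```python
def adapt_stream(chunk_sizes, consumer_rate):
    """Token-bucket-style adaptation of a stream to a consumer's SLA rate.

    chunk_sizes: MB produced by the provider at each tick. Each tick accrues
    `consumer_rate` MB of credit; a chunk is forwarded only if the credit
    covers it, otherwise it is dropped (a real broker might transcode instead).
    Returns the MB actually delivered per tick.
    """
    credit = 0.0
    delivered = []
    for size in chunk_sizes:
        credit += consumer_rate          # credit accrued this tick
        if size <= credit:               # enough credit: forward the chunk
            credit -= size
            delivered.append(size)
        else:                            # otherwise drop it
            delivered.append(0.0)
    return delivered
```

With a 5 MB/sec SLA, full 150 MB chunks never fit the budget, while a stream already shaped to 5 MB chunks passes through unchanged.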
Computing systems are becoming the heart of modern technology, implementing critical tasks that were previously demanded of humans and that imply human interaction. This highlights the problem of dependability in computer science contexts. High availability computing/clusters are a possible solution in such cases, implementing standby redundancy as a trade-off between dependability and costs. From the engineering perspective, this implies the use of specific techniques and tools for adequately evaluating the reliability/availability of high availability clusters, also taking into account dependencies among nodes (standby, repair, etc.) and the effect of wear and tear on such nodes, especially when failure and repair times are not exponentially distributed. The solution proposed in this paper is based on the use of phase-type distributions and Kronecker algebra. In fact, we represent the reliability and maintainability of each component by specific phase-type distributions, whose interactions describe the system availability. The latter is thus modeled by an expanded Markov chain expressed in terms of Kronecker algebra, in order to face the state space explosion problem of expansion techniques and to represent the memory policies related to the aging process. More specifically, the paper first details the technique and then applies it to the evaluation of a standby redundant system representing a high availability cluster, taken as an example with the aim of demonstrating its effectiveness. Moreover, in order to show the potential of the technique, different maintenance strategies are evaluated and compared.
{"title":"Availability Assessment of HA Standby Redundant Clusters","authors":"S. Distefano, F. Longo, M. Scarpa","doi":"10.1109/SRDS.2010.37","DOIUrl":"https://doi.org/10.1109/SRDS.2010.37","url":null,"abstract":"Computing systems are becoming the heart of modern technology, implementing critical tasks usually demanded to and implying human interactions. This highlights the problem of dependability in computer science contexts. High availability computing/clusters is a possible solution in such cases, implementing standby redundancy as a trade-off between dependability and costs. From the engineering perspective, this implies the use of specific techniques and tools for adequately evaluating the reliability/availability of high availability clusters, also taking into account dependencies among nodes (standby, repair, etc.) and the effect of wear and tear into such nodes, especially when failure and repair times are not exponentially distributed. The solution proposed in this paper is based on the use of phase type distributions and Kronecker algebra. In fact, we represent the reliability and maintainability of each component by specific phase type distributions, whose interactions describe the system availability. This latter is thus modeled by an expanded Markov chain expressed in terms of Kronecker algebra in order to face the state space explosion problem of expansion techniques and to represent the memory policies related to the aging process. More specifically, the paper firstly details the technique and then applies it to the evaluation of a standby redundant system representing a high availability cluster taken as example with the aim of demonstrating its effectiveness. 
Moreover, in order to show the potentiality of the technique, different maintenance strategies are evaluated and therefore compared.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"84 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124131796","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
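The core idea above, composing per-component failure/repair models with Kronecker algebra and reading availability off the joint Markov chain, can be illustrated numerically. This sketch uses plain exponential (rather than phase-type) distributions for brevity, so the simplifications are ours, not the paper's:

```python
import numpy as np

def kron_sum(Q1, Q2):
    """Generator of two independent CTMCs evolving jointly (Kronecker sum)."""
    n1, n2 = Q1.shape[0], Q2.shape[0]
    return np.kron(Q1, np.eye(n2)) + np.kron(np.eye(n1), Q2)

def steady_state(Q):
    """Solve pi @ Q = 0 subject to sum(pi) = 1 via least squares."""
    n = Q.shape[0]
    A = np.vstack([Q.T, np.ones(n)])
    b = np.zeros(n + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi

lam, mu = 0.01, 1.0                        # failure / repair rates (exponential here)
Q = np.array([[-lam, lam], [mu, -mu]])     # one node: state 0 = up, state 1 = down
QQ = kron_sum(Q, Q)                        # joint chain: (up,up), (up,down), (down,up), (down,down)
pi = steady_state(QQ)
availability = 1.0 - pi[3]                 # cluster is up unless both nodes are down
```

For independent exponential nodes this reproduces the closed form 1 − (λ/(λ+μ))²; phase-type failure and repair times would replace each 2-state block with a larger sub-generator, which is where the Kronecker representation pays off.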
One way to implement fault-tolerant applications is to store their current state in stable memory and, when a failure occurs, restart the application from the last globally consistent state. If the number of simultaneous failures is expected to be small, a diskless checkpointing approach can be used, in which a failed process's state can be determined by accessing only non-faulty processes' memory. In the literature, diskless checkpointing is usually based on synchronous protocols or on properties of the application. In this paper we present a quasi-synchronous diskless checkpointing algorithm, called RDT-Diskless, based on Rollback-Dependency Trackability. The proposed algorithm includes a garbage collection approach that limits the number of checkpoints that must be kept in memory. A framework, called Cheops, was developed and experimental results were obtained from a commercial cloud environment.
{"title":"Diskless Checkpointing with Rollback-Dependency Trackability","authors":"R. Menderico, Islene C. Garcia","doi":"10.1109/SRDS.2010.17","DOIUrl":"https://doi.org/10.1109/SRDS.2010.17","url":null,"abstract":"One way to implement fault tolerant applications is storing its current state in stable memory and, when a failure occurs, restart the application from the last global consistent state. If the number of simultaneous failures is expected to be small a diskless checkpointing approach can be used, where a failed process’s state can be determined only accessing non-faulty process’s memory. In the literature diskless checkpointing is usually based on synchronous protocols or properties of the application. In this paper we present a quasi-synchronous diskless checkpointing algorithm, called RDT-Diskless, based on Rollback-Dependency Trackability. The proposed algorithm includes a garbage collection approach that limits the number of checkpoints that must be kept in memory. A framework, called Cheops, was developed and experimental results were obtained from a commercial cloud environment.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"49 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115050361","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
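To illustrate the diskless idea (a failed process's state recovered purely from other processes' memory), here is a minimal XOR-parity sketch, tolerating a single failure; the actual RDT-Diskless protocol and its quasi-synchronous checkpoint coordination are far richer than this:

```python
def parity_checkpoint(states):
    """XOR parity over equal-length in-memory process states (bytes)."""
    parity = bytearray(len(states[0]))
    for state in states:
        for i, b in enumerate(state):
            parity[i] ^= b
    return bytes(parity)

def recover(parity, surviving_states):
    """Rebuild the single failed process's state from parity plus survivors."""
    lost = bytearray(parity)
    for state in surviving_states:
        for i, b in enumerate(state):
            lost[i] ^= b
    return bytes(lost)
```

Because XOR is its own inverse, XOR-ing the parity block with every surviving state cancels their contributions and leaves exactly the lost state; no disk access is needed.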
Gossip protocols are an efficient and reliable way to disseminate information. These protocols nevertheless have a drawback: they are unable to limit the dissemination of spam messages. Indeed, messages are redundantly disseminated in the network, and it is enough for a small subset of nodes to forward spam messages to have them received by a majority of nodes. In this paper, we present FireSpam, a gossiping protocol that is able to limit spam dissemination. FireSpam organizes nodes in a ladder topology, where nodes highly capable of filtering spam are at the top of the ladder, whereas nodes with a low spam filtering capability are at the bottom. Messages are disseminated from the bottom of the ladder to its top. The ladder thus acts as a progressive spam filter. In order to make it usable in practice, we designed FireSpam in the BAR model. This model takes into account selfish and malicious behaviors. We evaluate FireSpam using simulations. We show that it drastically limits the dissemination of spam messages, while still ensuring reliable dissemination of good messages.
{"title":"FireSpam: Spam Resilient Gossiping in the BAR Model","authors":"Sonia Ben Mokhtar, Alessio Pace, Vivien Quéma","doi":"10.1109/SRDS.2010.33","DOIUrl":"https://doi.org/10.1109/SRDS.2010.33","url":null,"abstract":"Gossip protocols are an efficient and reliable way to disseminate information. These protocols have nevertheless a drawback: they are unable to limit the dissemination of spam messages. Indeed, messages are redundantly disseminated in the network and it is enough that a small subset of nodes forward spam messages to have them received by a majority of nodes. In this paper, we present FireSpam, a gossiping protocol that is able to limit spam dissemination. FireSpam organizes nodes in a ladder topology, where nodes highly capable of filtering spam are at the top of the ladder, whereas nodes with a low spam filtering capability are at the bottom of the ladder. Messages are disseminated from the bottom of the ladder to its top. The ladder does thus act as a progressive spam filter. In order to make it usable in practice, we designed FireSpam in the BAR model. This model takes into account selfish and malicious behaviors. We evaluate FireSpam using simulations. We show that it drastically limits the dissemination of spam messages, while still ensuring reliable dissemination of good messages.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"142 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132309934","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
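The ladder-as-progressive-filter mechanism can be sketched in a few lines: a message climbs the levels bottom-up, and each level drops spam with its own detection probability. This toy model ignores the gossip exchanges and the BAR-model incentives of the real protocol:

```python
import random

def disseminate(is_spam, detect_probs, rng):
    """Climb the ladder bottom-up; each level drops spam with its own probability.

    detect_probs are ordered from the bottom (weak filters) to the top (strong
    filters). Good messages are always forwarded. Returns True if the message
    reaches the top of the ladder.
    """
    for p in detect_probs:
        if is_spam and rng.random() < p:
            return False                     # this level filtered the spam out
    return True
```

Even with modest per-level detection probabilities, the probability that spam survives every level shrinks multiplicatively with ladder height, while good messages always pass.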
Fatemeh Borran, Martin Hutle, Nuno Santos, A. Schiper
We introduce the notion of a swift algorithm. Informally, an algorithm that solves repeated consensus is swift if, in a partially synchronous run of this algorithm, eventually no timeout expires, i.e., the algorithm execution proceeds at the actual speed of the system. This definition differs from other efficiency criteria for partially synchronous systems. Furthermore, we show that the notion of swiftness explains why failure-detector-based algorithms are typically more efficient than round-based algorithms: the former are naturally swift while the latter are naturally non-swift. We show that this is not an inherent difference between the models, and provide a round implementation that is swift, therefore performing similarly to failure detector algorithms while maintaining the advantages of the round model.
{"title":"Swift Algorithms for Repeated Consensus","authors":"Fatemeh Borran, Martin Hutle, Nuno Santos, A. Schiper","doi":"10.1109/SRDS.2010.18","DOIUrl":"https://doi.org/10.1109/SRDS.2010.18","url":null,"abstract":"We introduce the notion of a swift algorithm. Informally, an algorithm that solves the repeated consensus is swift if, in a partial synchronous run of this algorithm, eventually no timeout expires, i.e., the algorithm execution proceeds with the actual speed of the system. This definition differs from other efficiency criteria for partial synchronous systems. Furthermore, we show that the notion of swiftness explains why failure detector based algorithms are typically more efficient than round-based algorithms, since the former are naturally swift while the latter are naturally non-swift. We show that this is not an inherent difference between the models, and provide a round implementation that is swift, therefore performing similarly to failure detector algorithms while maintaining the advantages of the round model.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"19 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115475850","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
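The notion of swiftness can be made concrete with a toy round loop: progress is driven by the arrival of n − f messages, and the timeout is only a fallback, so in well-behaved runs no timeout ever expires. This is our illustration of the concept, not the paper's algorithm:

```python
def run_round(inbox, n, f, timeout_ticks):
    """One communication round: deliver as soon as n - f messages arrive.

    inbox holds the lists of messages received at each tick. The timeout is
    only a fallback; when enough messages arrive early, the round completes
    at the actual speed of the system and no timeout expires (swiftness).
    Returns (received_messages, timed_out).
    """
    received = []
    for tick, msgs in enumerate(inbox):
        received.extend(msgs)
        if len(received) >= n - f:
            return received, False           # progress driven by arrivals
        if tick + 1 >= timeout_ticks:
            return received, True            # fallback: the timeout expired
    return received, True
```

A non-swift round implementation would instead always wait the full `timeout_ticks`, pacing the system at the timeout rather than at message speed.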
Testing web services for robustness is an effective way of disclosing software bugs. However, when executing robustness tests, a very large number of service responses has to be manually classified to distinguish regular responses from responses that indicate robustness problems. Besides requiring a large amount of time and effort, this complex classification process can easily lead to errors resulting from the human intervention in such a laborious task. Text classification algorithms have been applied successfully in many contexts (e.g., spam identification, text categorization, etc.) and are considered a powerful tool for the successful automation of several classification-based tasks. In this paper we present a study on the applicability of five widely used text classification algorithms in the context of web services robustness testing. In practice, we assess the effectiveness of Support Vector Machines, Naïve Bayes, Large Linear Classification, K-nearest neighbor (IBk), and Hyperpipes in classifying web services responses. Results indicate that these algorithms can be effectively used to automate the identification of robustness issues while reducing human intervention. However, in all mechanisms there are cases of misclassified responses, which means that there is room for improvement.
{"title":"Applying Text Classification Algorithms in Web Services Robustness Testing","authors":"N. Laranjeiro, R. Oliveira, M. Vieira","doi":"10.1109/SRDS.2010.36","DOIUrl":"https://doi.org/10.1109/SRDS.2010.36","url":null,"abstract":"Testing web services for robustness is an effective way of disclosing software bugs. However, when executing robustness tests, a very large amount of service responses has to be manually classified to distinguish regular responses from responses that indicate robustness problems. Besides requiring a large amount of time and effort, this complex classification process can easily lead to errors resulting from the human intervention in such a laborious task. Text classification algorithms have been applied successfully in many contexts (e.g., spam identification, text categorization, etc) and are considered a powerful tool for the successful automation of several classification-based tasks. In this paper we present a study on the applicability of five widely used text classification algorithms in the context of web services robustness testing. In practice, we assess the effectiveness of Support Vector Machines, Naïve Bayes, Large Linear Classification, K-nearest neighbor (Ibk), and Hyperpipes in classifying web services responses. Results indicate that these algorithms can be effectively used to automate the identification of robustness issues while reducing human intervention. 
However, in all mechanisms there are cases of misclassified responses, which means that there is space for improvement.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"55 1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124865672","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
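As a flavor of how such automated response classification works, here is a self-contained multinomial Naïve Bayes sketch over toy service responses; the paper itself evaluates mature implementations of five algorithms, not this simplified one:

```python
import math
from collections import Counter, defaultdict

def train_nb(samples):
    """Train a multinomial Naive Bayes model on (text, label) pairs."""
    word_counts = defaultdict(Counter)   # per-label word frequencies
    label_counts = Counter()             # class priors
    vocab = set()
    for text, label in samples:
        words = text.lower().split()
        label_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, label_counts, vocab

def classify(model, text):
    """Pick the label maximizing the smoothed log-likelihood of the text."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    best, best_score = None, float("-inf")
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)  # add-one smoothing
        for w in text.lower().split():
            score += math.log((word_counts[label][w] + 1) / denom)
        if score > best_score:
            best, best_score = label, score
    return best
```

Trained on a handful of labeled responses (say, stack traces as "failure" and confirmations as "regular"), the classifier routes unseen responses to the closer class, which is exactly the manual step the paper seeks to automate.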
Although the technology and applications of wireless sensor networks have advanced greatly over the last years, ensuring dependable real-time operation despite faults and temporal uncertainties is still an ongoing research topic. The problems are particularly significant when considering that future applications will interact with their environment not only for supervision or monitoring, but also to directly control physical (real-time) entities, sometimes with safety-critical requirements. We believe that reasoning in terms of data validity might be a good way to approach the problem. The ability to know whether sensor data flowing in the system is valid – data validity awareness – is a first step towards achieving dependable operation. But more than that, it should be possible to ensure, given requirements for data validity throughout the operation, a dependable perception of the environment. In this paper we discuss the problem, analyzing some of the issues that need to be addressed to achieve these goals. In particular, we introduce fundamental concepts and relevant definitions, elaborate on the main impediments to achieving data validity awareness, and describe relevant means to deal with these impediments. Finally, we address the issue of ensuring dependable perception and present some research ideas in this direction.
{"title":"Data Validity and Dependable Perception in Networked Sensor-Based Systems","authors":"Luis Marques, A. Casimiro","doi":"10.1109/SRDS.2010.52","DOIUrl":"https://doi.org/10.1109/SRDS.2010.52","url":null,"abstract":"Although the technology and applications of wireless sensor networks have greatly increased over the last years, ensuring a dependable real-time operation despite faults and temporal uncertainties is still an on-going research topic. The problems are particularly significant when considering that future applications will interact with their environment not only for supervision or monitoring, but also to directly control physical (real-time) entities, sometimes with safety-critical requirements. We believe that reasoning in terms of data validity might be a good way to approach the problem. The ability to know if sensor data flowing in the system is valid – data validity awareness –, is a first step to achieve a dependable operation. But more than that, it should be possible to ensure, given requirements for data validity throughout the operation, a dependable perception of the environment. In this paper we essentially discuss the problem, analyzing some of the issues that need to be addressed to achieve these goals. Particularly, we introduce fundamental concepts and relevant definitions, we elaborate on the main impediments to achieve data validity awareness and describe relevant means to deal with these impediments. 
Finally, we address the issue of ensuring a dependable perception and present some research ideas in this direction.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130113483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
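One simple way to make "data validity" concrete, assuming a known bound on how fast the measured quantity can change, is to derive a worst-case error interval from the age of a reading. The functions below are our illustration of the concept, not definitions from the paper:

```python
def validity_interval(value, age, max_rate):
    """Interval guaranteed to contain the current value of the measured entity.

    A reading taken `age` time units ago, combined with a physical bound
    `max_rate` on how fast the entity can change, bounds the present truth.
    """
    drift = max_rate * age
    return (value - drift, value + drift)

def is_valid(age, max_rate, tolerance):
    """Data is still valid if its worst-case error fits the application's tolerance."""
    return max_rate * age <= tolerance
```

The interval widens as the reading ages, so validity degrades gracefully; an application can state its tolerance and the system can decide, per reading, whether perception is still dependable.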
Haifeng Chen, Guofei Jiang, Hui Zhang, K. Yoshihira
With the growing scale of current computing systems, traditional configuration tuning methods become less effective because they usually assume a small number of parameters in the system. In order to handle the scalability issue of configuration tuning, this paper proposes a cooperative optimization framework, which mimics the behavior of team play to discover the optimal configuration setting in computing systems. We follow a ‘best of the best’ rule to decompose the tuning task into a number of small subtasks with manageable size and complexity. While each decomposed module is responsible for the optimization of its own configuration parameters, all the modules share the performance evaluations of new samples as common feedback to enhance their optimization objectives. As a result, the quality of generated samples improves during the search, and the cooperative sampling eventually discovers the optimal configurations in the system. Experimental results demonstrate that our proposed cooperative optimization can identify better solutions within limited time periods compared with other state-of-the-art configuration search methods. This advantage becomes more significant when the number of configuration parameters increases.
{"title":"A Cooperative Sampling Approach to Discovering Optimal Configurations in Large Scale Computing Systems","authors":"Haifeng Chen, Guofei Jiang, Hui Zhang, K. Yoshihira","doi":"10.1109/SRDS.2010.21","DOIUrl":"https://doi.org/10.1109/SRDS.2010.21","url":null,"abstract":"With the growing scale of current computing systems, traditional configuration tuning methods become less effective because they usually assume a small number of parameters in the system. In order to handle the scalability issue of configuration tuning, this paper proposes a cooperative optimization framework, which mimics the behavior of team playing to discover the optimal configuration setting in computing systems. We follow a ‘best of the best’ rule to decompose the tuning task into a number of small subtasks with manageable size and complexity. While each decomposed module is responsible for the optimization of its own configuration parameters, all the modules share the performance evaluations of new samples as common feedbacks to enhance their optimization objectives. As a result, the qualities of generated samples become improved during the search, and the cooperative sampling will eventually discover the optimal configurations in the system. Experimental results demonstrate that our proposed cooperative optimization can identify better solutions within limited time periods compared with other state of the art configuration search methods. 
Such advantage becomes more significant when the number of configuration parameters increases.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125416186","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
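The ‘best of the best’ decomposition can be sketched as follows: each module perturbs only its own parameter group, while every full-configuration evaluation is shared and improvements replace a common incumbent. This is an illustrative simplification of the paper's cooperative sampling:

```python
import random

def cooperative_search(objective, n_params, groups, iters, rng):
    """Cooperative configuration search: modules tune disjoint parameter groups.

    Each module perturbs only its own parameters, but every evaluation of the
    full configuration is shared; improvements replace the common incumbent.
    Minimizes `objective`.
    """
    best = [0.0] * n_params
    best_score = objective(best)
    for _ in range(iters):
        for group in groups:                 # each module takes its turn
            candidate = best[:]
            for i in group:                  # perturb only this module's parameters
                candidate[i] += rng.uniform(-1.0, 1.0)
            score = objective(candidate)
            if score < best_score:           # shared feedback across modules
                best, best_score = candidate, score
    return best, best_score
```

Each subtask searches a space of manageable dimension, yet the shared incumbent means no module ever works against another's improvements, which is the point of the cooperative design.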
A. Bondavalli, F. Brancati, A. Ceccarelli, M. Vadursi
A software clock capable of self-evaluating its synchronization uncertainty is experimentally validated for a specific implementation on a node synchronized through NTP. The validation methodology takes advantage of an external node equipped with a GPS-synchronized clock acting as a reference, which is connected to the node hosting the system under test through a fast Ethernet connection. Experiments are carried out for different values of the software clock parameters and different types of workload, and address the possible occurrence of faults in the system under test and in the NTP synchronization mechanism. The validation methodology is designed to be as unintrusive as possible and to guarantee a resolution on the order of a few hundred microseconds. The experimental results show very good performance of R&SAClock, and their analysis gives valuable hints for further improvements.
{"title":"Experimental Validation of a Synchronization Uncertainty-Aware Software Clock","authors":"A. Bondavalli, F. Brancati, A. Ceccarelli, M. Vadursi","doi":"10.1109/SRDS.2010.35","DOIUrl":"https://doi.org/10.1109/SRDS.2010.35","url":null,"abstract":"A software clock capable of self-evaluating its synchronization uncertainty is experimentally validated for a specific implementation on a node synchronized through NTP. The validation methodology takes advantage of an external node equipped with a GPS-synchronized clock acting as a reference, which is connected to the node hosting the system under test through a fast Ethernet connection. Experiments are carried out for different values of the software clock parameters and different types of workload, and address the possible occurrence of faults in the system under test and in the NTP synchronization mechanism. The validation methodology is designed to be as less intrusive as possible and to grant a resolution of the order of few hundreds of microseconds. The experimental results show very good performance of R&SAClock, and their analysis gives precious hints for further improvements.","PeriodicalId":219204,"journal":{"name":"2010 29th IEEE Symposium on Reliable Distributed Systems","volume":"57 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2010-10-31","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126687954","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
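The idea of a clock that reports its own synchronization uncertainty can be sketched as follows; the class name and linear drift model are illustrative assumptions, not the R&SAClock implementation:

```python
import time

class UncertaintyAwareClock:
    """A clock that returns its own synchronization uncertainty with each read.

    After each synchronization the uncertainty restarts at `sync_error` and
    grows linearly with the assumed worst-case oscillator drift rate.
    (Illustrative model; not the R&SAClock API.)
    """
    def __init__(self, drift_rate, sync_error, now=time.monotonic):
        self.drift_rate = drift_rate         # e.g. 50e-6 for a 50 ppm oscillator
        self.sync_error = sync_error         # residual error right after a sync
        self._now = now                      # injectable time source, eases testing
        self.last_sync = now()

    def synchronize(self):
        """Record an external synchronization event (e.g. an NTP adjustment)."""
        self.last_sync = self._now()

    def read(self):
        """Return (timestamp, uncertainty): true time lies within +/- uncertainty."""
        t = self._now()
        return t, self.sync_error + self.drift_rate * (t - self.last_sync)
```

A validation setup like the paper's would compare each `(timestamp, uncertainty)` pair against a GPS-disciplined reference and check that the true time indeed falls inside the self-reported interval.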