2012 IEEE 31st Symposium on Reliable Distributed Systems: Latest Papers

Exploring Compile Time Caching of Explicit Queries in Programming Codes
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.27
Venkata Krishna Suhas Nerella, S. Madria, T. Weigert
Object-oriented programming languages have raised the level of abstraction by incorporating first-class query constructs explicitly in program code. These query constructs allow programmers to express operations over collections as object queries, and they provide efficient query execution by using query optimization strategies from the database domain. However, when a query is repeated in the program, it is executed afresh. This paper presents an approach to reduce the execution time of programs involving explicit queries by caching the results of repeated queries and incrementally maintaining the cached results. We propose determining cache entries at compile time through program analysis. We also describe the cache heuristics for determining which queries to cache.
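The idea of caching a repeated query and maintaining its result incrementally can be illustrated with a minimal sketch. This is not the authors' implementation: the paper selects cache entries at compile time through program analysis, whereas the hypothetical `CachedQuery` class below caches at run time, and its names (`run`, `on_add`, `on_remove`) are assumptions made for illustration only.

```python
class CachedQuery:
    """Hypothetical cache for one filter query over a list-backed collection."""

    def __init__(self, source, predicate):
        self.source = source          # the underlying collection
        self.predicate = predicate    # the query's selection condition
        self.result = None            # cached result, None until first run
        self.evaluations = 0          # how many full evaluations were needed

    def run(self):
        # Evaluate the query from scratch only on the first call.
        if self.result is None:
            self.result = [x for x in self.source if self.predicate(x)]
            self.evaluations += 1
        return list(self.result)

    def on_add(self, item):
        # Incremental maintenance: update the cached result instead of re-running.
        self.source.append(item)
        if self.result is not None and self.predicate(item):
            self.result.append(item)

    def on_remove(self, item):
        self.source.remove(item)
        if self.result is not None and self.predicate(item):
            self.result.remove(item)


numbers = [1, 2, 3, 4]
evens = CachedQuery(numbers, lambda x: x % 2 == 0)
first = evens.run()          # full evaluation
evens.on_add(6)
evens.on_add(7)
evens.on_remove(2)
second = evens.run()         # served from the maintained cache
```

In this sketch the second call to `run` performs no scan of the collection; whether such maintenance pays off depends on how often the query repeats relative to how often the collection changes, which is the kind of trade-off the abstract's cache heuristics would address.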
Citations: 0
From Backup to Hot Standby: High Availability for HDFS
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.33
Andrew Oriani, Islene C. Garcia
Cluster-based distributed file systems generally have a single master to service clients and manage the namespace. Although simple and efficient, that design compromises availability, because the failure of the master takes the entire system down. Before version 2.0.0-alpha, the Hadoop Distributed File System (HDFS) was an example of such a system. HDFS is an open-source storage system widely used by applications that operate over large datasets, such as MapReduce, and for which 24x7 uptime is becoming essential. Given that scenario, this paper proposes a hot standby for the HDFS master, achieved by (i) extending the master's state replication performed by its checkpointer helper, the Backup Node, and (ii) introducing an automatic failover mechanism. Step (i) took advantage of the message duplication technique developed by another high-availability solution for HDFS, named Avatar Nodes. Step (ii) employed another piece of Hadoop software: ZooKeeper, a distributed coordination service. That approach resulted in small code changes (1373 lines) and required no components external to the Hadoop project, thus easing the maintenance and deployment of the file system. Compared to HDFS 0.21, tests showed that, in loads dominated by either metadata operations or I/O operations, the reduction in data throughput is no more than 15% on average, and the time to switch the hot standby to active is less than 100 ms. Those results demonstrate the applicability of our solution to real systems. We also present related work on high availability for other file systems and HDFS, including the official solution, recently included in HDFS 2.0.0-alpha.
Citations: 31
Response Time Reliability in Cloud Environments: An Empirical Study of n-Tier Applications at High Resource Utilization
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.61
Qingyang Wang, Yasuhiko Kanemasa, Jack Li, D. Jayasinghe, Motoyuki Kawaba, C. Pu
When running mission-critical web-facing applications (e.g., electronic commerce) in cloud environments, predictable response time, e.g., as specified in service level agreements (SLAs), is a major performance reliability requirement. Through extensive measurements of n-tier application benchmarks in a cloud environment, we study three factors that significantly impact the predictability of application response time: bursty workloads (typical of web-facing applications), soft resource management strategies (e.g., global thread pool or local thread pool), and bursts in system software consumption of hardware resources (e.g., Java Virtual Machine garbage collection). Using a set of profit-based performance criteria derived from typical SLAs, we show that response time reliability is brittle, with large response time variations (on the order of several seconds) depending on each one of those factors. For example, for the same workload and hardware platform, modest increases in workload burstiness may result in profit drops of more than 50%. Our results show that profit-based performance criteria may contribute significantly to the successful delimitation of performance unreliability boundaries and thus support effective management of clouds.
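To make the notion of a profit-based performance criterion concrete, here is a small sketch of how an SLA might map response times to earnings. The thresholds, earnings, and penalty are invented for illustration and are not taken from the paper; the function names `request_profit` and `total_profit` are likewise hypothetical.

```python
# Hypothetical SLA: (response-time threshold in seconds, earning per request).
# These numbers are assumptions for illustration, not values from the paper.
SLA_TIERS = ((0.5, 1.0), (2.0, 0.5))
SLA_PENALTY = -1.0   # charged when a request misses every threshold


def request_profit(response_time_s, tiers=SLA_TIERS, penalty=SLA_PENALTY):
    """Return the earning (or penalty) for a single request."""
    for threshold, earning in tiers:
        if response_time_s <= threshold:
            return earning
    return penalty


def total_profit(response_times_s):
    """Sum the per-request profit over a set of measured response times."""
    return sum(request_profit(t) for t in response_times_s)


steady = [0.2] * 10                 # every request is fast
bursty = [0.2] * 6 + [3.0] * 4      # a burst pushes four requests past 2 s

steady_profit = total_profit(steady)
bursty_profit = total_profit(bursty)
```

Under a step-shaped criterion like this one, a handful of multi-second responses can erase much of the profit even when the average response time still looks acceptable, which is why such criteria are sensitive to the response time variations the abstract describes.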
Citations: 20
Model-Driven Comparison of State-Machine-Based and Deferred-Update Replication Schemes
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.44
P. Wojciechowski, Tadeusz Kobus, Maciej Kokociński
In this paper, we analyze and experimentally compare state-machine-based and deferred-update (or transactional) replication, both relying on atomic broadcast. We define a model that describes the upper and lower bounds on the execution of concurrent requests by a service replicated using either scheme. The model is parametrized by the degree of parallelism in either scheme, the number of processor cores, and the type of requests. We analytically compare both schemes and a non-replicated service, considering atomic-broadcast-dominant and request-execution-dominant workloads. To evaluate transactional replication experimentally, we developed Paxos STM, a novel fault-tolerant distributed software transactional memory with programming constructs for transaction creation, abort, and retry. For state-machine-based replication, we used JPaxos. Both systems share the same implementation of atomic broadcast based on the Paxos algorithm. We present the results of a performance evaluation of both replication schemes and of a non-replicated (thus prone to failures) service, considering various workloads. The key result of our theoretical and experimental work is that neither system is superior in all cases. We discuss these results in the paper.
Citations: 23
Performance Issue Diagnosis for Online Service Systems
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.49
Qiang Fu, Jian-Guang Lou, Qingwei Lin, Rui Ding, D. Zhang, Zihao Ye, Tao Xie
Monitoring and diagnosing the performance issues of an online service system is critical to ensuring satisfactory performance of the system. Given a detected performance issue and the system metrics collected for an online service system, engineers usually need to spend considerable effort on diagnosis, first identifying performance issue beacons, which are metrics that point to the root causes. To reduce this manual effort, we propose a new approach for effectively detecting performance issue beacons to help with performance issue diagnosis. Our approach includes techniques for mining system metric data that address the limitations of previous classification-based approaches. Our evaluations in both a controlled environment and a real production environment show that our approach identifies performance issue beacons from system metric data more effectively than previous approaches.
Citations: 26
Securing a Wireless Networked Control System Using Information Fusion
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.65
Brijesh Kashyap Chejerla, S. Madria
The security of a wireless sensor network largely governs its usability in many applications. In particular, in applications such as industrial control systems that use NCS and SCADA systems, security affects the stability of the system. We propose an information fusion scheme that allows us to profile the different attacks on wireless sensor networks and to study their effects on the control system's stability and feedback. We use Bayesian networks to obtain hypotheses as outputs, which form the decisions. These decisions are based on the feature extraction and estimation process of the overall information fusion scheme. This allows us to ensure that the WNCS works smoothly, without aberrations, even under the influence of security attacks. In this paper, we explain the process that we employ to ensure the stability and security of the system.
Citations: 3
Towards Identifying Root Causes of Faults in Service-Based Applications
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.78
Christian Inzinger, W. Hummer, B. Satzger, P. Leitner, S. Dustdar
In this paper we study fault localization techniques for identification of incompatible configurations and implementations in service-based applications. We propose an approach using pooled decision trees for localization of faulty service parameter and binding configurations, explicitly addressing temporary and changing fault conditions.
Citations: 3
S-Paxos: Offloading the Leader for High Throughput State Machine Replication
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.66
M. Biely, Zarko Milosevic, Nuno Santos, A. Schiper
Implementations of state machine replication predominantly use variants of Paxos or other leader-based protocols. Typically these protocols are also leader-centric, in the sense that the leader performs more work than the non-leader replicas. Such protocols scale poorly, because as the number of replicas or the load on the system increases, the leader replica quickly reaches the limits of one of its resources. In this paper we show that much of the work performed by the leader in a leader-centric protocol can in fact be evenly distributed among all the replicas, leaving the leader with only minimal additional workload. This is done (i) by distributing the work of handling client communication among all replicas, (ii) by disseminating client requests among replicas in a distributed fashion, and (iii) by executing the ordering protocol on ids. We derive a variant of Paxos incorporating these ideas. Compared to leader-centric protocols, our protocol not only achieves significantly higher throughput for any given number of replicas, but also increases its throughput with the number of replicas.
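The separation between disseminating requests and ordering ids, points (ii) and (iii) above, can be pictured with a toy sketch. It is only an illustration under strong assumptions: there is no networking, no failure handling, and no consensus. The `order_ids` function is a placeholder that sorts ids deterministically where a real system would run the ordering protocol, and all class and function names are hypothetical rather than taken from S-Paxos.

```python
class Replica:
    """Hypothetical replica: stores disseminated payloads, executes decided ids."""

    def __init__(self, name):
        self.name = name
        self.payloads = {}    # request id -> request payload
        self.executed = []    # payloads in the decided order

    def store(self, req_id, payload):
        self.payloads[req_id] = payload

    def execute(self, decided_ids):
        # Execution looks payloads up locally; the ordering layer never carried them.
        for req_id in decided_ids:
            self.executed.append(self.payloads[req_id])


def disseminate(replicas, req_id, payload):
    """Stand-in for dissemination: the accepting replica forwards to all replicas."""
    for replica in replicas:
        replica.store(req_id, payload)


def order_ids(pending_ids):
    """Placeholder for the ordering protocol. It sees only ids, never payloads.
    Sorting is an assumption used here to get a deterministic order."""
    return sorted(pending_ids)


replicas = [Replica("r0"), Replica("r1"), Replica("r2")]
disseminate(replicas, "c2:1", "PUT y=2")   # request accepted by some replica
disseminate(replicas, "c1:1", "PUT x=1")   # request accepted by another replica
decided = order_ids(["c2:1", "c1:1"])
for replica in replicas:
    replica.execute(decided)
```

The point of the sketch is that the component deciding the order handles only short ids, while the bulk of the data moves between replicas without passing through a single node.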
Citations: 79
An End-to-End Security Auditing Approach for Service Oriented Architectures
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.5
M. Azarmi, B. Bhargava, Pelin Angin, R. Ranchal, Norman Ahmed, A. Sinclair, M. Linderman, L. B. Othmane
Service-Oriented Architecture (SOA) is becoming a major paradigm for distributed application development amid the recent explosion of Internet services and cloud computing. However, SOA introduces new security challenges not present in single-hop client-server architectures, because multiple service providers are involved in a service request. The interactions of independent service domains in SOA could violate service policies or SLAs. In addition, users of SOA systems have no control over what happens in the chain of service invocations. Although establishing trust across all involved partners is a prerequisite for secure interactions, a new end-to-end security auditing mechanism is still needed to verify the actual service invocations and their conformance to the expected service orchestration. In this paper, we provide an efficient solution for end-to-end security auditing in SOA. The proposed security architecture introduces two new components, called taint analysis and trust broker, in addition to taking advantage of the WS-Security and WS-Trust standards. The interaction of these components maintains session auditing and dynamic trust among services. This solution is transparent to the services, which allows auditing of legacy services without modification. Moreover, we have implemented a prototype of the proposed approach and verified its effectiveness in a LAN setting and on the Amazon EC2 cloud computing infrastructure.
Citations: 20
Providing Uniform Reliable Broadcast Delivery for Mobile Ad Hoc Networks with MANET Liveness Property
Pub Date : 2012-10-08 DOI: 10.1109/SRDS.2012.53
J. Brzeziński, M. Kalewski, Jacek Kobusinski
The MANET liveness property ensures that no operative host in an ad hoc network is permanently isolated, and a few crash-tolerant broadcast protocols have been proposed for networks that fulfill this property. However, the protocols proposed so far guarantee only that at least an arbitrary majority of operative hosts receives each disseminated message; one of them has been further modified to fulfill the properties of regular reliable broadcast. Moreover, it has been proved that the minimum time of direct connectivity between hosts, and thus the correctness of all these protocols, depends on the total number of hosts in the network and on the total number of messages that each host can disseminate concurrently. In this paper, we propose a novel uniform reliable broadcast protocol that works correctly even when the minimum time of a direct connection between hosts allows them to exchange as few as two messages, which makes the correctness of this protocol independent of the total number of messages that can be disseminated by all nodes in the network.
Citations: 3