
Proceedings. International Conference on Dependable Systems and Networks: Latest Publications

An evaluation of connectivity in mobile wireless ad hoc networks
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1028890
P. Santi, D. Blough
We consider the following problem for wireless ad hoc networks: assume n nodes, each capable of communicating with nodes within a radius of r, are distributed in a d-dimensional region of side l; how large must the transmitting range r be to ensure that the resulting network is connected? We also consider the mobile version of the problem, in which nodes are allowed to move during a time interval and the value of r ensuring connectedness for a given fraction of the interval must be determined. For the stationary case, we give tight bounds on the relative magnitude of r, n and l yielding a connected graph with high probability in one-dimensional networks, thus solving an open problem. The mobile version of the problem when d=2 is investigated through extensive simulations, which give insight on how mobility affects connectivity and reveal a useful trade-off between communication capability and energy consumption.
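The stationary question lends itself to a simple Monte Carlo check. The sketch below is illustrative only (function names and parameters are not from the paper): it estimates the probability that n uniformly placed nodes with transmitting range r form a connected graph in an l x l square, using union-find over the induced communication graph.

```python
import random

class DSU:
    """Union-find with path halving, used to test graph connectivity."""
    def __init__(self, n):
        self.parent = list(range(n))

    def find(self, x):
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        self.parent[self.find(a)] = self.find(b)

def connectivity_probability(n, r, l, trials=200, seed=0):
    """Estimate P(network connected) for n nodes with range r in an l x l square."""
    rng = random.Random(seed)
    connected = 0
    for _ in range(trials):
        pts = [(rng.uniform(0, l), rng.uniform(0, l)) for _ in range(n)]
        dsu = DSU(n)
        for i in range(n):
            for j in range(i + 1, n):
                dx = pts[i][0] - pts[j][0]
                dy = pts[i][1] - pts[j][1]
                if dx * dx + dy * dy <= r * r:  # within transmitting range
                    dsu.union(i, j)
        root = dsu.find(0)
        if all(dsu.find(i) == root for i in range(n)):
            connected += 1
    return connected / trials
```

Sweeping r for fixed n and l with such a simulator reproduces the sharp transition from disconnected to connected that the paper's bounds characterize analytically.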
Citations: 113
An adaptive framework for tunable consistency and timeliness using replication
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1028882
S. Krishnamurthy, W. Sanders, M. Cukier
One well-known challenge in using replication to service multiple clients concurrently is that of delivering a timely and consistent response to the clients. In this paper, we address this problem in the context of client applications that have specific temporal and consistency requirements. These applications can tolerate a certain degree of relaxed consistency, in exchange for better response time. We propose a flexible QoS model that allows these clients to specify their temporal and consistency constraints. In order to select replicas to serve these clients, we need to control the inconsistency of the replicas, so that we have a large enough pool of replicas with the appropriate state to meet a client's timeliness, consistency, and dependability requirements. We describe an adaptive framework that uses lazy update propagation to control the replica inconsistency and employs a probabilistic approach to select replicas dynamically to service a client, based on its QoS specification. The probabilistic approach predicts the ability of a replica to meet a client's QoS specification by using the performance history collected by monitoring the replicas at runtime. We conclude with experimental results based on our implementation.
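The probabilistic selection idea can be illustrated with a toy sketch (all names here are hypothetical; the paper's framework is considerably richer): predict each replica's ability to meet a client deadline from its monitored response-time history, and keep the replicas whose empirical probability clears the client's QoS threshold.

```python
def meets_deadline_probability(history, deadline):
    """Empirical probability that a replica responds within `deadline`,
    based on its monitored response-time history."""
    if not history:
        return 0.0
    return sum(1 for t in history if t <= deadline) / len(history)

def select_replicas(histories, deadline, min_probability):
    """Return replica ids predicted to meet the client's timeliness spec
    with at least `min_probability`."""
    return sorted(
        rid for rid, hist in histories.items()
        if meets_deadline_probability(hist, deadline) >= min_probability
    )
```

A client asking for a 25 ms response with probability 0.75 would, under this sketch, be routed only to replicas whose recent history supports that prediction.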
Citations: 36
Pinpoint: problem determination in large, dynamic Internet services
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1029005
Mike Y. Chen, Emre Kıcıman, Eugene Fratkin, A. Fox, E. Brewer
Traditional problem determination techniques rely on static dependency models that are difficult to generate accurately in today's large, distributed, and dynamic application environments such as e-commerce systems. We present a dynamic analysis methodology that automates problem determination in these environments by 1) coarse-grained tagging of numerous real client requests as they travel through the system and 2) using data mining techniques to correlate the believed failures and successes of these requests to determine which components are most likely to be at fault. To validate our methodology, we have implemented Pinpoint, a framework for root cause analysis on the J2EE platform that requires no knowledge of the application components. Pinpoint consists of three parts: a communications layer that traces client requests, a failure detector that uses traffic-sniffing and middleware instrumentation, and a data analysis engine. We evaluate Pinpoint by injecting faults into various application components and show that Pinpoint identifies the faulty components with high accuracy and produces few false-positives.
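Pinpoint's data-mining step can be approximated by a toy example (illustrative only; the real system clusters over traced requests): score each component by the Jaccard similarity between the set of failed requests and the set of requests that touched that component, so components used predominantly by failing requests rank highest.

```python
from collections import defaultdict

def rank_suspects(traces):
    """traces: list of (components_used, failed) pairs, one per client request.
    Returns components sorted by a Jaccard-style fault-correlation score."""
    used = defaultdict(int)         # requests that touched each component
    failed_with = defaultdict(int)  # failed requests that touched each component
    n_failed = sum(1 for _, failed in traces if failed)
    for comps, failed in traces:
        for c in set(comps):
            used[c] += 1
            if failed:
                failed_with[c] += 1
    scores = {}
    for c in used:
        # |failed ∩ used(c)| / |failed ∪ used(c)|
        union = n_failed + used[c] - failed_with[c]
        scores[c] = failed_with[c] / union if union else 0.0
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

A component that appears in every failed request and no successful one scores 1.0; components used uniformly by all traffic are diluted toward lower scores, which is what keeps false positives down.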
Citations: 890
Experimental analysis of the errors induced into Linux by three fault injection techniques
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1028917
T. Jarboui, J. Arlat, Y. Crouzet, K. Kanoun
The main goal of the experimental study reported in this paper is to investigate to what extent distinct fault injection techniques lead to similar consequences (errors and failures). The target system we are using to carry out our investigation is the Linux kernel as it provides a representative operating system. It features full controllability and observability thanks to its open source status. Three types of software-implemented fault injection techniques are considered, namely: i) provision of invalid values to the parameters of the kernel calls, ii) corruption of the parameters of the kernel calls, and iii) corruption of the input parameters of the internal functions of the kernel. The workload being used for the experiments is tailored to activate selectively each functional component. The observations encompass typical kernel failure modes (e.g., exceptions and kernel hangs) as well as a detailed analysis of the reported error codes.
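Parameter corruption of this kind is commonly realized as a single-bit flip, and invalid-value provision as substitution of boundary values. A minimal illustration of both injectors (hypothetical helpers, not the authors' actual tooling):

```python
import random

def flip_random_bit(value, width=32, rng=None):
    """Corrupt an integer parameter by flipping one randomly chosen bit
    (parameter-corruption style of injection)."""
    rng = rng or random.Random()
    return value ^ (1 << rng.randrange(width))

def invalid_value(width=32, rng=None):
    """Replace a parameter with a boundary/invalid value
    (invalid-value style of injection)."""
    rng = rng or random.Random()
    return rng.choice([0, -1, (1 << (width - 1)) - 1, -(1 << (width - 1))])
```

In a real campaign such injectors are applied at the system-call or internal-function boundary, and the resulting error codes, exceptions, and hangs are logged per experiment.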
Citations: 37
Recovery and performance balance of a COTS DBMS in the presence of operator faults
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1029007
M. Vieira, H. Madeira
A major cause of failures in large database management systems (DBMS) is operator faults. Although most of the complex DBMS have comprehensive recovery mechanisms, the effectiveness of these mechanisms is difficult to characterize. On the other hand, the tuning of a large database is very complex and database administrators tend to concentrate on performance tuning and disregard the recovery mechanisms. Above all, database administrators seldom have feedback on how good a given configuration is concerning recovery. This paper proposes an experimental approach to characterize both the performance and the recoverability in DBMS. Our approach is presented through a concrete example of benchmarking the performance and recovery of an Oracle DBMS running the standard TPC-C benchmark, extended to include two new elements: a fault load based on operator faults and measures related to recoverability. A classification of operator faults in DBMS is proposed. The paper ends with the discussion of the results and the proposal of guidelines to help database administrators in finding the balance between performance and recovery tuning.
Citations: 26
Dependability and the Grid: issues and challenges
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1028907
R. Schlichting, A. Chien, C. Kesselman, K. Marzullo, J. Plank, S. Shrivastava
For over a decade, researchers involved with scientific computing have been investigating technologies that allow advanced scientific applications to exploit resources associated with machines connected by wide-area networks across large geographical distances. Originally referred to as metacomputing or heterogenous computing, Grid computing is currently the most common term used to describe this type of distributed computing model. Generally speaking, Grid computing emphasizes large scale resource sharing—not only computational cycles, but also software and data— across administrative domains in a flexible, secure, and coordinated fashion. A number of software platforms have been developed that address all or subsets of the challenges associated with Grid computing, including Condor, the Entropia platform, the Globus toolkit, Legion, LSF, Ninf, and Sun’s Grid Engine. While the Grid was originally designed to support scientific applications, there has been significant interest recently in extending the model to support the needs of enterprise computing, including those based on Web services. For example, both IBM and Sun have made the Grid part of their enterprise computing strategies, while the recent Global Grid Forum GGF-4 (http://www.gridforum.org/) included a number of topics related to generalizing the Grid in this way. Part of this effort includes defining an Open Grid Services Architecture (OGSA) that can be used to integrate services within and across enterprises. As might be expected given the difference between scientific and enterprise applications, there are any number of technical issues that must be addressed to accomplish this goal. This panel will focus on one particular challenge associated with Grid computing, that of ensuring dependable operation of Grid computations. Dependability in this context encompasses a broad collection of possible attributes, including availability, reliability, security, and timely execution. 
Among the possible topics for discussion are different dependability requirements of current versus envisioned application scenarios, technical barriers to achieving dependability in both contexts, and architectural issues related to providing appropriate support in software platforms such as OGSA. The overall goal is to bring together the perspectives of individuals working in different communities to identify issues and challenges that remain to be solved to make dependable Grid computing a reality.
Citations: 1
Reliability and availability analysis for the JPL Remote Exploration and Experimentation System
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1028918
Dongyan Chen, S. Dharmaraja, Dongyan Chen, Lei Li, Kishor S. Trivedi, R. Some, A. Nikora
The NASA Remote Exploration and Experimentation (REE) Project, managed by the Jet Propulsion Laboratory, has the vision of bringing commercial supercomputing technology into space, in a form which meets the demanding environmental requirements, to enable a new class of science investigation and discovery. Dependability goals of the REE system are 99% reliability over 5 years and 99% availability. In this paper we focus on the reliability/availability modeling and analysis of the REE system. We carry out this task using fault trees, reliability block diagrams, stochastic reward nets and hierarchical models. Our analysis helps to determine the ranges of parameters for which the REE dependability goal will be met. The analysis also allows us to assess different hardware and software fault-tolerance techniques.
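The reliability-block-diagram ingredient of such hierarchical models reduces to series/parallel combination of component reliabilities. A minimal sketch (illustrative only, not the JPL model): components in series all must work, while redundant components in parallel fail only if all fail.

```python
def series(*reliabilities):
    """Reliability of components in series: the product of the parts."""
    r = 1.0
    for x in reliabilities:
        r *= x
    return r

def parallel(*reliabilities):
    """Reliability of redundant components in parallel:
    1 minus the probability that every branch fails."""
    q = 1.0
    for x in reliabilities:
        q *= (1.0 - x)
    return 1.0 - q
```

Nesting these combinators, e.g. a duplexed processor pair in series with a bus, is the block-diagram layer; fault trees and stochastic reward nets then refine the components whose behavior is not simply series/parallel.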
Citations: 31
An adaptive decomposition approach for the analysis of stochastic Petri nets
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1029010
P. Buchholz
We present a new approximate solution technique for the numerical analysis of superposed generalized stochastic Petri nets (SGSPNs) and related models. The approach combines numerical iterative solution techniques and fixed point computations using the complete knowledge of state space and generator matrix. In contrast to other approximation methods, the proposed method is adaptive by considering states with a high probability in detail and aggregating states with small probabilities. Probabilities are approximated by the results derived during the iterative solution. Thus, a maximum number of states can be predefined and the presented method automatically aggregates states such that the solution is computed using a vector of a size smaller or equal to the maximum. By means of a non-trivial example it is shown that the approach computes good approximations with a low effort for many models.
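Two ingredients of the approach can be sketched in miniature (illustrative only, and far simpler than full SGSPN analysis): an iterative numerical solution for the stationary distribution of a Markov chain, and the marking of low-probability states as candidates for aggregation.

```python
def steady_state(P, tol=1e-12, max_iter=100000):
    """Iteratively compute the stationary distribution of a DTMC
    from its row-stochastic transition matrix P (power iteration)."""
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi

def aggregate_small_states(pi, threshold):
    """Mark states whose current probability estimate falls below the
    threshold, as an adaptive scheme would, returning
    (indices kept in detail, total mass moved into the aggregate)."""
    kept = [i for i, p in enumerate(pi) if p >= threshold]
    return kept, sum(p for p in pi if p < threshold)
```

The adaptive element in the paper is that which states are kept in detail is driven by the probabilities produced during the iteration itself, so the solution vector never exceeds a predefined maximum size.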
Citations: 11
Modeling and quantification of security attributes of software systems
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1028941
B. Madan, K. Goseva-Popstojanova, K. Vaidyanathan, K. Trivedi
Quite often failures in network based services and server systems may not be accidental, but rather caused by deliberate security intrusions. We would like such systems to either completely preclude the possibility of a security intrusion or design them to be robust enough to continue functioning despite security attacks. Not only is it important to prevent or tolerate security intrusions, it is equally important to treat security as a QoS attribute at par with, if not more important than other QoS attributes such as availability and performability. This paper deals with various issues related to quantifying the security attribute of an intrusion tolerant system, such as the SITAR system. A security intrusion and the response of an intrusion tolerant system to the attack is modeled as a random process. This facilitates the use of stochastic modeling techniques to capture the attacker behavior as well as the system's response to a security intrusion. This model is used to analyze and quantify the security attributes of the system. The security quantification analysis is first carried out for steady-state behavior leading to measures like steady-state availability. By transforming this model to a model with absorbing states, we compute a security measure called the "mean time (or effort) to security failure" and also compute probabilities of security failure due to violations of different security attributes.
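A "mean time to security failure" computed on a model with absorbing states is a mean-time-to-absorption calculation on a CTMC: with generator Q and transient-state block Q_TT, solve -Q_TT m = 1 for the vector of expected times m. A self-contained sketch (illustrative generator, not the SITAR model), using a small Gaussian-elimination solver to stay dependency-free:

```python
def mean_time_to_absorption(Q, transient):
    """Q: full CTMC generator matrix (rows sum to 0, absorbing rows all 0).
    transient: indices of transient states.
    Solves -Q_TT m = 1; m[i] is the mean time to absorption from transient[i]."""
    n = len(transient)
    A = [[-Q[transient[i]][transient[j]] for j in range(n)] for i in range(n)]
    b = [1.0] * n
    # Gaussian elimination with partial pivoting
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, n):
            f = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    m = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = b[r] - sum(A[r][c] * m[c] for c in range(r + 1, n))
        m[r] = s / A[r][r]
    return m
```

For a chain good -> vulnerable -> compromised with rates 1 and 2, the mean effort to security failure from the good state is 1 + 1/2 = 1.5, which the solver reproduces.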
{"title":"Modeling and quantification of security attributes of software systems","authors":"B. Madan, K. Goseva-Popstojanova, K. Vaidyanathan, K. Trivedi","doi":"10.1109/DSN.2002.1028941","DOIUrl":"https://doi.org/10.1109/DSN.2002.1028941","url":null,"abstract":"Quite often failures in network based services and server systems may not be accidental, but rather caused by deliberate security intrusions. We would like such systems to either completely preclude the possibility of a security intrusion or design them to be robust enough to continue functioning despite security attacks. Not only is it important to prevent or tolerate security intrusions, it is equally important to treat security as a QoS attribute at par with, if not more important than other QoS attributes such as availability and performability. This paper deals with various issues related to quantifying the security attribute of an intrusion tolerant system, such as the SITAR system. A security intrusion and the response of an intrusion tolerant system to the attack is modeled as a random process. This facilitates the use of stochastic modeling techniques to capture the attacker behavior as well as the system's response to a security intrusion. This model is used to analyze and quantify the security attributes of the system. The security quantification analysis is first carried out for steady-state behavior leading to measures like steady-state availability. By transforming this model to a model with absorbing states, we compute a security measure called the \"mean time (or effort) to security failure\" and also compute probabilities of security failure due to violations of different security attributes.","PeriodicalId":93807,"journal":{"name":"Proceedings. International Conference on Dependable Systems and Networks","volume":"33 1","pages":"505-514"},"PeriodicalIF":0.0,"publicationDate":"2002-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"76356928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 188
Evaluating the impact of different document types on the performance of web cache replacement schemes
Pub Date : 2002-06-23 DOI: 10.1109/DSN.2002.1029017
C. Lindemann, O. P. Waldhorst
In this paper, we present a comprehensive performance study of Least Recently Used and Least Frequently Used with Dynamic Aging as traditional replacement schemes, as well as of the newly proposed schemes Greedy Dual-Size and Greedy Dual. The goal of our study is to understand how these replacement schemes deal with different web document types. Using trace-driven simulation, we present curves plotting the hit rate and byte hit rate broken down for image, HTML, multimedia, and application documents. The presented results show for the first workload that under the packet cost model Greedy Dual outperforms the other schemes in terms of both hit rate and byte hit rate for image, HTML, and multimedia documents. However, the advantages of Greedy Dual diminish when the workload contains more distinct multimedia documents and a larger number of requests to multimedia documents.
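Greedy Dual-Size, one of the newly proposed schemes studied here, admits a compact sketch: every cached document carries a value H = L + cost/size, the minimum-H document is evicted, and the global inflation value L rises to the evicted document's H, so recently refreshed documents survive longer. The class below is a hypothetical illustration of that policy, not the simulator used in the paper; the interface and the toy trace are invented.

```python
class GreedyDualSizeCache:
    """Minimal sketch of the Greedy Dual-Size replacement policy:
    value H = L + cost/size per document, evict the minimum-H
    document, and raise the inflation value L to the evicted H."""

    def __init__(self, capacity):
        self.capacity = capacity      # total bytes available
        self.used = 0
        self.L = 0.0                  # global inflation value
        self.docs = {}                # key -> (H, size, cost)

    def access(self, key, size, cost=1.0):
        if key in self.docs:          # hit: refresh the document's value
            _, size, cost = self.docs[key]
            self.docs[key] = (self.L + cost / size, size, cost)
            return True
        # miss: evict minimum-H documents until the new one fits
        while self.used + size > self.capacity and self.docs:
            victim = min(self.docs, key=lambda k: self.docs[k][0])
            self.L = self.docs[victim][0]        # raise inflation value
            self.used -= self.docs[victim][1]
            del self.docs[victim]
        if size <= self.capacity:
            self.docs[key] = (self.L + cost / size, size, cost)
            self.used += size
        return False

cache = GreedyDualSizeCache(capacity=100)
cache.access("logo.gif", size=40)     # miss
cache.access("index.html", size=50)   # miss
cache.access("logo.gif", size=40)     # hit: H refreshed
cache.access("movie.mpg", size=80)    # miss: evicts both lower-H documents
```

Accesses return True on a hit; with the toy trace above, the 80-byte document ends up as the only resident object, since both smaller documents had lower H values at eviction time.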
{"title":"Evaluating the impact of different document types on the performance of web cache replacement schemes","authors":"C. Lindemann, O. P. Waldhorst","doi":"10.1109/DSN.2002.1029017","DOIUrl":"https://doi.org/10.1109/DSN.2002.1029017","url":null,"abstract":"In this paper, we present a comprehensive performance study of least recently used and least frequently used with dynamic aging as traditional replacement schemes as well as for the newly proposed schemes greedy dual size and greedy dual. The goal of our study constitutes the understanding how these replacement schemes deal with different web document types. Using trace-driven simulation, we present curves plotting the hit rate and byte hit rate broken down for image, HTML, multi media, and application documents. The presented results show for the first workload that under the packet cost model Greedy Dual outperforms the other schemes both in terms of hit rate and byte hit rate for image, HTML, and multi media documents. However, the advantages of Greedy Dual diminish when the workload contains more distinct multi media documents and a larger number of requests to multi media documents.","PeriodicalId":93807,"journal":{"name":"Proceedings. International Conference on Dependable Systems and Networks","volume":"57 1","pages":"717-726"},"PeriodicalIF":0.0,"publicationDate":"2002-06-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"81106705","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 10
Journal
Proceedings. International Conference on Dependable Systems and Networks