
2013 43rd Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN): Latest Papers

EagleEye: Towards mandatory security monitoring in virtualized datacenter environment
Yu-Sung Wu, Pei-Keng Sun, Chun-Chi Huang, Sung-Jer Lu, Syu-Fang Lai, Yi-Yung Chen
The virtualized datacenter (VDC) has become a popular approach to large-scale system consolidation and the enabling technology for infrastructure-as-a-service cloud computing. Consolidation inevitably concentrates the security threats once faced by individual systems onto the VDC, and a VDC operator should remain vigilant against those threats at all times. We envision a need for on-demand mandatory security monitoring of critical guest systems as a means to track and deter security threats that could jeopardize the operation of a VDC. Unfortunately, existing VDC security monitoring mechanisms all require pre-installed guest components to operate, so security monitoring is either left to the discretion of individual tenants or requires costly direct management of guest systems by the VDC operator. We propose the EagleEye approach for on-demand mandatory security monitoring in VDC environments, which does not depend on pre-installed guest components. We implement a prototype on-access anti-virus monitor to demonstrate the feasibility of the EagleEye approach. We also identify challenges particular to this approach and provide a set of solutions meant to strengthen future research in this area.
DOI: https://doi.org/10.1109/DSN.2013.6575300. Published: 2013-06-24.
Citations: 9
SPECTRE: A dependable introspection framework via System Management Mode
Fengwei Zhang, Kevin Leach, Kun Sun, A. Stavrou
Virtual Machine Introspection (VMI) systems have been widely adopted for malware detection and analysis. VMI systems use hypervisor technology for system introspection and to expose malicious activity. However, recent malware can detect the presence of virtualization or corrupt the hypervisor state, thus avoiding detection. We introduce SPECTRE, a hardware-assisted dependability framework that leverages System Management Mode (SMM) to inspect the state of a system. In contrast to VMI, our trusted code base is limited to the BIOS and the SMM implementations. SPECTRE is capable of transparently and quickly examining all layers of running system code, including a hypervisor, the OS, and user-level applications. We demonstrate several use cases of SPECTRE, including heap spray, heap overflow, and rootkit detection, using real-world attacks on Windows and Linux platforms. In our experiments, full inspection with SPECTRE is 100 times faster than similar VMI systems because there is no performance overhead due to virtualization.
DOI: https://doi.org/10.1109/DSN.2013.6575343. Published: 2013-06-24.
Citations: 69
Automating the debugging of datacenter applications with ADDA
Cristian Zamfir, Gautam Altekar, I. Stoica
Debugging data-intensive distributed applications running in datacenters is complex and time-consuming because developers do not have practical ways of deterministically replaying failed executions. Building such tools is hard for two reasons: non-determinism that may be tolerable on a single node is exacerbated in large clusters of interacting nodes, and datacenter applications produce terabytes of intermediate data exchanged by nodes, which makes full input recording infeasible. We present ADDA, a replay-debugging system for datacenters that has lower recording and storage overhead than existing systems. ADDA is based on two techniques. First, ADDA provides control plane determinism, leveraging our observation that many typical datacenter applications consist of a separate “control plane” and “data plane”, and most bugs reside in the former. Second, ADDA does not record “data plane” inputs; instead, it synthesizes them during replay, starting from the application's external inputs, which are typically persisted in append-only storage for reasons unrelated to debugging. We evaluate ADDA and show that it deterministically replays real-world failures in Hypertable and Memcached.
DOI: https://doi.org/10.1109/DSN.2013.6575303. Published: 2013-06-24.
Citations: 8
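The record/replay split described in the ADDA abstract can be illustrated with a minimal sketch. This is not ADDA's implementation: the toy job and the names `run_job`, `record`, and `replay` are hypothetical. The only idea carried over from the abstract is that control-plane decisions are logged, while data-plane values are recomputed during replay from the persisted external inputs.

```python
import random


def run_job(external_input, choose_worker, log=None):
    """Toy job: route each value to a worker, then sum the values per worker.

    The routing decision stands in for the control plane. The per-worker
    sums stand in for data-plane traffic, which is never logged.
    """
    totals = {}
    for index, value in enumerate(external_input):
        worker = choose_worker(index)                    # control-plane decision
        if log is not None:
            log.append(worker)                           # the only thing recorded
        totals[worker] = totals.get(worker, 0) + value   # data plane, recomputed
    return totals


def record(external_input, seed, workers=3):
    """Run once with nondeterministic routing and keep the control log."""
    rng = random.Random(seed)
    log = []
    result = run_job(external_input, lambda _index: rng.randrange(workers), log)
    return result, log


def replay(external_input, log):
    """Re-run from the persisted input while forcing the logged routing."""
    return run_job(external_input, lambda index: log[index])
```

Calling `record(data, seed)` and then `replay(data, log)` on the same `data` should reproduce the original per-worker totals, even though no intermediate sums were stored.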
Redefining web browser principals with a Configurable Origin Policy
Yinzhi Cao, Vaibhav Rastogi, Zhichun Li, Yan Chen, Alexander Moshchuk
With the advent of Web 2.0, web developers have designed multiple additions that break the same-origin policy (SOP) boundary, such as splitting and combining traditional web browser protection boundaries (security principals). However, these newly generated principals lack new labels to represent their security properties. To address this inconsistent-label problem, this paper proposes a new way to define a security principal and its labels in the browser. In particular, we propose a Configurable Origin Policy (COP), in which a browser's security principal is defined by a configurable ID rather than a fixed triple <scheme, host, port>. The server-side and client-side code of a web application can create, join, and destroy its own principals. We perform a formal security analysis on COP to ensure session integrity. We also show that COP is compatible with legacy web sites, and that sites utilizing COP are compatible with legacy browsers.
DOI: https://doi.org/10.1109/DSN.2013.6575317. Published: 2013-06-24.
Citations: 20
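The contrast in the COP abstract between a fixed <scheme, host, port> triple and a configurable ID can be sketched as follows. This is an illustration only, not the paper's design: the `Resource` class, the `origin_id` field, and the fallback to the triple when neither side carries an ID are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def sop_origin(url):
    """Return the classic (scheme, host, port) triple for a URL."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)


@dataclass(frozen=True)
class Resource:
    url: str
    origin_id: Optional[str] = None  # hypothetical configurable principal ID


def same_principal_sop(a, b):
    """Same-origin policy: principals match only if the triples match."""
    return sop_origin(a.url) == sop_origin(b.url)


def same_principal_cop(a, b):
    """Compare configurable IDs; use the triple only if neither has an ID."""
    if a.origin_id is None and b.origin_id is None:
        return same_principal_sop(a, b)  # assumed behavior for legacy content
    return a.origin_id == b.origin_id
```

Under this sketch, two resources on different hosts that share an ID fall into one principal (combining), while two resources on the same host with different IDs are separated (splitting).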
simFI: From single to simultaneous software fault injections
Stefan Winter, Michael Tretter, Benjamin Sattler, N. Suri
Software-implemented fault injection (SWIFI) is an established experimental technique to evaluate the robustness of software systems. While a large number of SWIFI frameworks exist, virtually all are based on a single-fault assumption, i.e., interactions of simultaneously occurring independent faults are not investigated. As software systems containing more than a single fault are often the norm rather than the exception [1] and current safety standards require the consideration of “multi-point faults” [2], the validity of this single-fault assumption is in question for contemporary software systems. To address the issue and support simultaneous SWIFI (simFI), we analyze how independent faults can manifest in a generic software composition model and extend an existing SWIFI tool to support some characteristic simultaneous fault types. We implement three simultaneous fault models and demonstrate their utility in evaluating the robustness of the Windows CE kernel. Our findings indicate that simultaneous fault injections prove highly efficient in triggering robustness vulnerabilities.
DOI: https://doi.org/10.1109/DSN.2013.6575310. Published: 2013-06-24.
Citations: 35
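To show why the single-fault assumption in the simFI abstract can hide robustness problems, the sketch below runs a toy injection campaign with one fault at a time and then with two at once. Everything here is hypothetical (the target `load_setting`, the `inject` and `campaign` helpers, and the "replace an argument with `None`" fault model); it is not one of the paper's three fault models.

```python
from itertools import combinations


def load_setting(primary, backup):
    """Toy target: read 'timeout' from the primary source, else the backup."""
    for source in (primary, backup):
        if isinstance(source, dict) and "timeout" in source:
            return source["timeout"]
    raise RuntimeError("no valid configuration source")


def inject(args, faulty_positions, corrupt=lambda _value: None):
    """Return a copy of args with the chosen positions corrupted."""
    return tuple(
        corrupt(value) if index in faulty_positions else value
        for index, value in enumerate(args)
    )


def campaign(target, args, k):
    """Inject k simultaneous faults in every possible way; list the failures."""
    failures = []
    for positions in combinations(range(len(args)), k):
        try:
            target(*inject(args, set(positions)))
        except Exception:
            failures.append(positions)
    return failures
```

With two valid sources, `campaign(load_setting, args, 1)` finds no failure because the redundancy masks each single fault, while `campaign(load_setting, args, 2)` reports the pair `(0, 1)`.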
Practical automated vulnerability monitoring using program state invariants
Cristiano Giuffrida, L. Cavallaro, A. Tanenbaum
Despite the growing attention to security concerns and advances in code verification tools, many memory errors still escape testing and plague production applications with security vulnerabilities. We present RCORE, an efficient dynamic program monitoring infrastructure to perform automated security vulnerability monitoring. Our approach is to perform extensive static analysis at compile time to automatically index program state invariants (PSIs). At runtime, our novel dynamic analysis continuously inspects the program state and produces a report when PSI violations are found. Our technique retrofits existing applications and is designed for both offline and production runs. To avoid slowing down production applications, we can perform our dynamic analysis on idle cores to detect suspicious behavior in the background. The alerts raised by our analysis are symptoms of memory corruption or other potentially exploitable dangerous behavior. Our experimental evaluation confirms that RCORE can report on several classes of vulnerabilities with very low overhead.
DOI: https://doi.org/10.1109/DSN.2013.6575318. Published: 2013-06-24.
Citations: 16
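The runtime half of the idea in the RCORE abstract, checking a program state against previously indexed invariants and reporting violations, can be sketched in a few lines. This is a simplification under stated assumptions: RCORE derives its invariants by static analysis at compile time, whereas the invariants below are written by hand, and the `Invariant` class, `check` function, and state fields are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Invariant:
    name: str
    holds: Callable[[Dict], bool]  # predicate over a snapshot of program state


# Hand-written stand-ins for invariants a static analysis might index.
INVARIANTS = [
    Invariant("length_within_capacity",
              lambda s: 0 <= s["length"] <= s["capacity"]),
    Invariant("handler_is_known",
              lambda s: s["handler"] in s["known_handlers"]),
]


def check(state: Dict, invariants: List[Invariant]) -> List[str]:
    """Return the names of all invariants that the given state violates."""
    return [inv.name for inv in invariants if not inv.holds(state)]
```

A monitor could call `check` periodically on a background core and raise an alert whenever the returned list is non-empty, which mirrors the reporting step the abstract describes.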
PHYS: Profiled-HYbrid Sampling for soft error reliability benchmarking
Jinho Suh, M. Annavaram, M. Dubois
In this paper, we introduce PHYS (Profiled-HYbrid Sampling), a sampling framework for soft-error benchmarking of caches. Reliability simulations of caches are much more complex than performance simulations and therefore exhibit large simulation slowdowns (two orders of magnitude) over performance simulations. The major problem is that the reliability lifetime of every accessed block must be tracked from beginning to end, on top of simulating the benchmark, in order to track the total number of vulnerability cycles (VCs) between two accesses to the block. Because of the need to track SDCs (silent data corruptions) and to distinguish between true and false DUEs (detected but unrecoverable errors), vulnerability cycles cannot be truncated when data is written back from cache to main memory. Vulnerability cycles must be maintained even during a block's sojourn in main memory to track whether corrupted values in a block are used by the processor, until program termination. PHYS solves this problem by sampling intervals between accesses to each memory block, instead of sampling the execution of the processor in a time interval as is classically done in performance simulations. First, a statistical profiling phase captures the distribution of VCs for every block. This profiling step provides a statistical guarantee of the minimum sampling rate of access intervals needed to meet a desired FIT error target with a given confidence interval. Then, per-cache-set sampling rates are dynamically adjusted to sample VCs with higher merit. We compare PHYS with many other possible sampling methods, some of which are widely used to accelerate performance-centric simulations but have also been applied in the past to track reliability lifetime. We demonstrate the superiority of PHYS in the context of reliability benchmarking through exhaustive evaluations of various sampling techniques.
DOI: https://doi.org/10.1109/DSN.2013.6575352. Published: 2013-06-24.
Citations: 5
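The core move in the PHYS abstract, sampling the intervals between accesses to a block rather than time slices of execution, can be illustrated with a rough sketch. It makes simplifying assumptions that are not taken from the paper: an interval counts as vulnerable only if it ends in a read, every interval is sampled at one fixed rate, and each sampled interval is scaled by `1 / rate`. The profiling phase and the per-cache-set rate adjustment are not modeled, and all function names are hypothetical.

```python
import random


def access_intervals(trace):
    """Yield (block, length, ends_in_read) for consecutive accesses to a block.

    trace: iterable of (cycle, block, op) tuples sorted by cycle, op "R" or "W".
    """
    last_seen = {}
    for cycle, block, op in trace:
        if block in last_seen:
            yield block, cycle - last_seen[block], op == "R"
        last_seen[block] = cycle


def exact_vulnerability_cycles(trace):
    """Sum the lengths of all intervals assumed vulnerable (ending in a read)."""
    return sum(length for _block, length, is_read in access_intervals(trace)
               if is_read)


def sampled_vulnerability_cycles(trace, rate, seed=0):
    """Estimate the same total from a random sample of the access intervals."""
    rng = random.Random(seed)
    total = 0.0
    for _block, length, is_read in access_intervals(trace):
        sampled = rng.random() < rate          # keep this interval or skip it
        if sampled and is_read:
            total += length / rate             # scale up to stay unbiased
    return total
```

With `rate=1.0` every interval is kept and the estimate equals the exact total; lower rates trade accuracy for less tracking work, which is the trade-off the abstract's statistical guarantee is meant to bound.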
Lilliput meets brobdingnagian: Data center systems management through mobile devices
S. Bagchi, F. Arshad, Jan S. Rellermeyer, T. Osiecki, M. Kistler, A. Gheith
In this paper, we put forward the notion that systems management for large masses of virtual machines in data centers is going to be done differently in the short- to medium-term future: through smartphones and through controlled crowdsourcing to a variety of experts within an organization, rather than by dedicated system administrators alone. We lay out the research and practitioner challenges this model raises and give some preliminary solution directions that are being developed, here at IBM and elsewhere.
DOI: https://doi.org/10.1109/DSN.2013.6575327. Published: 2013-06-24.
Citations: 1
Application-driven TCP recovery and non-stop BGP
Robert Surton, K. Birman, R. V. Renesse
Some network protocols tie application state to underlying TCP connections, leading to unacceptable service outages when an endpoint loses TCP state during fail-over or migration. For example, BGP ties forwarding tables to its control plane connections, so the failure of a BGP endpoint can lead to widespread routing disruption even if it recovers all of its state except what was encapsulated by its TCP implementation. Although techniques exist for recovering TCP state transparently, they make assumptions that do not hold for applications such as BGP. We introduce application-driven TCP recovery, a technique that separates application recovery from TCP recovery. We evaluate our prototype, TCPR, and show that it outperforms existing BGP recovery techniques.
DOI: https://doi.org/10.1109/DSN.2013.6575313. Published: 2013-06-24.
Citations: 13
Why is my smartphone slow? On the fly diagnosis of underperformance on the mobile Internet
Chaitrali Amrutkar, M. Hiltunen, T. Jim, Kaustubh R. Joshi, O. Spatscheck, Patrick Traynor, Shobha Venkataraman
The perceived end-to-end performance of the mobile Internet can be impacted by multiple factors including websites, devices, and network components. Constant changes in these factors and network complexity make identifying root causes of high latency difficult. In this paper, we propose a multidimensional diagnosis technique using passive IP flow data collected at ISPs for investigating factors that impact the performance of the mobile Internet. We implement and evaluate our technique over four days of data from a major US cellular provider's network. Our approach identifies several combinations of factors affecting performance. We investigate four combinations in depth to confirm the latency causes chosen by our technique. Our findings include a popular gaming website showing poor performance on a specific device type for over 50% of the flows, and web browser traffic on older devices accounting for 99% of poorly performing traffic. Our technique can direct operators in choosing factors having high impact on latency in the mobile Internet.
DOI: https://doi.org/10.1109/DSN.2013.6575301 (published 2013-06-24)
Citations: 16