Enno Ruijters, Dennis Guck, M. Noort, M. Stoelinga
Maintenance is an important way to increase system dependability: timely inspections, repairs and renewals can significantly increase a system's reliability, availability and lifetime. At the same time, maintenance incurs costs and planned downtime. Thus, good maintenance planning has to balance these factors. In this paper, we study the effect of different maintenance strategies on the electrically insulated railway joint (EI-joint), a critical asset in railroad tracks for train detection, and a relatively frequent cause of train disruptions. Together with experts in maintenance engineering, we have modeled the EI-joint as a fault maintenance tree (FMT), i.e., a fault tree augmented with maintenance aspects. We show how complex maintenance concepts, such as condition-based maintenance with periodic inspections, are naturally modeled by FMTs, and how several key performance indicators, such as the system reliability, number of failures, and costs, can be analysed. The faithfulness of quantitative analyses depends heavily on the accuracy of the parameter values in the models. Here, we have been in the unique situation that extensive data could be collected, both from incident registration databases and from interviews with domain experts from several companies. This allowed us to construct a model that faithfully predicts the expected number of failures at system level. Our analysis shows that the current maintenance policy is close to cost-optimal. It is possible to increase joint reliability, e.g. by performing more inspections, but the additional maintenance costs outweigh the reduced cost of failures.
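The trade-off the analysis explores (more frequent inspections catch degradation earlier but cost more) can be illustrated with a small Monte Carlo sketch. All rates and costs below are hypothetical placeholders, not the paper's calibrated FMT parameters:

```python
import random

def simulate(interval, horizon=25.0, runs=2000, seed=1):
    """Toy cost model: a component silently degrades (exponential onset),
    then fails some exponential time later unless a periodic inspection
    catches it first and triggers a preventive repair."""
    INSPECT, REPAIR, FAILURE = 0.2, 5.0, 50.0   # illustrative unit costs
    rng = random.Random(seed)
    cost = fails = 0.0
    for _ in range(runs):
        t = 0.0
        cost += (horizon / interval) * INSPECT  # inspections over the horizon
        while t < horizon:
            onset = t + rng.expovariate(1 / 4.0)     # degradation begins
            fail = onset + rng.expovariate(1 / 4.0)  # unrepaired -> failure
            insp = (int(onset // interval) + 1) * interval  # next inspection
            if min(insp, fail) >= horizon:
                break
            if insp < fail:           # caught in time: preventive repair
                cost += REPAIR
                t = insp
            else:                     # failure before the inspection
                cost += FAILURE
                fails += 1
                t = fail
    return cost / runs, fails / runs

# Sweeping the interval exposes a cost-optimal inspection frequency:
results = {dt: simulate(dt) for dt in (0.5, 1.0, 2.0, 4.0, 8.0)}
```

Inspecting very often drives failures toward zero but inflates inspection cost; inspecting rarely does the opposite, which is exactly the balance the FMT analysis quantifies on real data.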
"Reliability-Centered Maintenance of the Electrically Insulated Railway Joint via Fault Tree Analysis: A Practical Experience Report." Enno Ruijters, Dennis Guck, M. Noort, M. Stoelinga. In: 2016 46th Annual IEEE/IFIP International Conference on Dependable Systems and Networks (DSN), June 2016. doi:10.1109/DSN.2016.67
Parity-based RAID poses a design trade-off for large-scale SSD storage systems: it improves reliability against SSD failures through redundancy, yet its parity updates incur extra I/Os and garbage collection operations, thereby degrading the endurance and performance of SSDs. We propose EPLOG, a storage layer that reduces parity traffic to SSDs, so as to provide endurance, reliability, and performance guarantees for SSD RAID arrays. EPLOG mitigates parity update overhead via elastic parity logging, which redirects parity traffic to separate log devices (to improve endurance and reliability) and eliminates the need to pre-read data in parity computations (to improve performance). We design EPLOG as a user-level implementation that is fully compatible with commodity hardware and general erasure coding schemes. We evaluate EPLOG through reliability analysis and trace-driven testbed experiments. Compared to the Linux software RAID implementation, our experimental results show that our EPLOG prototype reduces the total write traffic to SSDs, reduces the number of garbage collection operations, and increases the I/O throughput. In addition, EPLOG significantly improves the I/O performance over the original parity logging design, and incurs low metadata overhead.
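The core I/O saving can be sketched by contrasting the conventional read-modify-write parity update with a logged parity delta. This is a simplified illustration, not EPLOG's design: real EPLOG computes parity over new data without pre-reads, whereas the in-memory copy below merely stands in for the log contents.

```python
def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))

class RMWArray:
    """Conventional RAID-5-style update: each data write pre-reads the old
    data and rewrites the parity block in place on the SSDs."""
    def __init__(self, n, blk=4):
        self.data = [bytes(blk) for _ in range(n)]
        self.parity = bytes(blk)
        self.ssd_writes = self.ssd_reads = 0

    def write(self, i, new):
        old = self.data[i]; self.ssd_reads += 1         # pre-read old data
        self.parity = xor(self.parity, xor(old, new))   # in-place parity update
        self.data[i] = new
        self.ssd_writes += 2                            # data + parity to SSD

class ElasticLogArray(RMWArray):
    """Sketch of the parity-logging idea: parity deltas go to a separate
    log device, so the SSDs see only the data write; parity is merged
    later in one batch."""
    def __init__(self, n, blk=4):
        super().__init__(n, blk)
        self.log = []                     # separate log device (e.g. HDD)

    def write(self, i, new):
        self.log.append(xor(self.data[i], new))  # delta, no SSD parity I/O
        self.data[i] = new
        self.ssd_writes += 1                     # data only

    def merge(self):
        for delta in self.log:                   # fold deltas into parity
            self.parity = xor(self.parity, delta)
        self.log.clear()
        self.ssd_writes += 1                     # one batched parity write
```

After a merge both layouts hold identical parity, but the logged variant issued far fewer SSD writes, which is the endurance win.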
"Elastic Parity Logging for SSD RAID Arrays." Yongkun Li, H. Chan, P. Lee, Yinlong Xu. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.14
Georgios Mappouras, Alireza Vahid, A. Calderbank, Daniel J. Sorin
Motivated by embedded systems and datacenters that require long-life components, we extend the lifetime of Flash memory using rewriting codes that allow for multiple writes to a page before it needs to be erased. Although researchers have previously explored rewriting codes for this purpose, we make two significant contributions beyond prior work. First, we remove the assumption of idealized -- and unrealistically optimistic -- Flash cells used in prior work on endurance codes. Unfortunately, current Flash technology has a non-ideal interface, due to its underlying physical design, and does not, for example, allow all seemingly possible increases in a cell's level. We show how to provide the ideal multi-level cell interface, by developing a virtual Flash cell, and we evaluate its impact on existing endurance codes. Our second contribution is our development of novel endurance codes, called Methuselah Flash Codes (MFC), that provide better cost/lifetime trade-offs than previously studied codes.
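The flavor of rewriting code involved can be illustrated with the classic two-write WOM (write-once memory) code of Rivest and Shamir, which stores two data bits twice in three one-way cells (bits may only be raised 0 to 1 between erases), doubling the writes per erase cycle. MFC itself is more elaborate; this is just the textbook building block:

```python
# Codebooks of the Rivest-Shamir two-write WOM code.
FIRST  = {0: (0, 0, 0), 1: (1, 0, 0), 2: (0, 1, 0), 3: (0, 0, 1)}
SECOND = {0: (1, 1, 1), 1: (0, 1, 1), 2: (1, 0, 1), 3: (1, 1, 0)}

def decode(cells):
    # The generation is recoverable from the cell weight (<=1: first write).
    table = FIRST if sum(cells) <= 1 else SECOND
    return next(v for v, c in table.items() if c == cells)

def write(cells, value):
    """Raise cells to encode `value`; returns the new cell state."""
    if sum(cells) <= 1:                      # first generation
        if decode(cells) == value:
            return cells                     # already encodes it
        target = SECOND[value] if any(cells) else FIRST[value]
    else:                                    # already second generation
        target = SECOND[value]
    if any(o > n for o, n in zip(cells, target)):
        raise ValueError("cells must be erased first")
    return target
```

Each second-generation codeword covers every first-generation codeword of a different value, so the update never needs to lower a cell; a third distinct value forces an erase, which is the point where a rewriting code's budget is spent.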
"Methuselah Flash: Rewriting Codes for Extra Long Storage Lifetime." Georgios Mappouras, Alireza Vahid, A. Calderbank, Daniel J. Sorin. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.25
H. Alemzadeh, Daniel Chen, Xiao Li, T. Kesavadas, Z. Kalbarczyk, R. Iyer
This paper demonstrates targeted cyber-physical attacks on teleoperated surgical robots. These attacks exploit vulnerabilities in the robot's control system to infer critical times during surgery and inject malicious control commands into the robot at those moments. We show that these attacks can evade the safety checks of the robot, lead to catastrophic consequences in the physical system (e.g., sudden jumps of robotic arms or the system's transition to an unwanted halt state), and cause patient injury, robot damage, or system unavailability in the middle of a surgery. We present a model-based analysis framework that can estimate the consequences of control commands through real-time computation of the robot's dynamics. Our experiments on the RAVEN II robot demonstrate that this framework can detect and mitigate the malicious commands before they manifest in the physical system with an average accuracy of 90%.
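A toy plausibility check in the spirit of dynamic-model-based vetting might look as follows. The actual framework computes the RAVEN II's full dynamics; the limits, names, and single-joint model here are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class JointState:
    q: float    # joint position (rad)
    dq: float   # joint velocity (rad/s)

def is_safe_command(state, q_cmd, dt=0.01, v_max=1.0, a_max=20.0):
    """Flag commands whose implied motion exceeds what the arm's dynamics
    could legitimately produce within one control period (toy limits).
    An injected 'sudden jump' demands an impossible velocity."""
    v_need = (q_cmd - state.q) / dt        # velocity the command demands
    a_need = (v_need - state.dq) / dt      # acceleration it demands
    return abs(v_need) <= v_max and abs(a_need) <= a_max

state = JointState(q=0.10, dq=0.10)
ok_step = is_safe_command(state, 0.101)    # small incremental motion
jump = is_safe_command(state, 0.30)        # injected jump command
```

A command failing the check would be dropped or clamped before reaching the actuators, mirroring the paper's mitigate-before-manifestation goal.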
"Targeted Attacks on Teleoperated Surgical Robots: Dynamic Model-Based Detection and Mitigation." H. Alemzadeh, Daniel Chen, Xiao Li, T. Kesavadas, Z. Kalbarczyk, R. Iyer. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.43
System-level detection and mitigation of DRAM failures offer a variety of system enhancements, such as better reliability, scalability, energy, and performance. Unfortunately, system-level detection is challenging for DRAM failures that depend on the data content of neighboring cells (data-dependent failures). DRAM vendors internally scramble/remap the system-level address space. Therefore, testing data-dependent failures using neighboring system-level addresses does not actually test the cells that are physically adjacent. In this work, we argue that one promising way to uncover data-dependent failures in the system is to determine the location of physically neighboring cells in the system address space. Unfortunately, if done naively, such a test takes 49 days to detect neighboring addresses even in a single memory row, making it infeasible in real systems. We develop PARBOR, an efficient system-level technique that determines the locations of the physically neighboring DRAM cells in the system address space and uses this information to detect data-dependent failures. To our knowledge, this is the first work that solves the challenge of detecting data-dependent failures in DRAM in the presence of DRAM-internal scrambling of system-level addresses. We experimentally demonstrate the effectiveness of PARBOR using 144 real DRAM chips from three major vendors. Our experimental evaluation shows that PARBOR 1) detects neighboring cell locations with only 66-90 tests, a 745,654X reduction compared to the naive test, and 2) uncovers 21.9% more failures compared to a random-pattern test that is unaware of the neighbor cell locations. We introduce a new mechanism that utilizes PARBOR to reduce refresh rate based on the data content of memory locations, thereby improving system performance and efficiency. 
We hope that our fast and efficient system-level detection technique enables other new ideas and mechanisms that improve the reliability, performance, and energy efficiency of DRAM-based memory systems.
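Why logical-address adjacency fails under vendor scrambling can be shown with a toy simulation (this is not PARBOR's actual detection algorithm; the scrambling map, weak-cell model, and row size are all invented):

```python
ROW = 16
phys_of = [(l * 5) % ROW for l in range(ROW)]   # stand-in scrambling map
logical_of = {p: l for l, p in enumerate(phys_of)}
WEAK = 5                                        # physical index of a weak cell

def read_row(logical_bits):
    """Simulated read-back: the weak cell flips to 1 when both of its
    *physical* neighbors store 1 (a toy data-dependent failure model)."""
    phys = [0] * ROW
    for l, b in enumerate(logical_bits):
        phys[phys_of[l]] = b
    if phys[WEAK - 1] == 1 and phys[WEAK + 1] == 1:
        phys[WEAK] = 1
    return [phys[phys_of[l]] for l in range(ROW)]

def run_test(neighbor_map):
    """Write 0 to each victim, 1 to its believed neighbors, read back."""
    found = []
    for victim in range(ROW):
        bits = [0] * ROW
        for n in neighbor_map[victim]:
            bits[n] = 1
        if read_row(bits)[victim] != 0:
            found.append(victim)
    return found

# Naive test: assume logically adjacent addresses are physical neighbors.
naive = {v: [x for x in (v - 1, v + 1) if 0 <= x < ROW] for v in range(ROW)}
# Neighbor-aware test: here we simply invert the scrambling map, standing
# in for the neighbor locations that PARBOR detects experimentally.
aware = {logical_of[p]: [logical_of[x] for x in (p - 1, p + 1) if 0 <= x < ROW]
         for p in range(ROW)}
```

The naive test never places 1s next to the weak cell in the physical layout, so the failure stays hidden; the neighbor-aware test exposes it with a single pattern per victim.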
"PARBOR: An Efficient System-Level Technique to Detect Data-Dependent Failures in DRAM." S. Khan, Donghyuk Lee, O. Mutlu. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.30
Rekeying refers to the operation of replacing an existing encryption key with a new one. It renews security protection, so as to protect against key compromise and enable dynamic access control in cryptographic storage. However, it is non-trivial to realize efficient rekeying in encrypted deduplication storage systems, which use deterministic content-derived encryption keys to allow deduplication on ciphertexts. We design and implement REED, a rekeying-aware encrypted deduplication storage system. REED builds on a deterministic version of all-or-nothing transform (AONT), such that it enables secure and lightweight rekeying, while preserving the deduplication capability. We propose two REED encryption schemes that trade off performance against security, and extend REED for dynamic access control. We implement a REED prototype with various performance optimization techniques. Our trace-driven testbed evaluation shows that our REED prototype maintains high performance and storage efficiency.
"Rekeying for Encrypted Deduplication Storage." Jingwei Li, Chuan Qin, P. Lee, Jin Li. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.62
Xin Hu, Jiyong Jang, M. Stoecklin, Ting Wang, D. Schales, Dhilung Kirat, J. Rao
Sophisticated cyber security threats, such as advanced persistent threats, rely on infecting endpoints within a targeted security domain and embedding malware. Typically, such malware periodically reaches out to the command-and-control infrastructures controlled by adversaries. Such callback behavior, called beaconing, is challenging to detect because (a) detection requires long-term temporal analysis of communication patterns at several levels of granularity, (b) malware authors employ various strategies to hide beaconing behavior, and (c) it is also employed by legitimate applications (such as update checks). In this paper, we develop a comprehensive methodology to identify stealthy beaconing behavior from network traffic observations. We use an 8-step filtering approach to iteratively refine and eliminate legitimate beaconing traffic and pinpoint malicious beaconing cases for in-depth investigation and takedown. We provide a systematic evaluation of our core beaconing detection algorithm and conduct a large-scale evaluation of web proxy data (more than 30 billion events) collected over a 5-month period at a corporate network comprising over 130,000 end-user devices. Our findings indicate that our approach reliably exposes malicious beaconing behavior, which may be overlooked by traditional security mechanisms.
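One ingredient of such a filtering pipeline, a periodicity test on contact timestamps, can be sketched as follows. This is a toy check, not BAYWATCH's actual algorithm, and the threshold is arbitrary:

```python
import statistics

def looks_like_beacon(timestamps, max_cv=0.2, min_gaps=3):
    """Toy periodicity test: near-constant inter-contact gaps give a low
    coefficient of variation, even in the presence of jitter, whereas
    human-driven traffic produces bursty, high-variance gaps."""
    gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
    if len(gaps) < min_gaps:
        return False
    cv = statistics.stdev(gaps) / statistics.mean(gaps)
    return cv <= max_cv

# A host calling back every ~300 s with jitter vs. ordinary bursty browsing:
beacon = [i * 300 + j for i, j in enumerate((0, 4, -3, 6, -1, 2, 5))]
browse = [0, 12, 15, 290, 305, 1400, 1460]
```

A real pipeline layers many such filters (whitelisting, granularity sweeps, jitter-tolerant period estimation) precisely because legitimate software also beacons; a single statistic like this only narrows the candidate set.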
"BAYWATCH: Robust Beaconing Detection to Identify Infected Hosts in Large-Scale Enterprise Networks." Xin Hu, Jiyong Jang, M. Stoecklin, Ting Wang, D. Schales, Dhilung Kirat, J. Rao. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.50
As processor feature sizes shrink, mitigating faults in low-level system services has become a critical aspect of dependable system design. In this paper we introduce SuperGlue, an interface description language (IDL) and compiler for recovery from transient faults in a component-based operating system. SuperGlue generates code for interface-driven recovery that uses commodity hardware isolation, micro-rebooting, and interface-directed fault recovery to provide predictable and efficient recovery from faults that impact low-level system services. SuperGlue decreases the amount of recovery code system designers need to implement by an order of magnitude, and replaces it with declarative specifications. We evaluate SuperGlue with a fault injection campaign in low-level system components (e.g., memory mapping manager and scheduler). Additionally, we evaluate the performance of SuperGlue in a web-server application. Results show that SuperGlue improves system reliability with only a small performance degradation of 11.84%.
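The shape of interface-driven recovery can be sketched with a hypothetical stub: the stub records state-establishing interface calls and, on a fault, micro-reboots the component and replays them. SuperGlue generates such logic from IDL specifications for C components; everything below (class names, the `call` API, the fault model) is an invented stand-in:

```python
class Scheduler:
    """Stand-in low-level component that can suffer a transient fault."""
    def __init__(self):
        self.threads = set()

    def add_thread(self, tid):
        self.threads.add(tid)

class RecoveryStub:
    """Hypothetical generated proxy: tracks interface-level state so the
    component can be micro-rebooted and its state re-established."""
    def __init__(self, factory):
        self.factory = factory
        self.component = factory()
        self.replay_log = []          # interface calls that establish state

    def call(self, method, *args):
        try:
            result = getattr(self.component, method)(*args)
        except Exception:                        # transient fault detected
            self.component = self.factory()      # micro-reboot
            for m, a in self.replay_log:         # interface-directed recovery
                getattr(self.component, m)(*a)
            result = getattr(self.component, method)(*args)
        self.replay_log.append((method, args))
        return result
```

The point of generating this from an IDL is that the designer declares which calls are state-establishing instead of hand-writing per-component recovery code.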
"SuperGlue: IDL-Based, System-Level Fault Tolerance for Embedded Systems." Jiguo Song, Gedare Bloom, Gabriel Parmer. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.29
Once pervasive, the File Transfer Protocol (FTP) has been largely supplanted by HTTP, SCP, and BitTorrent for transferring data between hosts. Yet, in a comprehensive analysis of the FTP ecosystem as of 2015, we find that there are still more than 13 million FTP servers in the IPv4 address space, 1.1 million of which allow "anonymous" (public) access. These anonymous FTP servers leak sensitive information, such as tax documents and cryptographic secrets. More than 20,000 FTP servers allow public write access, which has facilitated malicious actors' use of free storage as well as malware deployment and click-fraud attacks. We further investigate real-world attacks by deploying eight FTP honeypots, shedding light on how attackers are abusing and exploiting vulnerable servers. We conclude with lessons and recommendations for securing FTP.
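A single-host version of the anonymous-access probe behind such a survey can be written with Python's standard `ftplib`, whose `login()` defaults to the user `anonymous`. This is a simplified sketch (the study's scanning pipeline is far more involved), and the hostnames below are placeholders:

```python
from ftplib import FTP, error_perm

def allows_anonymous(host, timeout=5, ftp_factory=FTP):
    """Return True if `host` accepts an anonymous FTP login.
    `ftp_factory` is injectable so the check can be exercised without a
    live network (and so a scanner can swap in a rate-limited client)."""
    try:
        ftp = ftp_factory(host, timeout=timeout)
        ftp.login()            # ftplib defaults to user 'anonymous'
        ftp.quit()
        return True
    except (error_perm, OSError):
        return False
```

An internet-wide measurement would drive this check from a port-scan hit list and record banners and listings rather than just a boolean; always scan only hosts you are authorized to probe.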
"FTP: The Forgotten Cloud." Drew Springall, Z. Durumeric, J. A. Halderman. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.52
Performance ticket handling is an expensive operation in highly virtualized cloud data centers where physical boxes host multiple virtual machines (VMs). A large body of tickets arises from resource usage warnings, e.g., CPU and RAM usage that exceeds predefined thresholds. The transient nature of CPU and RAM usage, as well as their strong correlation across time among co-located VMs, drastically increases the complexity of ticket management. Based on a large resource usage dataset collected from production data centers, amounting to 6K physical machines and more than 80K VMs, we first discover patterns of spatial dependency among co-located virtual resources. Leveraging our key findings, we develop an Active Ticket Managing (ATM) system that consists of (i) a novel time series prediction methodology and (ii) a proactive VM resizing policy for CPU and RAM resources for co-located VMs on a physical box, which aims to drastically reduce usage tickets. ATM exploits the spatial dependency across multiple resources of co-located VMs for usage prediction and proactive VM resizing. Evaluation results on traces of 6K physical boxes and a prototype of a MediaWiki system show that ATM is able to achieve excellent prediction accuracy for a large number of VM time series and significant usage ticket reduction, i.e., up to 60%, at low computational overhead.
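The predict-then-resize loop can be sketched with a deliberately crude forecaster. ATM's time-series model and resizing policy are more sophisticated and exploit cross-VM correlation; all numbers and thresholds below are illustrative:

```python
def predict_next(history, window=3):
    """Toy moving-average forecaster standing in for ATM's predictor."""
    return sum(history[-window:]) / min(window, len(history))

def plan_tickets(usage_series, capacity=100.0, threshold=0.9, headroom=1.2):
    """Compare reactive ticketing against proactive resizing: a ticket
    fires when usage exceeds threshold*cap; the proactive policy grows
    the VM's cap ahead of a predicted breach."""
    reactive = proactive = 0
    cap = capacity
    for t in range(3, len(usage_series)):
        use = usage_series[t]
        if use > threshold * capacity:        # static cap: ticket fires
            reactive += 1
        forecast = predict_next(usage_series[:t])
        if forecast > threshold * cap:        # resize before the breach
            cap = forecast * headroom
        if use > threshold * cap:             # ticket despite resizing
            proactive += 1
    return reactive, proactive

# A rising-then-falling CPU trace (percent of the original allocation):
load = [40, 45, 50, 70, 85, 95, 98, 97, 96, 60, 50]
reactive, proactive = plan_tickets(load)
```

Even this lagging moving average suppresses part of the ticket burst once the ramp-up becomes predictable, which is the effect ATM's more accurate predictor amplifies.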
"Managing Data Center Tickets: Prediction and Active Sizing." Ji Xue, R. Birke, L. Chen, E. Smirni. In: DSN 2016, June 2016. doi:10.1109/DSN.2016.38