
Latest publications: 2021 IEEE International Conference on Networking, Architecture and Storage (NAS)

Machine Reasoning — Improving AIOps for Intent Based Networks
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605403
Intent-based networking (IBN) captures and translates business intent into network policies that can be automated and applied consistently across the network. The end goal is for the network to continuously monitor and adjust its performance to assure the desired business outcome. For network operators (NetOps), IBN simplifies management, improves security, and provides data and telemetry for assurance and diagnostics.
Citations: 0
Implementing Flash-Cached Storage Systems Using Computational Storage Drive with Built-in Transparent Compression
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605383
Jingpeng Hao, Xubin Chen, Yifan Qiao, Yuyang Zhang, Tong Zhang
This paper studies utilizing the growing family of solid-state drives (SSDs) with built-in transparent compression to simplify the data structures of cache design. Such storage hardware allows user applications to intentionally under-utilize the logical storage space (i.e., sparse LBA utilization and sparse storage block content) without sacrificing physical storage space. Accordingly, this work proposes an index-less cache management approach that largely simplifies flash-based cache management by leveraging SSDs with built-in transparent compression. We carried out various experiments to evaluate the write amplification and read performance of the proposed cache management, and the results show that our index-less cache management achieves comparable or much better performance than conventional policies while consuming far less host computing and memory resources.
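The index-less idea above can be sketched in a few lines: if each cached item's logical block address is derived directly from a hash of its key, no in-memory index is needed, and a drive with transparent compression stores the untouched (all-zero) logical blocks at near-zero physical cost. The toy model below is our illustration, not the paper's implementation; the geometry, the key-beside-value collision check, and the dict standing in for the compressed address space are all invented for the sketch.

```python
BLOCK_SIZE = 4096
NUM_BLOCKS = 1 << 16  # 64 Ki logical blocks (hypothetical geometry)

class IndexlessCache:
    def __init__(self):
        # Logical address space, materialized lazily; absent entries model
        # all-zero blocks that the drive compresses away.
        self.blocks = {}

    def lba_of(self, key: str) -> int:
        return hash(key) % NUM_BLOCKS  # direct key -> LBA mapping, no index

    def put(self, key: str, value: bytes):
        assert len(value) <= BLOCK_SIZE
        self.blocks[self.lba_of(key)] = (key, value)

    def get(self, key: str):
        # The key is stored alongside the value so hash collisions are detected.
        entry = self.blocks.get(self.lba_of(key))
        if entry and entry[0] == key:
            return entry[1]
        return None

    def physical_blocks_used(self) -> int:
        return len(self.blocks)  # compressed footprint: only non-zero blocks
```

The point of the sketch is the asymmetry it exposes: the logical space is sized for the hash range, while the physical footprint tracks only the blocks actually written.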
Citations: 0
Minimizing the Number of Rules to Mitigate Link Congestion in SDN-based Datacenters
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605365
Rajorshi Biswas, Jie Wu
Link congestion due to regular traffic and link flooding attacks (LFA) are two major problems in datacenters. The recent growth of software defined networking (SDN) in datacenters enables dynamic and convenient configuration management, making it easy to reconfigure the network to mitigate LFAs. The reconfiguration that redirects some of the traffic can be done in two ways: choosing the shortest alternative path, or choosing the path with the minimum number of rule changes. SDN switches have a limited capacity for rules, and performance drops dramatically as the number of stored rules grows. Moreover, SDN switches take some time to apply changes, which interrupts flows. In this paper, we aim to minimize the number of rule changes while redirecting some of the traffic away from the congested link. We formulate two problems that minimize the number of rule changes needed to redirect traffic. The first is the basic problem: it considers one congested link and one flow to redirect. We provide a Dijkstra-based solution and a rule-merging-based solution. The second problem considers multiple flows, for which we propose solutions based on flow grouping and rule merging. We conduct extensive simulations and experiments in our datacenter to support our model.
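One way to read a "Dijkstra-based solution" for minimizing rule changes is as a shortest-path search whose edge weights count new rule installations: a link already on the flow's current path costs 0 (its rules can be reused), any other link costs 1, and the congested link is excluded. The sketch below is our own illustration under that assumption, not the paper's exact algorithm.

```python
import heapq

def min_rule_change_path(graph, src, dst, old_path_edges, congested_edge):
    """Dijkstra where the path cost is the number of new rules to install."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v in graph[u]:
            edge = frozenset((u, v))
            if edge == frozenset(congested_edge):
                continue  # redirect away from the congested link
            w = 0 if edge in old_path_edges else 1  # rule reuse is free
            if d + w < dist.get(v, float("inf")):
                dist[v] = d + w
                prev[v] = u
                heapq.heappush(heap, (d + w, v))
    # Reconstruct the chosen path from the predecessor map.
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return path[::-1], dist[dst]
```

For example, with current path A-B-D and a congested link (B, D), the search prefers A-B-C-D over a fully new detour because the A-B hop keeps its existing rule.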
Citations: 1
ICAP: Designing Inrush Current Aware Power Gating Switch for GPGPU
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605434
Hadi Zamani, Devashree Tripathy, A. Jahanshahi, Daniel Wong
The leakage energy of GPGPUs can be reduced by power gating idle logic or undervolting storage structures; however, system performance and reliability degrade due to the long wake-up time and the inrush current at activation. In this paper, we thoroughly analyze the realistic Break-Even Time (BET) and inrush current for various components of the GPGPU architecture, considering the recent design of the multi-modal Power Gating Switch (PGS). We then introduce a new PGS that addresses the drawbacks of current PGS designs. Our redesigned PGS is carefully tailored to minimize inrush current and BET. GPGPU-Sim simulation results for various applications show that, by incorporating the proposed PGS into GPGPU-Sim, we can save up to 82%, 38%, and 60% of leakage energy for register files, integer units, and floating-point units, respectively.
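The Break-Even Time mentioned above has a simple back-of-the-envelope form: a unit must stay gated at least long enough for the leakage energy saved to repay the energy spent entering sleep and waking up (including the inrush surge). The arithmetic below is our illustration; all numbers are invented, not taken from the paper.

```python
def break_even_time(e_sleep_entry, e_wakeup, p_leakage_saved):
    """Minimum idle time (seconds) for power gating to save net energy."""
    return (e_sleep_entry + e_wakeup) / p_leakage_saved

# Hypothetical figures: 2 nJ to enter sleep, 8 nJ inrush/wake-up cost,
# 50 uW of leakage power eliminated while gated.
bet = break_even_time(2e-9, 8e-9, 50e-6)
print(f"BET = {bet * 1e6:.0f} us")  # prints "BET = 200 us"
```

Gating a unit whose idle periods are shorter than this BET costs energy rather than saving it, which is why lowering both the inrush cost and the BET matters.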
Citations: 0
LocalityGuru: A PTX Analyzer for Extracting Thread Block-level Locality in GPGPUs
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605411
Devashree Tripathy, AmirAli Abdolrashidi, Quan Fan, Daniel Wong, M. Satpathy
Exploiting data locality in GPGPUs is critical for efficiently using the smaller data caches and handling the memory bottleneck problem. This paper proposes a thread block-centric locality analysis, which identifies the locality among thread blocks (TBs) in terms of the number of common data references. In LocalityGuru, we employ a detailed just-in-time (JIT) compilation analysis of the static memory accesses in the source code and derive the mapping between threads and data indices at kernel launch time. Our locality analysis technique can be employed at multiple granularities, such as threads, warps, and thread blocks in a GPU kernel. This information can be leveraged to make smarter decisions for locality-aware data partitioning, memory page data placement, cache management, and scheduling in single-GPU and multi-GPU systems. The results of the LocalityGuru PTX analyzer are then validated by comparing them with the locality graph obtained through profiling. Since the entire analysis is carried out by the compiler before kernel launch, it introduces no timing overhead to the kernel execution time.
Citations: 5
PLMC: A Predictable Tail Latency Mode Coordinator for Shared NVMe SSD with Multiple Hosts
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605470
Tanay Roy, Jit Gupta, K. Kant, Amitangshu Pal, D. Minturn, Arash Tavakkol
Solid-State Drives (SSDs) perform a complex set of management activities in the background, resulting in unpredictable delays and occasional extended access latencies. However, there is an increasing demand for "deterministic" access latency in a growing number of scenarios. This demand has prompted a new feature in the NVMe storage access protocol called Predictable Latency Mode (PLM), which provides a way to tighten tail latency in SSDs. This paper presents the first study of the PLM feature in a single-host environment and its extension to multi-host settings. We propose a PLM Coordinator (PLMC) that regulates access to the PLM of a shared SSD based on the hosts' traffic characteristics. Our simulation experiments show that the proposed PLMC achieves an 82% improvement in 99.99th-percentile tail latency compared to a bare SSD without the PLM feature. Moreover, with simple traffic prediction, the coordinator performs 93.2% better on 99th-percentile tail latency than without a coordinator.
Citations: 1
A Congestion Aware Multi-Path Label Switching in Data Centers Using Programmable Switches
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605422
Yeim-Kuan Chang, Hung-Yen Wang, Yu-Hsiang Lin
Equal-cost multi-path routing (ECMP) [4] achieves load balance in datacenter networks. Without the network's congestion status, however, ECMP may cause significant imbalance between paths. In this paper, we propose a congestion-aware routing protocol for Software Defined Networks (SDN) that provides better average link utilization. We follow the idea of In-band Network Telemetry (INT) to collect link congestion status in datacenter networks. Edge switches are responsible for detecting elephant flows by running a heavy hitter detection algorithm. When an edge switch reports an elephant flow to the controller, the controller uses the collected congestion status to find the least congested path. To make the switches forward packets more efficiently and reduce the number of rules in their forwarding tables, we adopt label switching. We develop a Programming Protocol-independent Packet Processors (P4) program to implement our routing scheme, which contains a heavy hitter detection algorithm. We further validate that our heavy hitter detection algorithm can run on the Banzai machine. We also write a Python controller that communicates with P4 switches through the P4 Runtime protocol. Our experimental results show that the probing process in CAMP minimizes the bandwidth overhead in datacenters. We use Mininet to construct fat-tree topologies, and the emulated software P4 switches run BMv2. A data mining workload is used to generate the traffic in our experiment. CAMP achieves better FCT than ECMP and HULA [6]. Moreover, the number of routing rules in CAMP remains the smallest as the network grows.
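A count-min sketch is one common way switches approximate per-flow byte counts in small fixed memory and flag elephant flows once a counter crosses a threshold. The sketch below is a generic illustration of that technique; the abstract does not say which heavy-hitter algorithm the paper's P4 pipeline uses, and the width, depth, and threshold here are invented.

```python
import hashlib

class CountMinSketch:
    def __init__(self, width=1024, depth=4):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _indexes(self, flow_id: str):
        # One independent hash per row, derived by salting with the row number.
        for row in range(self.depth):
            h = hashlib.blake2b(f"{row}:{flow_id}".encode(), digest_size=8)
            yield row, int.from_bytes(h.digest(), "big") % self.width

    def add(self, flow_id: str, nbytes: int):
        for row, col in self._indexes(flow_id):
            self.table[row][col] += nbytes

    def estimate(self, flow_id: str) -> int:
        # Taking the min over rows bounds the overestimate from collisions.
        return min(self.table[row][col] for row, col in self._indexes(flow_id))

THRESHOLD = 1_000_000  # hypothetical elephant-flow cutoff in bytes

def is_elephant(sketch: CountMinSketch, flow_id: str) -> bool:
    return sketch.estimate(flow_id) >= THRESHOLD
```

The structure maps naturally onto P4 register arrays, which is why variants of it are popular for in-switch heavy-hitter detection.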
Citations: 0
Reducing the Training Overhead of the HPC Compression Autoencoder via Dataset Proportioning
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605407
Tong Liu, Shakeel Alibhai, Jinzhen Wang, Qing Liu, Xubin He
As the storage overhead of high-performance computing (HPC) data reaches into the petabyte or even exabyte scale, it could be useful to find new methods of compressing such data. The compression autoencoder (CAE) has recently been proposed to compress HPC data with a very high compression ratio. However, this machine learning-based method suffers from the major drawback of lengthy training time. In this paper, we attempt to mitigate this problem by proposing a proportioning scheme to reduce the amount of data that is used for training relative to the amount of data to be compressed. We show that this method drastically reduces the training time without, in most cases, significantly increasing the error. We further explain how this scheme can even improve the accuracy of the CAE on certain datasets. Finally, we provide some guidance on how to determine a suitable proportion of the training dataset to use in order to train the CAE for a given dataset.
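The proportioning idea reduces to training on a fixed fraction of the data to be compressed rather than all of it. A minimal sketch of the sampling step, assuming a simple uniform random subset; the 10% ratio, the seeding, and the flat list standing in for HPC data chunks are illustrative choices, not the paper's tuned scheme.

```python
import random

def proportion_training_set(dataset, ratio=0.1, seed=42):
    """Return a reproducible subset with len ~= ratio * len(dataset)."""
    rng = random.Random(seed)  # fixed seed keeps runs comparable
    k = max(1, int(len(dataset) * ratio))
    return rng.sample(dataset, k)

full = list(range(100_000))            # stand-in for HPC floating-point chunks
train = proportion_training_set(full)  # ~10x less data to iterate per epoch
print(len(train))                      # prints 10000
```

Per-epoch training cost scales roughly with the sample size, so a 10% proportion cuts each epoch by about 10x while the CAE still sees data drawn from the same distribution it will later compress.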
Citations: 0
Making Storage and SSD Smarter and Faster
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605388
As digitalization continues and sensors proliferate across every part of our world, the amount of data grows exponentially, and demand for 3V (Volume, Velocity, Variety) storage systems accelerates. Cloud computing and big data applications have created a huge storage market that grows faster than ever before. The emergence of 5G cellular networks will make data grow even faster. Beyond the exponential growth of data volumes, big data applications require high performance, high reliability, security, high availability, and recoverability.
Citations: 0
EFLOG: A Full Stream-Logging Scheme with Erasure Coding in Cloud Storage Systems
Pub Date : 2021-10-01 DOI: 10.1109/nas51552.2021.9605428
Lei Sun, Q. Cao, Shucheng Wang, Changsheng Xie
Large-scale cloud storage systems use a logging mechanism to write data sequentially in an append-only manner. The write stream must first be appended and persisted into logging files, and then encoded with erasure coding (EC) in the underlying storage. This introduces significant overhead for small write operations. To solve this problem, we propose EFLOG, a full-streaming storage framework that combines logging with an inter-log EC mechanism. EFLOG evenly schedules front-end write streams across log files on each disk in an append-only manner. In the background, EFLOG identifies unprotected logged data and seals it into EC blocks. Afterwards, EFLOG encodes data concurrently with multiple threads and stores parity data on parity disks. Results of our trace-driven evaluation show that EFLOG can achieve up to 1.01 GB/s write throughput with RS(4, 2) codes built upon 6 SSD disks.
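The append-then-seal flow described above can be sketched as: writes accumulate in an append-only log, and once k data blocks are buffered they are sealed into an EC stripe. For brevity the sketch computes a single XOR parity block, a stand-in for the RS(4, 2) code in the paper, which produces two parity blocks over a Galois field; the block sizes and class layout are our invention.

```python
K = 4        # data blocks per stripe
BLOCK = 16   # toy block size in bytes

class Eflog:
    def __init__(self):
        self.log = bytearray()   # append-only front-end write stream
        self.stripes = []        # sealed (data_blocks, parity) tuples

    def append(self, data: bytes):
        self.log += data
        self._seal_full_stripes()

    def _seal_full_stripes(self):
        # Background sealing: whenever K blocks of unprotected data have
        # accumulated, encode them into a stripe and drop them from the log.
        while len(self.log) >= K * BLOCK:
            blocks = [bytes(self.log[i*BLOCK:(i+1)*BLOCK]) for i in range(K)]
            del self.log[:K * BLOCK]
            parity = bytearray(BLOCK)
            for b in blocks:
                for i, byte in enumerate(b):
                    parity[i] ^= byte
            self.stripes.append((blocks, bytes(parity)))

    def recover(self, stripe_idx: int, lost: int) -> bytes:
        # Rebuild one lost data block by XOR-ing parity with the survivors
        # (single-failure recovery; RS(4, 2) would tolerate two failures).
        blocks, parity = self.stripes[stripe_idx]
        out = bytearray(parity)
        for j, b in enumerate(blocks):
            if j != lost:
                for i, byte in enumerate(b):
                    out[i] ^= byte
        return bytes(out)
```

The design point the sketch shows is that small writes only touch the cheap append path; the EC encoding cost is paid later, in the background, a full stripe at a time.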
Citations: 0