The rising performance demands and increasing heterogeneity in cloud data centers are driving a paradigm shift in cloud infrastructure, from monolithic servers to a disaggregated architecture. In a multi-tenant cloud, users should be able to leverage trusted computing to protect their applications from untrusted parties. While Trusted Execution Environments (TEEs) are a well-known technique for realizing trusted computing on monolithic servers, existing TEE technologies cannot be adopted directly in a disaggregated architecture because of its distributed nature and the heterogeneity of its devices. To address these challenges, we propose trusted heterogeneous disaggregated architectures, which allow cloud users to construct virtual TEEs (vTEEs): TEE-based, secure, isolated environments assembled from any combination of disaggregated components.
{"title":"Trusted Heterogeneous Disaggregated Architectures","authors":"Atsushi Koshiba, Felix Gust, Julian Pritzi, Anjo Vahldiek-Oberwagner, Nuno Santos, Pramod Bhatotia","doi":"10.1145/3609510.3609812","DOIUrl":"https://doi.org/10.1145/3609510.3609812","url":null,"abstract":"The rising performance demands and increasing heterogeneity in cloud data centers lead to a paradigm shift in the cloud infrastructure, from monolithic servers to a disaggregated architecture. In a multi-tenant cloud, users should be able to leverage trusted computing to protect their applications from untrusted parties. While Trusted Execution Environments (TEEs) are a well-known technique to realize trusted computing on monolithic servers, we cannot adopt existing TEE technologies to the disaggregated architecture due to their distributed nature and heterogeneity of devices. To address these challenges, we propose trusted heterogeneous disaggregated architectures, which allows cloud users to construct virtual TEEs (vTEEs): TEE-based, secure, isolated environments assembled with any combination of disaggregated components.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"75 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130710884","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Mohamed Husain Noor Mohamed, Xiaoguang Wang, B. Ravindran
Linux eBPF allows a userspace application to execute code inside the Linux kernel without modifying the kernel code or inserting a kernel module. An in-kernel eBPF verifier pre-verifies any untrusted eBPF bytecode before it runs in kernel context; currently, users trust the verifier to block malicious bytecode from being executed. This paper first studies the potential security issues behind existing eBPF-related CVEs. We then present a generation-based eBPF fuzzer that generates syntactically and semantically valid eBPF programs to find bugs in the verifier component of the Linux kernel eBPF subsystem. The fuzzer extends the Linux Kernel Library (LKL) project to run multiple lightweight Linux instances simultaneously, feeding them automatically generated eBPF instruction sequences. Using this fuzzer, we outperform the bpf-fuzzer [10] from the iovisor GitHub repository in both fuzzing speed and the rate at which generated programs pass the eBPF verifier (i.e., valid generated code). We also found two existing ALU range-tracking bugs in an older Linux kernel (v5.10).
{"title":"Understanding the Security of Linux eBPF Subsystem","authors":"Mohamed Husain Noor Mohamed, Xiaoguang Wang, B. Ravindran","doi":"10.1145/3609510.3609822","DOIUrl":"https://doi.org/10.1145/3609510.3609822","url":null,"abstract":"Linux eBPF allows a userspace application to execute code inside the Linux kernel without modifying the kernel code or inserting a kernel module. An in-kernel eBPF verifier pre-verifies any untrusted eBPF bytecode before running it in kernel context. Currently, users trust the verifier to block malicious bytecode from being executed. This paper studied the potential security issues from existing eBPF-related CVEs. Next, we present a generation-based eBPF fuzzer that generates syntactically and semantically valid eBPF programs to find bugs in the verifier component of the Linux kernel eBPF subsystem. The fuzzer extends the Linux Kernel Library (LKL) project to run multiple lightweight Linux instances simultaneously, with inputs from the automatically generated eBPF instruction sequences. Using this fuzzer, we can outperform the bpf-fuzzer [10] from the iovisor GitHub repository regarding fuzzing speed and the success rate of passing the eBPF verifier (valid generated code). We also found two existing ALU range-tracking bugs that appeared in an older Linux kernel (v5.10).","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"86 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"133809539","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zoned Namespace (ZNS) provides the Zone Append primitive to boost the write performance of ZNS SSDs via intra-zone parallelism. However, making Zone Append effective for a RAID array of multiple ZNS SSDs is non-trivial, since Zone Append offloads address management to the ZNS SSDs and thus requires the host to explicitly track RAID stripes across multiple drives. We propose ZapRAID, a high-performance software RAID layer for ZNS SSDs that carefully uses Zone Append to achieve high write parallelism with lightweight stripe management. ZapRAID's core idea is a group-based data layout with coarse-grained ordering across multiple groups of stripes, so that it can manage stripes with small-size metadata on a per-group basis. Our prototype evaluation shows that ZapRAID achieves a 2.34x write throughput gain compared with using the Zone Write primitive.
{"title":"ZapRAID: Toward High-Performance RAID for ZNS SSDs via Zone Append","authors":"Qiuping Wang, P. Lee","doi":"10.1145/3609510.3609810","DOIUrl":"https://doi.org/10.1145/3609510.3609810","url":null,"abstract":"Zoned Namespace (ZNS) provides the Zone Append primitive to boost the write performance of ZNS SSDs via intrazone parallelism. However, making Zone Append effective for a RAID array of multiple ZNS SSDs is non-trivial, since Zone Append offloads address management to ZNS SSDs and requires hosts to dedicatedly manage RAID stripes across multiple drives. We propose ZapRAID, a high-performance software RAID layer for ZNS SSDs by carefully using Zone Append to achieve high write parallelism and lightweight stripe management. ZapRAID's core idea is a group-based data layout with coarse-grained ordering across multiple groups of stripes, such that it can use small-size metadata for stripe management on a per-group basis. Our prototype evaluation shows that ZapRAID achieves a 2.34x write throughput gain compared with using the Zone Write primitive.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"90 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126421583","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Zhida An, Ding Li, Yao Guo, Guijin Gao, Yuxin Ren, Ning Jia, Xinwei Hu
To achieve extremely high performance in HPC, many researchers have proposed customized operating systems tailored to HPC workload characteristics and emerging hardware. We therefore argue that HPC clusters will move away from a single-OS environment toward clusters running numerous heterogeneous OSes. However, existing HPC cluster management still assumes that all nodes are equipped with the same OS and fails to consider OS heterogeneity during job scheduling; this unawareness forfeits most of the performance benefits that specialized OSes provide. This paper quantitatively investigates the consequences of ignoring OS heterogeneity in current HPC cluster management and analyzes the performance trade-offs among heterogeneous OSes. Preliminary results on a variety of HPC OSes and applications confirm the performance penalty incurred by the existing cluster scheduler. We then propose a cluster scheduler prototype that incorporates OS heterogeneity into cluster configuration, resource monitoring, and job placement, and we present open challenges for future research on OS-heterogeneity-aware HPC clusters.
{"title":"Towards OS Heterogeneity Aware Cluster Management for HPC","authors":"Zhida An, Ding Li, Yao Guo, Guijin Gao, Yuxin Ren, Ning Jia, Xinwei Hu","doi":"10.1145/3609510.3609819","DOIUrl":"https://doi.org/10.1145/3609510.3609819","url":null,"abstract":"To achieve extremely high performance in HPC, many researchers have proposed customized operating systems that are tailored to HPC workload characteristics and emerging hardware. Hence, we argue that the HPC cluster will move away from the single OS environment to a cluster with numerous heterogeneous OSes. However, existing HPC cluster management still assumes that all nodes are equipped with the same OS and fails to consider OS heterogeneity during job scheduling. As a result, such unawareness loses most performance benefits provided by specialized OSes. This paper quantitatively investigates the problem of ignoring OS heterogeneity in the current HPC cluster management and analyzes performance trade-offs inside heterogeneous OSes. Preliminary results on a variety of HPC OSes and applications confirm the performance penalty of the existing cluster scheduler. We then propose a cluster scheduler prototype that incorporates OS heterogeneity into cluster configuration, resource monitoring, and job placement. We also present open challenges for future research on OS heterogeneity aware HPC clusters.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"131325449","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
We report on our initial effort to formally verify the seL4 Core Platform, an OS framework for the verified seL4 microkernel. This includes a formal specification of the seL4 Core Platform library, an automated proof of its functional correctness, and a verified mapping of the seL4 Core Platform's System Description to the CapDL formalism that describes seL4 access rights and enables verified system initialisation.
{"title":"First steps in verifying the seL4 Core Platform","authors":"Mathieu Paturel, Isitha Subasinghe, G. Heiser","doi":"10.1145/3609510.3609821","DOIUrl":"https://doi.org/10.1145/3609510.3609821","url":null,"abstract":"We report on our initial effort to formally verify the seL4 Core Platform, an OS framework for the verified seL4 microkernel. This includes a formal specification of the seL4 Core Platform library, an automated proof of its functional correctness, and a verified mapping of the seL4 Core Platform's System Description to the CapDL formalism that describes seL4 access rights and enables verified system initialisation.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121315357","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Malware classification is helpful for malware detection and analysis, and family classification of malware is a multi-class classification task. Many studies have exploited API call sequences as malware features. However, API call sequences do not explicitly express the control structures between API calls, which may represent malware behavior more accurately. In this paper, we propose a novel malware family classification method. We model each malware sample as a behavioral tree built from the API call sequence obtained by dynamic analysis; the tree describes the control structure between API calls. To reduce computational complexity, we capture a set of binary relations, called Heighted Behavior Relations, from the behavior tree as the sample's behavior features. TF-IDF is used to derive family-level behavior features from the per-sample features. The similarity vector of each sample is then constructed from its similarity to every family. For family classification, the similarity vectors are fed into a Naive Bayes classifier for training. Experiments on a dataset of 10,620 malware samples from 43 families show that the classification accuracy of our approach is 10% higher than that of classical methods based on API call sequences.
{"title":"Family Classification based on Tree Representations for Malware","authors":"Yang Xu, Zhuotai Chen","doi":"10.1145/3609510.3609818","DOIUrl":"https://doi.org/10.1145/3609510.3609818","url":null,"abstract":"Malware classification is helpful for malware detection and analysis. Family classification of malware is a multi-classification task. Many studies have exploited API call sequences as malware features. However, API call sequences do not explicitly express the information about control structures between API calls, which may be useful to represent malware behavior features more accurately. In this paper, we propose a novel malware familial classification method. We model each malware as a Behavioral Tree from API call sequence obtained from dynamic analysis, which describes the control structure between the API calls. To reduce the computational complexity, we capture a set of binary relations, called as Heighted Behavior Relations, from the behavior tree as behavior features of malware. The TF-IDF technology is used to calculate the family behavior features from the behavior features of malware. Then the similarity vector of each malware is constructed based on the similarity between it and all the families. For family classification purpose, the similarity vectors of malware are fed into Naive Bayes algorithm to train a classifier. The experiments on dataset with 10620 malware samples from 43 malware families show that the classification accuracy of our approach is 10% higher than that of the classical methods based on API call sequences.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"129 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121492622","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
With the emergence of machine learning, many commercial companies increasingly use machine learning inference systems as backend services to improve their products. Serverless computing is a modern paradigm that provides auto-scaling, event-driven services, making it well suited to domains such as video stream analysis, IoT serving, and machine learning applications; its flexible scaling is adept at handling the burstiness of ML workloads. However, despite this compatibility with ML inference tasks, the cost of serverless inference systems remains relatively high compared with traditional serving paradigms, primarily due to under-utilization of the CPU resources offered by serverless platforms. To tackle this challenge, we design and deploy a serverless inference serving system that incorporates batching and multi-processing mechanisms to improve cost efficiency. By applying a change-point detection algorithm to manage bursty workloads, it optimizes resource usage and achieves lower costs. We employ an Amazon EC2 server to handle request packaging and to run the core Bayesian Optimization algorithm without any prior information. The preliminary system, implemented on AWS Lambda, significantly reduces expenses, saving up to 62% compared with the original serverless inference system.
{"title":"Cost-Efficient Serverless Inference Serving with Joint Batching and Multi-Processing","authors":"Shen Cai, Zhi Zhou, Kongyange Zhao, Xu Chen","doi":"10.1145/3609510.3609816","DOIUrl":"https://doi.org/10.1145/3609510.3609816","url":null,"abstract":"With the emerging of machine learning, many commercial companies increasingly utilize machine learning inference systems as backend services to improve their products. Serverless computing is a modern paradigm that provides auto-scaling, event-driven services, making it particularly well-suited for various domains, including video stream analysis, IoT serving and machine learning applications. The flexible scaling feature of serverless computing is adept at handling the burstiness of ML workloads. However, despite its compatibility with ML inference tasks, the cost of serverless inference systems remain relatively high in comparison to traditional serving paradigms, primarily due to the under-utilization of CPU resources offered by serverless platforms. To tackle this challenge, we design and deploy a serverless inference serving system that incorporates batching and multi-process mechanisms to enhance cost efficiency. By applying a change-point detection algorithm to manage bursty workloads, it optimizes resource usage and achieves lower costs. We employ an Amazon EC2 server for handling request packaging and running the core Bayesian Optimization algorithm without any prior information. The preliminary system, implemented on AWS Lambda, can significantly reduce expenses and save up to 62% compared to the original serverless inference system.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129600020","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Georgios C. Androutsopoulos, Giorgos Kappes, S. Anastasiadis
There is increasing interest in quantifying and improving the isolation that containers provide to competing applications on multitenant hosts. As a first step toward addressing this need, we introduce several metrics that quantify an application's exposure to the source code of the kernel subsystems. Building on existing tracing tools, we develop a common framework and two toolchains that automate the extraction of the metrics. We experimentally compare the tracing accuracy of the toolchains by computing the metrics across different workloads, and demonstrate the importance of separating application execution from unrelated system activity.
{"title":"Quantifying the Security Profile of Linux Applications","authors":"Georgios C. Androutsopoulos, Giorgos Kappes, S. Anastasiadis","doi":"10.1145/3609510.3609814","DOIUrl":"https://doi.org/10.1145/3609510.3609814","url":null,"abstract":"There is an increasing interest to quantify and improve the isolation provided by containers to competing applications on multitenant hosts. As a first step to address this need, we introduce several metrics that quantify the exposure of the applications to the source code of the kernel subsystems. Based on existing tracing tools, we develop a common framework and build two toolchains that automate the extraction of the metrics. We experimentally compare the tracing accuracy of the toolchains by calculating the metrics across different workloads and demonstrate the importance of separating the application execution from unrelated system activity.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"26 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126787270","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Containers, which have evolved primarily on Linux, have become a significant trend in the cloud thanks to their lightweight virtualization and growing, convenient ecosystem. However, the laxer isolation of containerization also exposes attack surfaces in the underlying Linux kernel. Unfortunately, combining containers with other virtualization techniques for sandboxing, such as traditional VMs or interposition by an application kernel, can spoil their lightweight and scalable nature. In this study, we propose a different approach to lightweight sandboxing, one that exploits the fact that attackers mostly assume containers run on Linux. It averts major vulnerability exploits derived from Linux by transplanting Linux containers onto the FreeBSD kernel. Furthermore, it fortifies isolation by transparently applying "Capsicum," a capability-based sandbox mechanism native to FreeBSD and nonstandard on Linux, to the transplanted containers. This paper analyzes the vulnerabilities Linux containers face, identifies the technical issues in transplanting Linux containers onto FreeBSD, and designs a mechanism that transparently applies the Capsicum sandbox to Linux applications, exploring the feasibility of our approach.
{"title":"Reducing Attack Surface with Container Transplantation for Lightweight Sandboxing","authors":"Yuki Nakata, Shintaro Suzuki, Katsuya Matsubara","doi":"10.1145/3609510.3609820","DOIUrl":"https://doi.org/10.1145/3609510.3609820","url":null,"abstract":"Containers, which have evolved in Linux primarily, have become a significant trend in the cloud due to their lightweight virtualization and growing convenient ecosystem. However, the laxer isolation of containerization also introduces attack surfaces on the underlying Linux kernel. Unfortunately, combining other virtualizations, such as the traditional VM and interposition by application kernel, for sandboxing could spoil the lightweight and scalable nature of the containers. In this study, we propose another approach to lightweight sandboxing that focuses on the fact that such attackers have mostly assumed containers rely on Linux. It can avert major vulnerability exploits derived from Linux by transplanting Linux containers onto the FreeBSD kernel. Furthermore, it can fortify the isolation by transparently applying \"Capsicum,\" a unique sandbox mechanism that is nonstandard in Linux, to the transplanted containers. This paper analyzes vulnerabilities faced by Linux containers, identifies technical issues in transplanting Linux containers onto FreeBSD, and designs a mechanism to transparently apply the Capsicum sandbox to Linux applications to explore the feasibility of our approach.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"5 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"134628060","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
W. Baek, Jonghyun Bae, Donghyun Lee, Hyun-Cheol Bae, Yeonhong Park, Jae W. Lee
Today's deep neural network (DNN) training pipelines utilize hardware resources holistically: host CPUs and storage devices preprocess the input data, while accelerators such as GPUs compute gradients. As accelerator performance scales rapidly, the frontend data preparation stages are becoming a new bottleneck that yields suboptimal training throughput. Since the pipeline bottleneck can vary with the hardware configuration, DNN model, and dataset, overprovisioning data preparation resources such as CPU cores and disk bandwidth is not a cost-effective solution. Instead, we make a case for leveraging multiple data formats, possibly with opposing resource-utilization characteristics, to balance the training pipeline. This idea is realized in Liquid, a new system for building efficient training pipelines with multi-format datasets. Our evaluation on three distinct execution environments demonstrates that Liquid achieves up to 3.05x and 1.54x higher data preparation throughput on the Cityscapes/CityPersons (PNG) and ImageNet (JPEG) datasets, respectively, over a baseline single-format pipeline. This translates into up to 2.02x and 1.25x higher end-to-end training throughput (geometric mean) with no accuracy drop.
{"title":"Liquid: Mix-and-Match Multiple Image Formats to Balance DNN Training Pipeline","authors":"W. Baek, Jonghyun Bae, Donghyun Lee, Hyun-Cheol Bae, Yeonhong Park, Jae W. Lee","doi":"10.1145/3609510.3609811","DOIUrl":"https://doi.org/10.1145/3609510.3609811","url":null,"abstract":"Today's deep neural network (DNN) training pipeline utilizes hardware resources holistically, including host CPUs and storage devices for preprocessing the input data and accelerators like GPUs for computing gradients. As the performance of the accelerator scales rapidly, the frontend data preparation stages are becoming a new performance bottleneck to yield suboptimal training throughput. Since the bottleneck in the pipeline may vary depending on hardware configurations, DNN models, and datasets, overprovisioning hardware resources for data preparation such as CPU cores and disk bandwidth is not a cost-effective solution. Instead, we make a case for leveraging multiple data formats, possibly with opposing characteristics in resource utilization, to balance the training pipeline. This idea is realized by Liquid, a new system for building an efficient training pipeline with multi-format datasets. Our evaluation on three distinct execution environments demonstrates that Liquid achieves up to 3.05x and 1.54x higher data preparation throughput on Cityscapes/CityPersons (PNG) and ImageNet (JPEG) datasets, respectively, over the baseline single-format pipeline. This leads up to 2.02x and 1.25x higher end-to-end geomean training throughput with no accuracy drop.","PeriodicalId":149629,"journal":{"name":"Proceedings of the 14th ACM SIGOPS Asia-Pacific Workshop on Systems","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130168735","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}