Proceedings of the 16th ACM International Conference on Systems and Storage: Latest Papers

Anomaly Detection on IBM Z Mainframes: Performance Analysis and More

Proceedings of the 16th ACM International Conference on Systems and Storage

Pub Date : 2023-06-05 DOI: 10.1145/3579370.3594770

Erik Altman, Benjamin Segal

Anomalous events can signal a variety of problems in any system. Robust, fast anomaly detection is therefore important so that issues can be fixed before they cascade into larger problems. In this paper we focus on IBM Z mainframes, although most of the problems addressed and techniques used are broadly applicable. For example, anomalies can signal issues such as disk malfunctions, slow or unresponsive modules, crashes and latent bugs, lock contention, excessive retries, and the need to allocate more resources to reduce contention. Although there are specific techniques for addressing individual issues, anomaly detection is valuable for its broad-spectrum utility and its ability to identify combinations of problems for which no specific approach may have been implemented. In addition, anomaly detection serves as a backstop: truly anomalous events suggest that normal mechanisms did not work. Our input for detecting anomalies is low-level, summarized information available in time series to the z/OS operating system. Although such information lacks some high-level context, it provides operating-system-level awareness and applies universally to any z/OS system and any code running on it. The data is also quite rich, with 100 to 100,000 metrics per sample depending on how "metric" is defined. As might be expected, the data contains metrics such as CPU utilization, execution priorities, internal locking behavior, and bytes read and written by an executing process. It also contains higher-level information such as the names of executing processes or transaction identifiers from online transaction processing facilities. Names are useful not only in detecting anomalies, but also in conveying context to users trying to isolate and fix problems. Our techniques build on KL divergence [21] and learn continuously without supervision and with low overhead. Continuous learning is important: the first instance of an aberrant behavior is an anomaly, but the 10th instance probably is not. This point also illustrates the utility of anomaly detection in pinpointing root cause: early detection is essential, and broad-spectrum anomaly detection provides excellent capability to do just that. This paper outlines these techniques and demonstrates their efficacy in detecting and resolving key problems.
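
The abstract says only that the techniques build on KL divergence and learn continuously, so the following is a minimal illustrative sketch of that general idea, not the authors' method. It scores each sample window by its KL divergence from a baseline distribution, then folds the window into the baseline with exponential decay so that repeated behavior gradually stops looking anomalous. The class name `DriftScorer`, the `decay` and `smoothing` parameters, and the `cpu_low`/`cpu_high` categories are all hypothetical.

```python
import math


def normalize(counts, keys, smoothing=1.0):
    """Turn raw counts into a smoothed probability distribution over `keys`."""
    total = sum(counts.get(k, 0.0) for k in keys) + smoothing * len(keys)
    return {k: (counts.get(k, 0.0) + smoothing) / total for k in keys}


def kl_divergence(p, q):
    """KL(P || Q) in nats; p and q are dicts over the same keys, with q[k] > 0."""
    return sum(pk * math.log(pk / q[k]) for k, pk in p.items() if pk > 0.0)


class DriftScorer:
    """Hypothetical scorer: compares each window to a continuously updated baseline."""

    def __init__(self, decay=0.8, smoothing=1.0):
        self.decay = decay          # how quickly old observations fade (assumption)
        self.smoothing = smoothing  # additive smoothing to avoid zero probabilities
        self.baseline = {}          # decayed running counts per category

    def score_and_learn(self, window_counts):
        """Return the window's KL score against the baseline, then update the baseline."""
        keys = set(self.baseline) | set(window_counts)
        if self.baseline:
            p = normalize(window_counts, keys, self.smoothing)
            q = normalize(self.baseline, keys, self.smoothing)
            score = kl_divergence(p, q)
        else:
            score = 0.0  # nothing learned yet, so nothing to compare against
        # Unsupervised update: decay the old counts and add the new window.
        self.baseline = {
            k: self.decay * self.baseline.get(k, 0.0) + window_counts.get(k, 0.0)
            for k in keys
        }
        return score


# Usage: five "normal" windows, then the same aberrant window ten times in a row.
scorer = DriftScorer(decay=0.8)
for _ in range(5):
    scorer.score_and_learn({"cpu_low": 90, "cpu_high": 10})
scores = [scorer.score_and_learn({"cpu_low": 10, "cpu_high": 90}) for _ in range(10)]
print([round(s, 3) for s in scores])
```

In this sketch the first aberrant window receives the highest score and each repetition scores lower, which mirrors the abstract's point that the first instance of an aberrant behavior is an anomaly while the 10th probably is not.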

Citations: 0

F3: Serving Files Efficiently in Serverless Computing

Pub Date : 2023-06-05 DOI: 10.1145/3579370.3594771

Alex Merenstein, Vasily Tarasov, Ali Anwar, S. Guthridge, E. Zadok

Serverless platforms offer on-demand computation and represent a significant shift from previous platforms that typically required resources to be pre-allocated (e.g., virtual machines). As serverless platforms have evolved, they have become suitable for a much wider range of applications than their original use cases. However, storage access remains a pain point that holds serverless back from becoming a completely generic computation platform. Existing storage for serverless typically uses an object interface. Although object APIs are simple to use, they lack the richness, versatility, and performance of file-based APIs. Additionally, a large body of existing applications relies on file-based interfaces. The lack of file-based storage options prevents these applications from being ported to serverless environments. In this paper, we present F3, a file system that offers features to improve file access in serverless platforms: (1) efficient handling of ephemeral data, by placing ephemeral and non-ephemeral data on storage that sits at different points along the durability-performance tradeoff continuum, (2) locality-aware data scheduling, and (3) efficient reading while writing. We modified OpenWhisk to support attaching file-based storage and to leverage F3's features using hints. Our prototype evaluation of F3 shows performance improvements of 1.5× to 6.5× over existing storage systems.

Citations: 0

BOOSTER: Rethinking the erase operation of low-latency SSDs to achieve high throughput and less long latency

Pub Date : 2023-06-05 DOI: 10.1145/3579370.3594774

Takumi Fujimori, Shuou Nomura

Erase performance in low-latency NAND flash memory has been largely disregarded, which relatively increases the impact of garbage collection interference and poses a challenge when considering such memory for storage class memory applications. Conventional erase-related techniques do not sufficiently conceal the impact of erase operations and incur a performance trade-off between throughput and latency. In this paper, we propose a boost and stop erase (BOOSTER) technique consisting of a multi-block erase technique and an adaptive erase suspension technique. The proposed technique improves both the throughput and the latency of low-latency NAND flash-based SSDs. Our experiments show 1.15× higher throughput and 27.2% lower latency than conventional techniques on a small random workload. The proposed technique also passes the certification test at 1.8× to 4.9× higher load in Aerospike Certification Tool evaluations.

Citations: 0

Proceedings of the 16th ACM International Conference on Systems and Storage

Pub Date : 1900-01-01 DOI: 10.1145/3579370

Citations: 0
