2014 9th Parallel Data Storage Workshop: Latest Papers

Automatic Generation of I/O Kernels for HPC Applications
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.6
Babak Behzad, Hoang-Vu Dang, Farah Hariri, Weizhe Zhang, M. Snir
The study of the I/O performance of a parallel application can be facilitated by the use of an I/O kernel -- a program that generates the same I/O calls as the original application, but can be executed much faster. Such I/O kernels are especially important when the programs under study are proprietary or classified, and only available in binary form. In this paper, we show how to automatically create such an I/O kernel by executing the target application with an instrumented I/O library, then "compressing" the resulting I/O traces into a compact C program that generates those traces.
Citations: 11
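As a rough illustration of the trace "compression" idea in the I/O kernel abstract above, the sketch below folds runs of same-sized calls with a constant offset stride into loop records and renders each record as a C-like loop. It is not the authors' algorithm: the trace record layout `(op, offset, size)` and the names `compress_trace` and `emit_c` are assumptions made for this example, and the emitted text uses POSIX `pread`/`pwrite` only as a stand-in for whatever I/O library the real tool instruments.

```python
from typing import List, Tuple

Call = Tuple[str, int, int]            # (op, offset, size): hypothetical trace record
Loop = Tuple[str, int, int, int, int]  # (op, start, size, stride, count)


def compress_trace(trace: List[Call]) -> List[Loop]:
    """Fold runs of calls with the same op and size and a constant offset stride."""
    loops: List[Loop] = []
    i = 0
    while i < len(trace):
        op, start, size = trace[i]
        count, stride = 1, 0
        j = i + 1
        while j < len(trace) and trace[j][0] == op and trace[j][2] == size:
            step = trace[j][1] - trace[j - 1][1]
            if count == 1:
                stride = step          # the first pair of calls fixes the stride
            elif step != stride:
                break                  # stride changed: start a new loop record
            count += 1
            j += 1
        loops.append((op, start, size, stride, count))
        i = j
    return loops


def emit_c(loops: List[Loop]) -> str:
    """Render loop records as C-like source text (illustrative; not compiled here)."""
    lines = []
    for op, start, size, stride, count in loops:
        call = "pwrite" if op == "write" else "pread"
        lines.append(f"for (long i = 0; i < {count}; i++)")
        lines.append(f"    {call}(fd, buf, {size}, {start}L + i * {stride}L);")
    return "\n".join(lines)


# Hypothetical trace: three strided writes followed by one read.
demo_trace = [("write", 0, 4096), ("write", 4096, 4096),
              ("write", 8192, 4096), ("read", 0, 512)]
print(emit_c(compress_trace(demo_trace)))
```

A real trace compressor would also have to handle nested loops, multiple files, and per-rank differences; this sketch only shows why regular access patterns shrink to a few lines of generated code.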
Using Property Graphs for Rich Metadata Management in HPC Systems
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.11
Dong Dai, R. Ross, P. Carns, D. Kimpe, Yong Chen
HPC platforms are capable of generating huge amounts of metadata about different entities including jobs, users, and files. Simple metadata, which describes the attributes of these entities (e.g., file size, name, and permissions mode), has been well recorded and used in current systems. However, only a limited amount of rich metadata, which records not only the attributes of entities but also the relationships between them, is captured in current HPC systems. Rich metadata may include information from many sources, including users and applications, and must be integrated into a unified framework. Collecting, integrating, processing, and querying such a large volume of metadata pose considerable challenges for HPC systems. In this paper, we propose a rich metadata management approach that unifies metadata into one generic property graph. We argue that this approach supports not only simple metadata operations such as directory traversal and permission validation but also rich metadata operations such as provenance query and security auditing. The property graph approach provides an extensible method to store diverse metadata and presents an opportunity to leverage rapidly evolving graph storage and processing techniques.
Citations: 24
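To make the property graph idea in the abstract above concrete, here is a minimal in-memory sketch in which users, jobs, and files are vertices with key/value properties and relationships are labeled edges. The class `PropertyGraph`, the edge labels `ran` and `wrote`, and the vertex identifiers are all hypothetical choices for this example, not the paper's schema or storage engine.

```python
from collections import defaultdict
from typing import Any, Dict, List, Tuple


class PropertyGraph:
    """Minimal property graph: vertices and edges both carry key/value properties."""

    def __init__(self) -> None:
        self.vertices: Dict[str, Dict[str, Any]] = {}
        self.out_edges: Dict[str, List[Tuple[str, str, Dict[str, Any]]]] = defaultdict(list)

    def add_vertex(self, vid: str, **props: Any) -> None:
        self.vertices[vid] = props

    def add_edge(self, src: str, label: str, dst: str, **props: Any) -> None:
        self.out_edges[src].append((label, dst, props))

    def neighbors(self, src: str, label: str) -> List[str]:
        """Follow outgoing edges with the given label."""
        return [dst for lbl, dst, _ in self.out_edges.get(src, []) if lbl == label]


def files_written_by_user(g: PropertyGraph, user: str) -> List[str]:
    """Provenance-style query: user -> jobs they ran -> files those jobs wrote."""
    files: List[str] = []
    for job in g.neighbors(user, "ran"):
        files.extend(g.neighbors(job, "wrote"))
    return files


# Hypothetical metadata for one user, one job, and one output file.
g = PropertyGraph()
g.add_vertex("user:alice", type="user")
g.add_vertex("job:42", type="job", nodes=128)
g.add_vertex("file:/out/a.h5", type="file", size=1024, mode=0o644)
g.add_edge("user:alice", "ran", "job:42")
g.add_edge("job:42", "wrote", "file:/out/a.h5", bytes=1024)
print(files_written_by_user(g, "user:alice"))
```

Simple metadata lookups (for example, `g.vertices["file:/out/a.h5"]["size"]`) and relationship queries share one structure here, which is the unification the abstract argues for.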
Feign: In-Silico Laboratory for Researching I/O Strategies
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.9
Jakob Lüttgau, J. Kunkel
Evaluating the I/O performance of an application across different systems is a daunting task because it requires preparation of the software dependencies and required input data. Feign aims to be an extensible trace replay solution for parallel applications that supports arbitrary software and library layers. The tool abstracts and streamlines the replay process while allowing plug-ins to provide, manipulate and interpret trace data. As a result, the application's behavior can be evaluated without potentially proprietary or confidential software and input data. Even more interesting is the potential of Feign as a virtual laboratory for I/O research: by manipulating trace data, experiments can be conducted; for example, it becomes possible to evaluate the benefit of optimization strategies. Since a plug-in could determine "future" activities, this enables us to develop optimal strategies as baselines for any run-time heuristics, and also eases testing of a developed strategy on many applications without modifying them. The paper proposes and evaluates a workflow to automatically apply optimization candidates to application traces and approximate potential performance gains. By using Feign's reporting facilities, an automatic optimization engine can then independently conduct experiments by feeding traces and strategies to compare the results.
Citations: 0
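The Feign abstract above describes plug-ins that manipulate trace data before replay so that an optimization candidate can be tried without touching the application. The sketch below shows that pattern in miniature; it does not use Feign's actual plug-in interface, and the event layout `(op, offset, size)` and the functions `coalesce_writes` and `run_plugins` are hypothetical names for this example.

```python
from typing import Callable, List, Tuple

Event = Tuple[str, int, int]                     # (op, offset, size): hypothetical trace event
Plugin = Callable[[List[Event]], List[Event]]    # a plug-in maps a trace to a new trace


def coalesce_writes(trace: List[Event]) -> List[Event]:
    """Optimization candidate: merge a write that starts where the previous write ended."""
    out: List[Event] = []
    for op, offset, size in trace:
        if (out and op == "write" and out[-1][0] == "write"
                and out[-1][1] + out[-1][2] == offset):
            prev_op, prev_off, prev_size = out[-1]
            out[-1] = (prev_op, prev_off, prev_size + size)
        else:
            out.append((op, offset, size))
    return out


def run_plugins(trace: List[Event], plugins: List[Plugin]) -> List[Event]:
    """Pass the trace through each plug-in in order before it would be replayed."""
    for plugin in plugins:
        trace = plugin(trace)
    return trace


# Hypothetical trace: two contiguous writes, a read, then another write.
original = [("write", 0, 100), ("write", 100, 100), ("read", 0, 50), ("write", 200, 100)]
optimized = run_plugins(original, [coalesce_writes])
print(len(original), "calls before,", len(optimized), "calls after")
```

Because the plug-in sees the whole trace, it can use knowledge of "future" events that a run-time heuristic would not have, which is how the abstract motivates using such replays as an optimal baseline.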
VSFS: A Searchable Distributed File System
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.10
Lei Xu, Ziling Huang, Hong Jiang, Lei Tian, D. Swanson
In this paper, we propose a Versatile Searchable File System, VSFS, which builds a POSIX-compatible namespace using a novel Namespace-based File Query Language (NFQL). This enables analytics applications to utilize VSFS's high-performance file-search service without changing their data model. VSFS's versatile file-indexing mechanism is designed to offer great flexibility for applications to control indices to satisfy analytics needs. The evaluations driven by two real-world analytics applications demonstrate VSFS's high scalability and powerful data-filtering functionality.
Citations: 14
Evaluating Lustre's Metadata Server on a Multi-Socket Platform
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.5
Konstantinos Chasapis, M. F. Dolz, Michael Kuhn, T. Ludwig
With the emergence of multi-core and multi-socket non-uniform memory access (NUMA) platforms in recent years, new software challenges have arisen to use them efficiently. In the field of high performance computing (HPC), parallel programming has always been the key factor to improve application performance. However, the implications of parallel architectures for system software have been overlooked until recently. In this work, we examine the implications of such platforms for the performance scalability of the Lustre parallel distributed file system's metadata server (MDS). We run our experiments on a four-socket NUMA platform that has 48 cores. We leverage the mdtest benchmark to generate appropriate metadata workloads and include configurations with varying numbers of active cores and mount points. Additionally, we compare Lustre's metadata scalability with the local file systems ext4 and XFS. The results demonstrate that Lustre's metadata performance is limited to a single socket and decreases when more sockets are used. We also observe that the MDS's back-end device is not a limiting factor regarding the performance.
Citations: 5
Alleviating I/O Interference via Caching and Rate-Controlled Prefetching without Degrading Migration Performance
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.8
Morgan Stuart, Tao Lu, Xubin He
The process of migrating a virtual machine and its virtual storage can greatly degrade the performance of other guests and applications running on the same host, including the migrating machine itself. Through experimental evaluation, we investigate the I/O performance degradation imposed by storage migration on co-located machines. We examine naive approaches for mitigating this interference by adjusting host system settings and migration parameters. While these approaches are effective in some contexts, our analysis demonstrates that performing a migration using these I/O-constraining techniques will increase migration latency and limit its ability to converge. Therefore, we present a design and analysis of Storage Migration Offloading, a migration method that reduces I/O interference, maintains lower migration latency, and converges under higher dirty rates. Storage Migration Offloading utilizes a buffer store populated during migration using a dynamic cache policy and rate-controlled prefetching. Data is transferred to the destination host from both the buffer and primary disks in a way that minimizes interference on the primary disk while attempting to maintain the desired migration speed.
Citations: 1
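The storage migration abstract above mentions rate-controlled prefetching but does not say how the rate is enforced. One generic way to cap background read traffic is a token bucket, sketched below as an assumption rather than as the paper's mechanism; the class `TokenBucket`, its parameters, and the explicit `now` argument (used here so the behavior is deterministic) are all hypothetical.

```python
class TokenBucket:
    """Token bucket that caps prefetch traffic at `rate` bytes per second."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate        # refill rate in bytes per second
        self.burst = burst      # maximum number of bytes that can accumulate
        self.tokens = burst     # start with a full bucket
        self.last = 0.0         # time of the last refill, in seconds

    def allow(self, nbytes: int, now: float) -> bool:
        """Return True and consume tokens if `nbytes` may be prefetched at time `now`."""
        elapsed = now - self.last
        self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if nbytes <= self.tokens:
            self.tokens -= nbytes
            return True
        return False            # caller should defer this prefetch


# Hypothetical settings: 100 bytes/s sustained, bursts of up to 200 bytes.
bucket = TokenBucket(rate=100.0, burst=200.0)
for t, size in [(0.0, 200), (0.25, 50), (0.5, 50)]:
    print(t, size, bucket.allow(size, now=t))
```

A prefetcher built this way would skip or delay reads from the primary disk whenever `allow` returns False, trading some migration speed for lower interference with co-located guests.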
HPIS3: Towards a High-Performance Simulator for Hybrid Parallel I/O and Storage Systems
Pub Date : 2014-11-16 DOI: 10.1109/PDSW.2014.12
Bo Feng, Ning Liu, Shuibing He, Xian-He Sun
The performance gap between processors and storage devices has continuously increased during the past few decades. The gap has been further exacerbated recently because applications are becoming more data-intensive in both industry and academia. Traditional storage devices, such as hard disk drives (HDD), fail to keep up with the pace of this growth. A known solution is to use solid state drives (SSD) as fast storage. Due to the high cost of SSDs, data and supercomputing centers usually adopt a hybrid storage system, which consists of a combination of HDD and SSD I/O servers. However, hybrid I/O and storage systems have increased complexity, which often leaves SSDs underutilized. The configuration and utilization of HDD/SSD hybrid systems remain a lasting problem. In this study, we propose a high-performance hybrid parallel I/O and storage simulator, HPIS3. As a co-design tool, HPIS3 is capable of simulating a variety of parallel storage systems, especially under hybrid scenarios. The experimental results show that the lowest error rate is 2%, and the average is 11.98%.
Citations: 3
BatchFS: Scaling the File System Control Plane with Client-Funded Metadata Servers
Pub Date : 2014-11-01 DOI: 10.1109/PDSW.2014.7
Qing Zheng, Kai Ren, Garth A. Gibson
Parallel file systems are often characterized by a layered architecture that decouples metadata management from I/O operations, allowing file systems to facilitate fast concurrent access to file contents. However, metadata-intensive workloads are still likely to bottleneck at the file system control plane due to namespace synchronization, which taxes application performance through lock contention on directories, transaction serialization, and RPC overheads. In this paper, we propose a client-driven file system metadata architecture, BatchFS, that is optimized for noninteractive, or batch, workloads. To avoid metadata bottlenecks, BatchFS features a relaxed consistency model marked by lazy namespace synchronization and optimistic metadata verification. Capable of executing namespace operations on client-provisioned resources without contacting any metadata server, BatchFS clients are able to delay namespace synchronization until synchronization is really needed. Our goal in this vision paper is to handle these delayed operations securely and efficiently with metadata verification and bulk insertion. Preliminary experiments demonstrate that our client-funded metadata architecture outperforms a traditional synchronous file system by orders of magnitude.
Citations: 37
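The BatchFS abstract above combines lazy namespace synchronization with verification and bulk insertion. The toy model below illustrates only that control flow: a client records namespace operations locally with no server contact, then submits the whole batch in one round trip, where each operation is checked before being applied. The classes `Server` and `Client`, the operation tuple `(kind, path)`, and the two verification rules are assumptions for this sketch, not the BatchFS design, and the model does not distinguish files from directories.

```python
from typing import List, Set, Tuple

Op = Tuple[str, str]   # (kind, path) with kind "mkdir" or "create": hypothetical op record


class Server:
    """Stand-in for the shared namespace held by a metadata server."""

    def __init__(self) -> None:
        self.namespace: Set[str] = {"/"}
        self.rpcs = 0                      # counts round trips, for illustration

    def bulk_insert(self, ops: List[Op]) -> List[Op]:
        """Verify and apply a whole batch in one round trip; return rejected ops."""
        self.rpcs += 1
        rejected: List[Op] = []
        for kind, path in ops:
            parent = path.rsplit("/", 1)[0] or "/"
            if path in self.namespace or parent not in self.namespace:
                rejected.append((kind, path))   # name conflict or missing parent
            else:
                self.namespace.add(path)
        return rejected


class Client:
    """Buffers namespace operations locally and synchronizes lazily."""

    def __init__(self, server: Server) -> None:
        self.server = server
        self.pending: List[Op] = []

    def mkdir(self, path: str) -> None:
        self.pending.append(("mkdir", path))    # no server contact here

    def create(self, path: str) -> None:
        self.pending.append(("create", path))   # no server contact here

    def sync(self) -> List[Op]:
        """Push all buffered operations at once; return those the server rejected."""
        rejected = self.server.bulk_insert(self.pending)
        self.pending = []
        return rejected


# Hypothetical batch job: create a directory and two files, then sync once.
server = Server()
client = Client(server)
client.mkdir("/out")
client.create("/out/a")
client.create("/out/b")
print(server.rpcs, client.sync(), server.rpcs)
```

In a synchronous design each `mkdir` or `create` would cost its own round trip; here three operations cost one, which is the kind of saving the abstract attributes to delayed synchronization.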