
2008 3rd Petascale Data Storage Workshop: Latest Publications

Revisiting the metadata architecture of parallel file systems
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811892
N. Ali, A. Devulapalli, D. Dalessandro, P. Wyckoff, P. Sadayappan
As the types of problems we solve in high-performance computing and other areas become more complex, the amount of data generated and used is growing at a rapid rate. Today many terabytes of data are common; tomorrow petabytes of data will be the norm. Much work has been put into increasing capacity and I/O performance for large-scale storage systems. However, one often ignored area is metadata management. Metadata can have a significant impact on the performance of a system. Past approaches have moved metadata activities to a separate server in order to avoid potential interference with data operations. However, with the advent of object-based storage technology, there is a compelling argument to re-couple metadata and data. In this paper we present two metadata management schemes, both of which remove the need for a separate metadata server and replace it with object-based storage.
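The paper itself gives no code, but as a rough illustration of what re-coupling metadata with data on object-based storage might look like, the sketch below keeps a file's metadata as attributes on its first data object instead of on a dedicated metadata server. The ObjectStore class and its set_attr/get_attr calls are invented stand-ins for an OSD interface, not an API from the paper.

```python
# Hypothetical sketch: file metadata stored as attributes of the file's
# first data object, so no separate metadata server is needed.

class ObjectStore:
    """Toy stand-in for an object-based storage device (OSD)."""
    def __init__(self):
        self.objects = {}          # object_id -> bytearray of data
        self.attributes = {}       # object_id -> dict of metadata attributes

    def create(self, object_id):
        self.objects[object_id] = bytearray()
        self.attributes[object_id] = {}

    def write(self, object_id, offset, data):
        buf = self.objects[object_id]
        buf[offset:offset + len(data)] = data

    def set_attr(self, object_id, key, value):
        self.attributes[object_id][key] = value

    def get_attr(self, object_id, key):
        return self.attributes[object_id][key]


def create_file(osd, path, mode):
    """Create a file whose inode-like metadata lives with its first object."""
    first_obj = path + ".0"
    osd.create(first_obj)
    osd.set_attr(first_obj, "mode", mode)
    osd.set_attr(first_obj, "size", 0)
    return first_obj


def append(osd, path, data):
    first_obj = path + ".0"
    size = osd.get_attr(first_obj, "size")
    osd.write(first_obj, size, data)          # data and metadata on the same OSD
    osd.set_attr(first_obj, "size", size + len(data))


osd = ObjectStore()
create_file(osd, "/scratch/run42/output.dat", 0o644)
append(osd, "/scratch/run42/output.dat", b"checkpoint header")
print(osd.get_attr("/scratch/run42/output.dat.0", "size"))   # 17
```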
Citations: 16
Logan: Automatic management for evolvable, large-scale, archival storage
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811890
M. Storer, K. Greenan, I. Adams, E. L. Miller, D. Long, K. Voruganti
Archival storage systems designed to preserve scientific data, business data, and consumer data must maintain and safeguard tens to hundreds of petabytes of data on tens of thousands of media for decades. Such systems are currently designed in the same way as higher-performance, shorter-term storage systems, which have a useful lifetime but must be replaced in their entirety via a "fork-lift" upgrade. Thus, while existing solutions can provide good energy efficiency and relatively low cost, they do not adapt well to continuous improvements in technology, becoming less efficient relative to current technology as they age. In an archival storage environment, this paradigm implies an endless series of wholesale migrations and upgrades to remain efficient and up to date. Our approach, Logan, manages node addition, removal, and failure on a distributed network of intelligent storage appliances, allowing the system to gradually evolve as device technology advances. By automatically handling most of the common administration chores (integrating new devices into the system, managing groups of devices that work together to provide redundancy, and recovering from failed devices), Logan reduces management overhead and thus cost. Logan can also improve cost and space efficiency by identifying and decommissioning outdated devices, thus reducing space and power requirements for the archival storage system.
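As a back-of-the-envelope illustration of the kind of policy Logan is described as automating, the sketch below flags a device for decommissioning when its capacity per watt falls well below that of the best current generation; the threshold and device figures are invented for the example and do not come from the paper.

```python
# Hypothetical decommissioning rule: retire a device when its capacity per
# watt drops below a fraction of the best currently available generation.

def should_decommission(device, best_gb_per_watt, threshold=0.25):
    gb_per_watt = device["capacity_gb"] / device["idle_watts"]
    return gb_per_watt < threshold * best_gb_per_watt

fleet = [
    {"id": "shelf-2004-a", "capacity_gb": 250,  "idle_watts": 9.0},
    {"id": "shelf-2008-b", "capacity_gb": 1000, "idle_watts": 8.0},
]
best = max(d["capacity_gb"] / d["idle_watts"] for d in fleet)
for d in fleet:
    if should_decommission(d, best):
        print("decommission", d["id"])      # flags shelf-2004-a
```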
Citations: 6
Comparing performance of solid state devices and mechanical disks
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811886
Milo Polte, J. Simsa, Garth A. Gibson
In terms of performance, solid state devices promise to be a superior technology to mechanical disks. This study investigates the performance of several up-to-date high-end consumer and enterprise Flash solid state devices (SSDs) and relates their performance to that of mechanical disks. For the purpose of this evaluation, the IOZone benchmark is run in single-threaded mode with varying request size and access pattern on an ext3 filesystem mounted on these devices. The price of the measured devices is then used to allow for comparison of price per performance. Measurements presented in this study offer an evaluation of the cost-effectiveness of a Flash-based SSD storage solution over a range of workloads. In particular, for the sequential access pattern the SSDs are up to 10 times faster than the disks for reads and up to 5 times faster for writes. For random reads, the SSDs provide up to a 200 times performance advantage. For random writes, the SSDs provide up to a 135 times performance advantage. After weighing these numbers against the prices of the tested devices, we can conclude that SSDs are approaching the price per performance of magnetic disks for sequential-access workloads and are a superior technology to magnetic disks for random-access workloads.
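To make the price-per-performance comparison concrete, here is a small sketch of the arithmetic; the prices and bandwidth figures are placeholders, not the paper's measurements.

```python
# Price-per-performance comparison in the spirit of the study; the bandwidth,
# IOPS, and price figures below are illustrative placeholders.

devices = {
    # name: (price_usd, sequential_read_MBps, random_read_IOPS)
    "enterprise_ssd": (800.0, 250.0, 35000.0),
    "sata_disk":      (100.0,  90.0,   180.0),
}

for name, (price, seq_read, rand_iops) in devices.items():
    print(f"{name:15s}  "
          f"${price / seq_read:7.2f} per MB/s sequential read   "
          f"${price / rand_iops * 1000:7.2f} per kIOPS random read")
```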
Citations: 55
Input/output APIs and data organization for high performance scientific computing
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811881
J. Lofstead, F. Zheng, S. Klasky, K. Schwan
Scientific Data Management has become essential to the productivity of scientists using ever larger machines and running applications that produce ever more data. There are several specific issues when running on petascale (and beyond) machines. One is the need for massively parallel data output, which, in part, depends on the data formats and semantics being used. Here, the inhibition of parallelism by file system notions of strict and immediate consistency can be addressed with "delayed data consistency" methods. Such methods can also be used to remove the runtime coordination steps required for immediate consistency from machine resources like Bluegene's separate networks for barrier calls and its dedicated IO nodes, thereby freeing them to instead perform alternate tasks that enhance data output performance and/or richness. Second, once data is generated, it is important to be able to efficiently access it, which implies the need for rapid data characterization and indexing. This can be achieved by adding small amounts of metadata to the output process, thereby permitting scientists to quickly make informed decisions about which files to process from large-scale science runs. Third, failure probabilities increase with an increasing number of nodes, which suggests the need for organizing output data to be resilient to failures in which the output from a single or from a small number of nodes is lost or corrupted. This paper demonstrates the utility of using delayed consistency methods for the process of data output from the compute nodes of petascale machines. It also demonstrates the advantages derived from resilient data organization coupled with lightweight methods for data indexing. An implementation of these techniques is realized in ADIOS, the Adaptable IO System, and its BP intermediate file format. The implementation is designed to be compatible with existing, well-known file formats like HDF-5 and NetCDF, thereby permitting end users to exploit the rich tool chains for these formats. Initial performance evaluations of the approach exhibit substantial performance advantages over using native parallel HDF-5 in the Chimera supernova code.
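As a toy illustration of the lightweight data characterization idea (not the actual BP format), the sketch below appends a small (min, max, offset, length) record per chunk so a reader can skip chunks that cannot contain values of interest; the file layout and names are invented for the example.

```python
# Toy sketch of lightweight data characterization: each writer appends its
# chunk plus a small (min, max, offset, length) record; a reader consults the
# footer index to skip chunks outside the range of interest. This mimics the
# idea behind ADIOS/BP-style footers, not the actual format.
import struct

def write_output(path, chunks):
    index = []
    with open(path, "wb") as f:
        for values in chunks:                      # one chunk per writer process
            payload = struct.pack(f"{len(values)}d", *values)
            index.append((min(values), max(values), f.tell(), len(values)))
            f.write(payload)
        index_start = f.tell()
        for lo, hi, off, n in index:
            f.write(struct.pack("ddqq", lo, hi, off, n))
        f.write(struct.pack("qq", index_start, len(index)))   # footer trailer

def read_matching(path, threshold):
    """Read only the chunks whose max exceeds threshold."""
    out = []
    with open(path, "rb") as f:
        f.seek(-16, 2)
        index_start, count = struct.unpack("qq", f.read(16))
        f.seek(index_start)
        index = [struct.unpack("ddqq", f.read(32)) for _ in range(count)]
        for lo, hi, off, n in index:
            if hi > threshold:
                f.seek(off)
                out.extend(struct.unpack(f"{n}d", f.read(8 * n)))
    return out

write_output("out.bp-like", [[0.1, 0.2], [5.0, 9.5], [1.0, 2.0]])
print(read_matching("out.bp-like", 4.0))    # only the second chunk is read
```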
Citations: 15
Scalable full-text search for petascale file systems
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811884
A. Leung, E. L. Miller
As file system capacities reach the petascale, it is becoming increasingly difficult for users to organize, find, and manage their data. File system search has the potential to greatly improve how users manage and access files. Unfortunately, existing file system search is designed for smaller scale systems, making it difficult for existing solutions to scale to petascale file systems. In this paper, we motivate the importance of file system search in petascale file systems and present a new full-text file system search design for petascale file systems. Unlike existing solutions, our design exploits file system properties. Using a novel index partitioning mechanism that utilizes file system namespace locality, we are able to improve search scalability and performance, and we discuss how such a design can potentially improve search security and ranking. We describe how our design can be implemented within the Ceph petascale file system.
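The following sketch illustrates the general idea of namespace-locality index partitioning, with one inverted-index partition per top-level directory so that a search scoped to a subtree consults a single partition; it is a simplification for illustration and does not reflect the paper's actual design or its Ceph integration.

```python
# Toy sketch of namespace-locality index partitioning: one inverted index per
# top-level directory; scoped searches touch only the relevant partition.
from collections import defaultdict

class PartitionedIndex:
    def __init__(self):
        # partition key -> { term -> set of file paths }
        self.partitions = defaultdict(lambda: defaultdict(set))

    @staticmethod
    def partition_key(path):
        parts = path.lstrip("/").split("/")
        return parts[0] if parts else ""

    def add_file(self, path, text):
        index = self.partitions[self.partition_key(path)]
        for term in text.lower().split():
            index[term].add(path)

    def search(self, term, scope="/"):
        term = term.lower()
        if scope == "/":
            keys = list(self.partitions)            # global search: all partitions
        else:
            keys = [self.partition_key(scope)]      # scoped search: one partition
        hits = set()
        for k in keys:
            hits |= self.partitions[k].get(term, set())
        return sorted(hits)

idx = PartitionedIndex()
idx.add_file("/projects/climate/run1/readme.txt", "monthly precipitation totals")
idx.add_file("/home/alice/notes.txt", "precipitation experiment notes")
print(idx.search("precipitation", scope="/projects"))   # only the climate file
```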
Citations: 2
Introducing map-reduce to high end computing
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811889
Grant Mackey, S. Sehrish, John Bent, J. López, S. Habib, J. Wang
In this work we present a scientific application that has been given a Hadoop MapReduce implementation. We also discuss other scientific fields of supercomputing that could benefit from a MapReduce implementation. We recognize in this work that Hadoop has potential benefit for more applications than simply data mining, but that it is not a panacea for all data intensive applications. We provide an example of how the halo finding application, when applied to large astrophysics datasets, benefits from the model of the Hadoop architecture. The halo finding application uses a friends-of-friends algorithm to quickly cluster large sets of particles into output files which visualization software can interpret. The current implementation requires that large datasets be moved from storage to computation resources for every simulation of astronomy data. Our Hadoop implementation allows for an in-place halo finding application on the datasets, which removes the time-consuming process of transferring data between resources.
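A conceptual sketch of the map and reduce steps for friends-of-friends clustering is shown below; it is not the authors' Hadoop code, runs the shuffle locally in plain Python, and ignores the cross-cell halo stitching a real implementation must perform.

```python
# Conceptual sketch (not the authors' Hadoop code): map particles into spatial
# cells, then run a friends-of-friends grouping inside each cell in the reduce
# step. Cross-cell linking is ignored here for brevity.
from collections import defaultdict
import math

LINKING_LENGTH = 0.2
CELL_SIZE = 1.0

def map_phase(particles):
    """Emit (cell_id, particle) pairs."""
    for x, y, z in particles:
        cell = (int(x // CELL_SIZE), int(y // CELL_SIZE), int(z // CELL_SIZE))
        yield cell, (x, y, z)

def reduce_phase(cell, points):
    """Friends-of-friends inside one cell via union-find."""
    parent = list(range(len(points)))
    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= LINKING_LENGTH:
                parent[find(i)] = find(j)
    groups = defaultdict(list)
    for i, p in enumerate(points):
        groups[find(i)].append(p)
    return [g for g in groups.values() if len(g) > 1]   # keep multi-particle halos

particles = [(0.10, 0.10, 0.10), (0.15, 0.12, 0.10), (0.90, 0.90, 0.90), (2.5, 2.5, 2.5)]
shuffled = defaultdict(list)
for cell, p in map_phase(particles):                    # local "shuffle" step
    shuffled[cell].append(p)
for cell, pts in shuffled.items():
    for halo in reduce_phase(cell, pts):
        print(cell, halo)
```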
Citations: 75
Arbitrary dimension Reed-Solomon coding and decoding for extended RAID on GPUs
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811887
M. Curry, A. Skjellum, H. Ward, R. Brightwell
Reed-Solomon coding is a method of generating arbitrary amounts of checksum information from original data via matrix-vector multiplication in finite fields. Previous work has shown that CPUs are not well-matched to this type of computation, but recent graphical processing units (GPUs) have been shown through a case study to perform this encoding quickly for the 3 + 3 (three data + three parity) case. In order to be utilized in a true RAID-like system, it is important to understand how well this computation can scale in the number of data disks supported. This paper details the performance of a general Reed-Solomon encoding and decoding library that is suitable for use in RAID-like systems. Both generation and recovery are performance-tested and discussed.
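For readers unfamiliar with the underlying arithmetic, the sketch below generates parity as a matrix-vector product over GF(2^8) using a Cauchy generator matrix; it is a didactic CPU version in plain Python, not the authors' GPU library, and recovery (inverting the surviving rows of the generator) is omitted.

```python
# Plain-Python sketch of Reed-Solomon-style parity generation as a
# matrix-vector product over GF(2^8) with a Cauchy generator matrix.

def gf_mul(a, b):
    """Multiply in GF(2^8) with reduction polynomial x^8+x^4+x^3+x^2+1 (0x11d)."""
    p = 0
    while b:
        if b & 1:
            p ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11d
    return p

def gf_inv(a):
    """Multiplicative inverse by brute force (fine for a sketch)."""
    return next(x for x in range(1, 256) if gf_mul(a, x) == 1)

def cauchy_matrix(m, k):
    """m x k Cauchy matrix: entry (i, j) = 1 / (x_i + y_j) with x_i = i, y_j = m + j."""
    return [[gf_inv(i ^ (m + j)) for j in range(k)] for i in range(m)]

def encode(data_shards, m):
    """Compute m parity shards from k equal-length data shards, byte by byte."""
    k, length = len(data_shards), len(data_shards[0])
    c = cauchy_matrix(m, k)
    parity = [bytearray(length) for _ in range(m)]
    for i in range(m):
        for j in range(k):
            coeff = c[i][j]
            row, shard = parity[i], data_shards[j]
            for b in range(length):
                row[b] ^= gf_mul(coeff, shard[b])    # GF(2^8) dot product, per byte
    return parity

data = [b"AAAA", b"BBBB", b"CCCC"]                   # k = 3 data shards
parity = encode(data, 3)                             # m = 3 parity shards (the 3 + 3 case)
print([p.hex() for p in parity])
```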
Citations: 21
Fast log-based concurrent writing of checkpoints
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811882
Milo Polte, Jiri Simsa, Wittawat Tantisiriroj, Garth A. Gibson, Shobhit Dayal, Mikhail Chainani, Dilip Kumar Uppugandla
This report describes how a file system level log-based technique can improve the write performance of the many-to-one checkpoint write workload typical of high performance computing. It is shown that a simple log-based organization can provide substantial improvements in write performance while retaining the convenience of a single flat file abstraction. The improvement in write performance comes at the cost of degraded read performance, however. Techniques to alleviate the read performance penalty, such as file reconstruction on the first read, are discussed.
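A minimal sketch of the log-based organization, assuming a single writer and an in-memory index, is given below; writes are appended in arrival order and reads reconstruct the flat-file view from the index, which is where the read penalty comes from.

```python
# Minimal sketch of the log-based idea: each write is appended to a log in
# arrival order together with its intended (offset, length); a read of the
# flat file consults the index and reassembles the data. Real systems persist
# the index and handle many writers; this toy keeps one writer in memory.

class LogStructuredFile:
    def __init__(self):
        self.log = bytearray()       # data in arrival order (fast, sequential writes)
        self.index = []              # (logical_offset, length, log_position)

    def write(self, logical_offset, data):
        self.index.append((logical_offset, len(data), len(self.log)))
        self.log.extend(data)        # append-only: no seeks on the write path

    def read(self, logical_offset, length):
        """Reconstruct the flat-file view; later writes win on overlap."""
        out = bytearray(length)
        for off, n, pos in self.index:
            lo = max(off, logical_offset)
            hi = min(off + n, logical_offset + length)
            if lo < hi:
                out[lo - logical_offset:hi - logical_offset] = \
                    self.log[pos + (lo - off):pos + (hi - off)]
        return bytes(out)

f = LogStructuredFile()
f.write(100, b"XXXX")        # writes may arrive out of logical order
f.write(0,   b"checkpoint")
print(f.read(0, 10))         # b'checkpoint'
```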
Citations: 23
Zest Checkpoint storage system for large supercomputers
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811883
P. Nowoczynski, N. Stone, J. Yanovich, J. Sommerfield
The PSC has developed a prototype distributed file system infrastructure that vastly accelerates aggregated write bandwidth on large compute platforms. Write bandwidth, more than read bandwidth, is the dominant bottleneck in HPC I/O scenarios due to writing checkpoint data, visualization data and post-processing (multi-stage) data. We have prototyped a scalable solution that will be directly applicable to future petascale compute platforms having on the order of 10^6 cores. Our design emphasizes high-efficiency scalability, low-cost commodity components, lightweight software layers, end-to-end parallelism, client-side caching and software parity, and a unique model of load-balancing outgoing I/O onto high-speed intermediate storage followed by asynchronous reconstruction to a 3rd-party parallel file system.
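The sketch below is a toy rendering of the staging pattern described here (push chunks to the least-loaded intermediate server, drain to the parallel file system asynchronously); the class and queue structure are invented for illustration and are not Zest's implementation.

```python
# Toy illustration (not Zest's implementation) of load-balanced intermediate
# staging: clients push checkpoint chunks to whichever intermediate server has
# the least queued data, and a later pass drains the queues to the parallel FS.
import os

class IntermediateServer:
    def __init__(self, name):
        self.name = name
        self.queue = []                      # (chunk_id, data) awaiting drain
        self.queued_bytes = 0

    def accept(self, chunk_id, data):
        self.queue.append((chunk_id, data))
        self.queued_bytes += len(data)

def stage(servers, chunk_id, data):
    """Client side: send to the least-loaded intermediate server."""
    target = min(servers, key=lambda s: s.queued_bytes)
    target.accept(chunk_id, data)

def drain(servers, pfs_dir):
    """Asynchronous pass: reassemble staged chunks onto the parallel file system."""
    os.makedirs(pfs_dir, exist_ok=True)
    for s in servers:
        for chunk_id, data in s.queue:
            with open(os.path.join(pfs_dir, f"chunk-{chunk_id}"), "wb") as f:
                f.write(data)
        s.queue.clear()
        s.queued_bytes = 0

servers = [IntermediateServer("zest-0"), IntermediateServer("zest-1")]
for i in range(4):
    stage(servers, i, b"\0" * (1 << 20))     # four 1 MiB checkpoint chunks
drain(servers, "/tmp/pfs-staging")           # later, off the critical write path
```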
Citations: 34
Just-in-time staging of large input data for supercomputing jobs
Pub Date : 2008-11-01 DOI: 10.1109/PDSW.2008.4811891
H. M. Monti, A. Butt, S. Vazhkudai
High performance computing is facing a data deluge from state-of-the-art colliders and observatories. Large data-sets from these facilities, and other end-user sites, are often inputs to intensive analyses on modern supercomputers. Timely staging of input data into the supercomputer's local storage can not only optimize space usage, but also protect against delays due to storage system failures. To this end, we propose a just-in-time staging framework that uses a combination of batch-queue predictions, user-specified intermediate nodes, and decentralized data delivery so that input data staging coincides with job startup. Our preliminary prototype has been integrated with widely used tools such as the PBS job submission system, BitTorrent data delivery, and the Network Weather Service network monitoring facility.
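A back-of-the-envelope version of the timing decision might look like the following; the slack term, wait-time prediction, and bandwidth figure are illustrative assumptions, not values from the paper.

```python
# Back-of-the-envelope timing of "just in time" staging: start the transfer
# late enough to limit exposure on local storage, but early enough to finish
# before the predicted job start. All numbers are illustrative.

def staging_start(now, predicted_wait_s, data_bytes, bandwidth_bytes_per_s, slack_s=600):
    transfer_s = data_bytes / bandwidth_bytes_per_s
    job_start = now + predicted_wait_s
    return max(now, job_start - transfer_s - slack_s)

now = 0.0
start = staging_start(now,
                      predicted_wait_s=6 * 3600,          # batch-queue prediction: 6 h
                      data_bytes=2 * 10**12,              # 2 TB of input data
                      bandwidth_bytes_per_s=250 * 10**6)  # ~250 MB/s end-to-end
print(f"begin staging {start / 3600:.1f} hours from now")  # ~3.6 hours
```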
Citations: 14