
2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os: Latest Publications

DIG: Rapid Characterization of Modern Hard Disk Drive and Its Performance Implication
J. Gim, Y. Won, Jae-Sung Chang
In this work, we develop a novel disk characterization suite, DIG (disk geometry analyzer), which allows us to rapidly extract and characterize the key performance metrics of modern hard disk drives. Development of this tool was accompanied by a thorough examination of four off-the-shelf hard disk drives. DIG consists of three key ingredients: an O(1) track boundary detection algorithm, an O(log n) zone boundary detection algorithm, and hybrid-sampling-based seek time profiling. We particularly focus on the scalability aspect of disk characterization. With DIG, we are able to extract the key metrics of a hard disk drive within 3-20 minutes. DIG allows us to determine the sector layout mechanism of the underlying hard disk drive, e.g. hybrid serpentine, cylinder serpentine, or surface serpentine, and to build a complete sector map from LBN to the three-dimensional space of (cylinder, head, sector). Examining the disks with DIG, we made a number of important observations. Modern hard disk drives put great emphasis on minimizing head switch overhead; this is done via the sector layout mechanism, with surface serpentine and hybrid serpentine being the typical ways of avoiding it. Legacy disk seek time models leave much to be desired for modern hard disk drives, especially for short seeks (less than 5000 tracks).
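As a rough illustration of the LBN-to-(cylinder, head, sector) mapping mentioned above, the sketch below models a pure surface-serpentine layout in Python. It assumes a fixed number of sectors per track, which real zoned drives do not have; the geometry parameters and function name are invented for illustration and are not taken from DIG.

    def lbn_to_chs_surface_serpentine(lbn, heads=4, cylinders=100000, sectors_per_track=1000):
        """Map a logical block number to (cylinder, head, sector) under a
        surface-serpentine layout: one surface is filled completely before the
        head switches, and the cylinder direction reverses on each surface so
        that head switches stay rare (hypothetical, simplified geometry)."""
        track = lbn // sectors_per_track           # global track index
        sector = lbn % sectors_per_track
        head = track // cylinders                  # which surface this track lives on
        assert head < heads, "LBN beyond disk capacity"
        pos = track % cylinders                    # track index within that surface
        # Serpentine: odd surfaces are written in the reverse cylinder direction.
        cylinder = pos if head % 2 == 0 else cylinders - 1 - pos
        return cylinder, head, sector

    # Consecutive LBNs stay on one surface, so sequential access avoids head switches.
    print(lbn_to_chs_surface_serpentine(0))        # (0, 0, 0)
    print(lbn_to_chs_surface_serpentine(1999))     # (1, 0, 999)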
{"title":"DIG: Rapid Characterization of Modern Hard Disk Drive and Its Performance Implication","authors":"J. Gim, Y. Won, Jae-Sung Chang","doi":"10.1109/SNAPI.2008.13","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.13","url":null,"abstract":"In this work, we develop novel disk characterization suite, DIG (disk geometry analyzer), which allows us to rapidly extract and to characterize the key performance metric of modern hard disk drive. Development of this tool is accompanied by thorough examination of four off-the-shelf hard disk drives. DIG consists of three key ingredients: O(1) track boundary detection algorithm,O (log n) zone boundary detection algorithm, and hybrid sampling based seek time profiling. We particularly focus on addressing the scalability aspect of disk characterization. With DIG, we are able to extract key metrics of hard disk drive within 3-20 min. DIG allows us to determine the sector layout mechanism of the underlying hard disk drive, e.g. hybrid serpentine, cylinder serpentine and surface serpentine, and to build complete sector map from LBN to three dimensional space of (cylinder, head, sector). Examining the disks with DIG, we found a number of important observations. Modern hard disk drive puts great emphasis on minimizing the head switch overhead. This is done via sector layout mechanism and and surface serpentine and hybrid serpentine is the typical way of avoiding it. Legacy disk seek time model leaves much to be desired to be used in modern hard disk drive especially in short seeks (less than 5000 tracks).","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"39 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123865637","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 13
Data Structure Consistency Using Atomic Operations in Storage Devices
A. Devulapalli, D. Dalessandro, P. Wyckoff
Managing concurrency is a fundamental requirement for any multi-threaded system, frequently implemented by serializing critical code regions or using object locks on shared resources. Storage systems are one such case, where multiple clients may wish to access or modify on-disk objects concurrently yet safely. Data consistency may be provided by an inter-client protocol, or it can be implemented in the file system server or storage device. In this work we demonstrate ways of enabling atomic operations on object-based storage devices (OSDs), in particular the compare-and-swap and fetch-and-add atomic primitives. With examples ranging from basic disk-resident data structures to higher-level applications such as file systems, we show how atomics-capable storage devices can be used to satisfy the consistency requirements of distributed algorithms. Offloading consistency management to storage devices obviates the need for dedicated lock manager servers.
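As a rough sketch of how the compare-and-swap primitive mentioned above lets clients update a shared on-disk structure without a dedicated lock server, consider the following. The OSD attribute interface (read_attr, compare_and_swap) and the free-list example are hypothetical stand-ins, not the actual OSD command set used in the paper.

    import threading

    class MockOSD:
        """In-memory stand-in for an object-based storage device that offers
        an atomic compare-and-swap on a named attribute (hypothetical API)."""
        def __init__(self):
            self._attrs = {}
            self._lock = threading.Lock()   # emulates the device-internal atomicity

        def read_attr(self, obj, name):
            return self._attrs.get((obj, name), 0)

        def compare_and_swap(self, obj, name, expected, new):
            """Atomically set the attribute to `new` iff it currently equals
            `expected`. Returns True on success, False if another client raced ahead."""
            with self._lock:
                cur = self._attrs.get((obj, name), 0)
                if cur != expected:
                    return False
                self._attrs[(obj, name)] = new
                return True

    def allocate_block(osd, obj="free_list"):
        """Pop the head of a free-block list shared by many clients,
        retrying on CAS failure instead of taking a distributed lock."""
        while True:
            head = osd.read_attr(obj, "head")
            next_free = head + 1            # placeholder for "read the next pointer from block `head`"
            if osd.compare_and_swap(obj, "head", head, next_free):
                return head                 # this client now owns block `head`

The same retry-loop pattern applies to fetch-and-add, which returns the old value while incrementing, making shared counters even simpler.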
{"title":"Data Structure Consistency Using Atomic Operations in Storage Devices","authors":"A. Devulapalli, D. Dalessandro, P. Wyckoff","doi":"10.1109/SNAPI.2008.14","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.14","url":null,"abstract":"Managing concurrency is a fundamental requirement for any multi-threaded system, frequently implemented by serializing critical code regions or using object locks on shared resources. Storage systems are one case of this, where multiple clients may wish to access or modify on-disk objects concurrently yet safely. Data consistency may be provided by an inter-client protocol, or it can be implemented in the file system server or storage device. In this work we demonstrate ways of enabling atomic operations on object-based storage devices (OSDs), in particular, the compare-and-swap and fetch-and-add atomic primitives. With examples from basic disk resident data structures to higher level applications like file systems, we show how atomics-capable storage devices can be used to solve consistency requirements of distributed algorithms. Offloading consistency management to storage devices obviates the need for dedicated lock manager servers.","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129588423","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 14
A Model for Storage Processes in Network Environment and Its Implementation
T. Bilski
Computer networks are frequently used as a communication channel between applications and storage resources. The diversity of architectures, communication protocols, and user objectives induces a vast research agenda. Both networks and storage systems have their own models, used to separately analyze and optimize the two different aspects (transmission and storage) of information systems. On the other hand, the crucial features of the services are similar: throughput, security, and reliability. There are instances of convergence between the tools and methods used in storage and in transmission to achieve the same purposes; for example, the hard disk sector structure is similar to an Ethernet frame. So it should be possible to provide a single, universal model for transmission and storage processes. Such a model may be used to analyze both processes jointly. Furthermore, the model may be used to provide formal definitions of some storage processes such as backup, replication, and migration.
{"title":"A Model for Storage Processes in Network Environment and Its Implementation","authors":"T. Bilski","doi":"10.1109/SNAPI.2008.8","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.8","url":null,"abstract":"Computer networks are frequently used as a communication channel between applications and storage resources. The diversity of the architectures, communication protocols and user objectives induces vast research agenda. Both networks and storage systems have their models, used to analyze and optimize separately the two different aspects (transmission and storage) of information systems. On the other hand the crucial features of the services are similar - throughput, security, reliability. There are instances of the convergence between the tools and methods used in storage and transmission to achieve the same purposes. For example hard disk sector structure is similar to Ethernet frame. So, it should be possible to provide a single, universal model for transmission and storage processes. Such model may be used to analyze both processes jointly. Furthermore the model may be used to provide formal definitions of some storage processes such as backup, replication, migration.","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"81 2 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125890462","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Implementation and Evaluation of iSCSI over RDMA
E. Burns, R. Russell
The Internet small computer systems interface (iSCSI) is a storage network technology that allows block-level access to storage devices, such as disks, over a computer network. Because iSCSI runs over the ubiquitous TCP/IP protocol, it has many advantages over proprietary alternatives. Due to the recent introduction of 10 gigabit Ethernet, storage vendors are interested in the benefits this large increase in network bandwidth could bring to iSCSI. To make full use of the bandwidth provided by a 10 gigabit Ethernet link, specialized remote direct memory access (RDMA) hardware is being developed to offload processing and reduce the data-copy overhead found in a standard TCP/IP network stack. This paper focuses on the development of an iSCSI software implementation capable of supporting this new hardware, and a preliminary evaluation of its performance. We describe an approach used to implement iSCSI extensions for remote direct memory access (iSER) with the UNH iSCSI reference implementation. This involved a three-step process: moving the UNH-iSCSI software from the Linux kernel to user space, adding support for the iSER extensions to the user-space iSCSI, and finally moving everything back into the Linux kernel. Results are given that show improved performance of the completed iSER-assisted iSCSI implementation on RDMA hardware.
{"title":"Implementation and Evaluation of iSCSI over RDMA","authors":"E. Burns, R. Russell","doi":"10.1109/SNAPI.2008.12","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.12","url":null,"abstract":"The Internet small computer systems interface (iSCSI) is a storage network technology that allows block-level access to storage devices, such as disks, over a computer network. Because iSCSI runs over the ubiquitous TCP/IP protocol, it has many advantages over proprietary alternatives. Due to the recent introduction of 10 gigabit Ethernet, storage vendors are interested in the benefits this large increase in network bandwidth could bring to iSCSI. To make full use of the bandwidth provided by a 10 gigabit Ethernet link, specialized remote direct memory access (RDMA) hardware is being developed to offload processing and reduce the data-copy-overhead found in a standard TCP/IP network stack. This paper focuses on the development of an iSCSI software implementation capable of supporting this new hardware, and a preliminary evaluation ofits performance. We describe an approach used to implement iSCSI extensions for remote direct memory access (iSER) with the UNH iSCSI reference implementation. This involved a threestep process: moving UNH-iSCSI software from the Linux kernel to user-space, adding support for the iSER extensions to the user-space iSCSI, and finally moving everything back into the Linux kernel. Results are given that show improved performance of the completed iSER-assisted iSCSI implementation on RDMA hardware.","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"7 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130090257","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 8
On Maximizing iSCSI Throughput Using Multiple Connections with Automatic Parallelism Tuning
F. Inoue, H. Ohsaki, Y. Nomoto, M. Imase
In this paper, we propose iSCSI-APT (iSCSI with automatic parallelism tuning), which maximizes iSCSI throughput in long-fat networks. In recent years, as a protocol for building SANs (Storage Area Networks), iSCSI has been attracting attention for its low cost and high compatibility with existing networking infrastructure. However, iSCSI throughput is known to degrade in a long-fat network. iSCSI supports a feature called multiple connections, which allows data delivery over multiple TCP connections in a single session. However, for effective utilization of this feature, the number of connections must be appropriately configured according to the network status. iSCSI-APT automatically adjusts the number of connections according to the network status. Through experiments using our iSCSI-APT implementation, we demonstrate that iSCSI-APT operates quite effectively regardless of the network delay.
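The abstract does not specify how the connection count is tuned, so the following is only a plausible sketch of the idea: a hill-climbing loop that grows or shrinks the number of TCP connections in a session depending on whether measured throughput improved. The measure_throughput and set_connection_count callables are hypothetical hooks, not part of iSCSI-APT.

    import time

    def autotune_connections(measure_throughput, set_connection_count,
                             min_conns=1, max_conns=16, interval=5.0, rounds=20):
        """Hill-climbing sketch: add connections while throughput keeps improving,
        back off and probe the other direction once an extra connection stops
        helping. Not the algorithm from the paper, just an illustration."""
        conns, step = min_conns, 1
        set_connection_count(conns)
        best = measure_throughput(interval)
        for _ in range(rounds):
            candidate = max(min_conns, min(max_conns, conns + step))
            set_connection_count(candidate)
            tput = measure_throughput(interval)
            if tput > best * 1.05:          # clear improvement: keep the new setting
                conns, best = candidate, tput
            else:                           # no gain: revert and probe the other way
                set_connection_count(conns)
                step = -step
            time.sleep(interval)
        return conns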
{"title":"On Maximizing iSCSI Throughput Using Multiple Connections with Automatic Parallelism Tuning","authors":"F. Inoue, H. Ohsaki, Y. Nomoto, M. Imase","doi":"10.1109/SNAPI.2008.15","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.15","url":null,"abstract":"In this paper, we propose an iSCSI-APT (iSCSI with automatic parallelism tuning) that maximizes iSCSI throughput in long-fat networks. In recent years, as a protocol for building SANs (Storage Area Networks), iSCSI has been attracting attention for its low cost and high compatibility with existing networking infrastructure. However, it has been known that iSCSI throughput degrades in a long-fat network. iSCSI supports a feature called multiple connections, which allows data delivery over multiple TCP connections in a single session. However, for effective utilization of the multiple connections feature, the number of multiple connections must be appropriately configured according to the network status. In this paper, we propose the iSCSI-APT that automatically adjusts the number of multiple connections according to the network status. Through experiments using our iSCSI-APT implementation, we demonstrate that iSCSI-APT operates quite effectively regardless of the network delay.","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126732517","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 11
ADMAD: Application-Driven Metadata Aware De-duplication Archival Storage System
Chuanyi Liu, Yingping Lu, Chunhui Shi, Guanlin Lu, D. Du, Dong-Sheng Wang
There is a huge amount of duplicated or redundant data in current storage systems. Data de-duplication, which uses lossless data compression schemes to minimize duplicated data at the inter-file level, has therefore been receiving broad attention in recent years. But there are still research challenges in current approaches and storage systems, such as how to chunk files more efficiently and better leverage potential similarity and identity among dedicated applications, and how to store the chunks effectively and reliably on secondary storage devices. In this paper, we propose ADMAD: an application-driven, metadata-aware de-duplication archival storage system, which makes use of metadata information from different levels of the I/O path to direct file partitioning into more meaningful data chunks (MCs), maximally reducing inter-file duplication. However, the chunks may have variable lengths and sizes; storing them directly on storage devices may result in many fragments and a high percentage of random disk accesses, which is very inefficient. Therefore, in ADMAD, chunks are further packaged into fixed-size objects as the storage units to speed up I/O performance and ease data management. Preliminary experiments have demonstrated that the proposed system can further reduce the required storage space compared with current methods (by 20% to nearly 50% across several datasets) and largely improves write performance (about 50%-70% on average).
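The de-duplication pipeline described above (chunking, identifying duplicates by content hash, packing chunks into fixed-size objects) can be sketched roughly as follows. The chunking here is naive fixed-size splitting and the object packing is simplified; ADMAD's metadata-driven chunking and object format are not specified in the abstract, so everything below is illustrative.

    import hashlib

    CHUNK_SIZE = 8 * 1024          # naive fixed-size chunking (ADMAD uses metadata-driven chunking)
    OBJECT_SIZE = 4 * 1024 * 1024  # fixed-size container objects written to storage

    chunk_index = {}               # content hash -> (object_id, offset, length): the de-dup index
    objects = []                   # sealed container objects (kept in memory for the sketch)
    current = bytearray()          # container object currently being filled

    def store_file(data: bytes):
        """Split a file into chunks, store only chunks whose hash is unseen,
        and return the list of chunk references that reconstruct the file."""
        global current
        refs = []
        for i in range(0, len(data), CHUNK_SIZE):
            chunk = data[i:i + CHUNK_SIZE]
            digest = hashlib.sha1(chunk).hexdigest()
            if digest not in chunk_index:                 # new content: pack it
                if len(current) + len(chunk) > OBJECT_SIZE:
                    objects.append(bytes(current))        # seal the full container object
                    current = bytearray()
                chunk_index[digest] = (len(objects), len(current), len(chunk))
                current.extend(chunk)
            refs.append(digest)                           # duplicate or new, reference by hash
        return refs

    refs_a = store_file(b"hello world" * 10000)
    refs_b = store_file(b"hello world" * 10000)           # fully de-duplicated: no new chunks stored
    print(len(chunk_index), len(objects), refs_a == refs_b)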
{"title":"ADMAD: Application-Driven Metadata Aware De-duplication Archival Storage System","authors":"Chuanyi Liu, Yingping Lu, Chunhui Shi, Guanlin Lu, D. Du, Dong-Sheng Wang","doi":"10.1109/SNAPI.2008.11","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.11","url":null,"abstract":"There is a huge amount of duplicated or redundant data in current storage systems. So data de-duplication, which uses lossless data compression schemes to minimize the duplicated data at the inter-file level, has been receiving broad attention in recent years. But there are still research challenges in current approaches and storage systems, such as: how to chunking the files more efficiently and better leverage potential similarity and identity among dedicated applications; how to store the chunks effectively and reliably into secondary storage devices. In this paper, we propose ADMAD: an application-driven metadata aware de-duplication archival storage system, which makes use of certain meta-data information of different levels in the I/O path to direct the file partitioning into more meaningful data chunks (MC) to maximally reduce the inter-file level duplications. However, the chunks may be with different lengths and variable sizes, storing them into storage devices may result in a lot of fragments and involve a high percentage of random disk accesses, which is very inefficient. Therefore, in ADMAD, chunks are further packaged into fixed sized objects as the storage units to speed up the I/O performance as well as to ease the data management. Preliminary experiments have demonstrated that the proposed system can further reduce the required storage space when compared with current methods (from 20% to near 50% according to several datasets), and largely improves the writing performance (about 50%-70% in average).","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125419363","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 66
Pre-allocation Size Adjusting Methods Depending on Growing File Size
T. Nakamura, N. Komoda
We propose several pre-allocation size adjusting methods that prevent fragmentation of files of arbitrary sizes by adapting the pre-allocation size to the growing file size. The proposed methods use a small pre-allocation when a growing file is still relatively small and a large pre-allocation once the file has become large. We confirmed that the proposed methods effectively prevent not only internal disk fragmentation of small files (tens of KBs), but also file fragmentation of large files (GBs).
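The abstract describes the policy only qualitatively (small pre-allocation for small files, large pre-allocation for large files); the thresholds and growth factor in the sketch below are invented for illustration and are not the paper's actual parameters.

    def preallocation_size(current_size: int) -> int:
        """Choose how many bytes to pre-allocate for the next append, scaling the
        reservation with the file's current size (illustrative thresholds only)."""
        KB, MB = 1024, 1024 * 1024
        if current_size < 64 * KB:
            return 4 * KB          # tiny file: small reservation avoids internal fragmentation
        if current_size < 16 * MB:
            return 256 * KB        # medium file: moderate reservation
        return min(current_size // 8, 128 * MB)   # large file: reserve proportionally, capped

    for size in (10 * 1024, 1024 * 1024, 512 * 1024 * 1024):
        print(size, preallocation_size(size))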
{"title":"Pre-allocation Size Adjusting Methods Depending on Growing File Size","authors":"T. Nakamura, N. Komoda","doi":"10.1109/SNAPI.2008.9","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.9","url":null,"abstract":"We propose several pre-allocation size adjusting methods that can prevent the file fragmentation of files at arbitrary sizes depending on the growing file size. The proposed methods use small-size pre-allocation when the size of a growing file is relatively small and a large-size pre-allocation when the size of a growing file is large. We confirmed that the proposed methods effectively prevent not only internal disk fragmentation of small files (10s KBs), but also file fragmentation of large files(GBs).","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"10 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124706009","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A Partial-Distribution-Fault-Aware Protocol for Consistent Updates in Distributed Storage Systems
P. Sobe
Distribution of data and erasure-tolerant codes allow data to be stored reliably in distributed systems. While most techniques are directed at failures of storage resources, erroneously behaving clients and network interruptions may also disturb storage system operation and cause data loss. In particular, updates that take effect only partially across distributed data may leave the data in an inconsistent state and indirectly destroy data content. Moreover, redundancy and data can be left in a state that no longer tolerates failures. In this paper, we propose a protocol that takes these issues into account. The protocol enforces update consistency in partial-distribution scenarios and is correlated with the distribution and coding scheme. It is based on a two-phase commit protocol and a two-layered data structure for buffering updates. For block-wise and sequential access, the protocol cost is hidden in the sequence of accesses to consecutive blocks.
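Below is a minimal sketch of the two-phase commit pattern the protocol builds on: the coordinator asks every storage node to buffer (prepare) the update and makes it visible only if all nodes acknowledge. The node interface (prepare/commit/abort) and the staging buffer are generic 2PC, not the paper's specific wire protocol or its two-layered buffer structure.

    class StorageNode:
        """Toy storage node: new data is staged in a buffer layer first and only
        promoted to the visible layer on commit (a rough analogue of two layers)."""
        def __init__(self):
            self.visible = {}
            self.staged = {}

        def prepare(self, block_id, data):
            self.staged[block_id] = data
            return True                       # vote yes; a real node may vote no on error

        def commit(self, block_id):
            self.visible[block_id] = self.staged.pop(block_id)

        def abort(self, block_id):
            self.staged.pop(block_id, None)

    def write_striped(nodes, block_id, fragments):
        """Two-phase commit of one striped/encoded block across all nodes:
        either every fragment becomes visible or none does."""
        votes = [node.prepare(block_id, frag) for node, frag in zip(nodes, fragments)]
        if all(votes):
            for node in nodes:
                node.commit(block_id)
            return True
        for node in nodes:
            node.abort(block_id)
        return False

    nodes = [StorageNode() for _ in range(4)]
    print(write_striped(nodes, 7, [b"d0", b"d1", b"d2", b"parity"]))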
{"title":"A Partial-Distribution-Fault-Aware Protocol for Consistent Updates in Distributed Storage Systems","authors":"P. Sobe","doi":"10.1109/SNAPI.2008.16","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.16","url":null,"abstract":"Distribution of data and erasure tolerant-codes allow to store data reliably in distributed systems. Whereby most techniques are directed to failures of storage resources, also erroneously accessing clients and network interruptions may disturb the storage system operation and cause data loss. Particularly, updates that get partially effective onto distributed data may leave data in an inconsistent state and indirectly destroy data content. Besides, redundancy and data can be left in a state that does not allow to tolerate failures anymore. In this paper, we propose a protocol that takes these issues into account. The protocol forces update consistency in partial distribution scenarios and is correlated with the distribution and coding scheme. It is based on a two-phase commit protocol and a two-layered data structure for buffering updates. For block-wise and sequential access, the protocol cost is hidden in the sequence of accesses related to consecutive blocks.","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"59 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"121968820","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Parallel Processing of Data, Metadata, and Aggregates within an Archival Storage System User Interface (Toward Archiving a Million Files and a Million Megabytes per Minute)
M. Roschke, Danny P Cook, Bart J Parliman, David Sherrill
Archiving large datasets requires parallel processing of both data and metadata for timely execution. This paper describes work in progress that uses various processing techniques, including multi-threading of data and metadata operations, distributed processing, aggregation, and conditional processing, to achieve increased archival performance for large datasets.
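A rough sketch of the multi-threading-plus-aggregation idea: small files are bundled into an aggregate while data and metadata operations run on a thread pool. The archive_data and record_metadata functions, the aggregation threshold, and the batching scheme are placeholders, not the system's actual interface.

    from concurrent.futures import ThreadPoolExecutor
    import os

    SMALL_FILE_LIMIT = 1 * 1024 * 1024   # files under 1 MB are bundled into an aggregate (placeholder)

    def archive_data(name, payload):
        pass    # placeholder: write payload to archival storage

    def record_metadata(name, size):
        pass    # placeholder: insert one metadata record

    def archive_tree(paths, workers=8):
        """Archive many files concurrently: large files go out individually,
        small files are packed into one aggregate, and metadata operations are
        submitted to the same pool so they overlap with data I/O."""
        aggregate = []
        with ThreadPoolExecutor(max_workers=workers) as pool:
            for path in paths:
                size = os.path.getsize(path)
                pool.submit(record_metadata, path, size)          # metadata in parallel
                if size < SMALL_FILE_LIMIT:
                    aggregate.append(path)                        # defer: bundle small files
                else:
                    pool.submit(archive_data, path, open(path, "rb").read())
            if aggregate:
                bundle = b"".join(open(p, "rb").read() for p in aggregate)
                pool.submit(archive_data, "aggregate-0", bundle)

    # Usage (hypothetical paths): archive_tree(["/data/a.bin", "/data/b.log"])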
{"title":"Parallel Processing of Data, Metadata, and Aggregates within an Archival Storage System User Interface (Toward Archiving a Million Files and a Million Megabytes per Minute)","authors":"M. Roschke, Danny P Cook, Bart J Parliman, David Sherrill","doi":"10.1109/SNAPI.2008.10","DOIUrl":"https://doi.org/10.1109/SNAPI.2008.10","url":null,"abstract":"Archiving large datasets requires parallel processing of both data and metadata for timely execution. This paper describes the work in progress to use various processing techniques, including multi-threading of data and metadata operations, distributed processing, aggregation, and conditional processing to achieve increased archival performance for large datasets.","PeriodicalId":335253,"journal":{"name":"2008 Fifth IEEE International Workshop on Storage Network Architecture and Parallel I/Os","volume":"93 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2008-09-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"132521602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2