Characterizing the Balance of Parallel I/O Systems

The Sixth Distributed Memory Computing Conference, 1991. Proceedings. Pub Date: 1991-04-28. DOI: 10.1109/DMCC.1991.633363

J. French


Citations: 4

Abstract

High performance I/O subsystems are a key element in parallel computer systems that intend to compete with traditional supercomputers in the solution of large scientific and engineering problems. The hardware and software organization and performance of these I/O subsystems are fundamental issues in parallel computer systems that have not been explored in depth. Two commercially available parallel file systems are the Intel iPSC/2 Concurrent File System (CFS) and the NCUBE/ten NChannel board and disk farm. Both systems are aimed at support of high volume, large block I/O of the type typical of large scientific computations. The evaluation of these systems has proved difficult. There are many parameters affecting performance and the system dynamics are quite complex. In this paper we examine a method of quantifying the balance of an I/O system, that is, how well it services I/O requests with respect to fairness and distribution of overheads. One may gauge the degree of balance in a system by asking: When resources become saturated, is the bottleneck felt equally by each process or are some processes given preferential service? This paper explores a simple yardstick of system balance.

† This research was supported in part by JPL Contract #957721 and by the Department of Energy under Grant DE-FG05-88ER25063.

1. Quantifying and Measuring Parallel I/O

Suppose that we have $p$ processes reading (writing) a file of $N$ bytes in parallel. Each process $i$ reads (writes) $n$ bytes in time $t_i$, where $n = N/p$. The individual data transfer rate of a particular processor $i$ is given by $r_i = n/t_i$. The average individual data transfer rate is given by $\bar{r} = \frac{1}{p}\sum_{i=1}^{p} r_i$.

There are at least two reasonable measures of the aggregate data transfer rate of the $p$ processors. In the first case, we sum the data rates of the individual processors. This gives rise to the quantity $\sum_{i=1}^{p} r_i$, called the maximum sustained aggregate rate (max-SAR). We call this measure the “maximum” rate because, by construction, it assumes that each processor $i$ contributes a rate $r_i$ and all processors contribute during the same time instant, however brief. From the definition of $\bar{r}$ above, we see that

$$\text{max-SAR} = \sum_{i=1}^{p} r_i = p\,\bar{r}. \tag{1}$$

This interpretation is illustrated in Figure 1(a).

In the second case, we consider that all $N$ bytes move through the system in $\tau = \max_i t_i$ time units. That is, the entire file is not transferred until the slowest processor finishes reading (writing) its partition of the file. This gives rise to the quantity $N/\tau$, called the minimum sustained aggregate rate (min-SAR). We call this a “minimum” rate since this is the rate that an outside observer will perceive as the rate at which the entire processor ensemble is operating. From the definitions above, we see that [...]
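The two aggregate measures can be made concrete with a small numerical sketch. The Python below is illustrative only and is not taken from the paper: the names `IORates` and `io_rates` and the sample timings are assumptions. It computes the individual rates r_i = n / t_i with n = N / p, their mean, max-SAR as the sum of the r_i, and min-SAR as N divided by the slowest time.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class IORates:
    """Hypothetical container for the rate measures (not from the paper)."""
    individual: List[float]  # r_i = n / t_i for each process i
    mean: float              # average individual rate, (1/p) * sum(r_i)
    max_sar: float           # maximum sustained aggregate rate, sum(r_i)
    min_sar: float           # minimum sustained aggregate rate, N / max(t_i)


def io_rates(total_bytes: float, times: Sequence[float]) -> IORates:
    """Compute the rate measures for p processes that each move N/p bytes.

    total_bytes is N; times[i] is t_i, the time process i needs for its
    n = N / p bytes. Units are up to the caller (e.g. bytes and seconds).
    """
    if not times:
        raise ValueError("at least one process time is required")
    if any(t <= 0 for t in times):
        raise ValueError("process times must be positive")

    p = len(times)
    n = total_bytes / p                  # bytes moved by each process
    individual = [n / t for t in times]  # r_i = n / t_i
    max_sar = sum(individual)            # sum of the individual rates
    mean = max_sar / p                   # so that max-SAR = p * mean
    tau = max(times)                     # slowest process bounds the transfer
    min_sar = total_bytes / tau          # all N bytes are moved in tau time units
    return IORates(individual, mean, max_sar, min_sar)


# Assumed sample data: 4 processes sharing an 8,000,000-byte file.
skewed = io_rates(8_000_000, [1.0, 2.0, 2.0, 4.0])
print(skewed.max_sar, skewed.min_sar)    # 4500000.0 2000000.0

even = io_rates(8_000_000, [2.0, 2.0, 2.0, 2.0])
print(even.max_sar, even.min_sar)        # 4000000.0 4000000.0
```

With the assumed skewed timings the two measures differ by more than a factor of two, while with identical timings they coincide, since every r_i then equals n / t and their sum is N / t.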


