Hypertasking Support for Dynamically Redistributable and Resizeable Arrays on the iPSC
M. Baber
The Sixth Distributed Memory Computing Conference, 1991. Proceedings, April 1991
DOI: 10.1109/DMCC.1991.633086
Static allocations of arrays on multicomputers have two major shortcomings. First, algorithms often employ more than one reference pattern for a given array, resulting in the need for more than one mapping between the array elements and the multicomputer nodes. Secondly, it is desirable to provide easily resizeable arrays, especially for multigrid algorithms. This paper describes extensions to the hypertasking paracompiler which provide both dynamically resizeable and redistributable arrays. Hypertasking is a parallel programming tool that transforms C programs containing comment-directives into SPMD C programs that can be run on any size hypercube without recompilation for each cube size.

* Supported in part by: Defense Advanced Research Projects Agency, Information Science and Technology Office, Research in Concurrent Computing Systems, ARPA Order No. 6402, 6402-1; Program Code No. 8E20 & 9E20, issued by DARPA/CMO under Contract #MDA-972-89-C-0034.

Introduction

This paper describes extensions to hypertasking [1], a domain decomposition tool that operates on comment-directives inserted into ordinary sequential C source code. The extensions support run-time redistribution and resizing of arrays. Hypertasking is one of several projects [4,5,6,8] that have proposed or produced source-to-source compilers for parallel architectures. I refer to this class of software tools as paracompilers to distinguish them from the sequential source-to-object compilers they are built upon. A fundamental question for paracompiler designers is whether to make decisions about data and control decomposition at compile-time or at run-time. If decisions are made at compile-time, the logic does not have to be repeated every time the program is executed and it is possible to optimize the code for known parameters. Unfortunately, compile-time decisions are also inflexible. Hypertasking makes all significant decisions about decomposition at run-time.
A run-time initialization routine is called by each node to assign values to the members of an array definition structure. The C code generated by the paracompiler references the values in the structure instead of constants chosen at compile-time. The resulting code is surprisingly efficient. Furthermore, because it is relatively straightforward to change the decomposition variables in the array definition structure, run-time decomposition greatly facilitates the implementation of dynamic array resizing and redistribution features such as those described in this paper. This paper will begin with an overview of the hypertasking programming model to provide a framework for the new features. Beginning with redistributable arrays, the purpose and performance of the new features are discussed with reference to example programs. Finally, conclusions and goals for future research are presented.

Hypertasking overview

Hypertasking is designed to make it easy for software developers to port their existing data parallel applications to a multicomputer without making their code hardware specific. Hypertasking library routines decompose arrays in any or all dimensions, but the number of nodes allocated in any given dimension is controlled by the hypertasking run-time library, and is always a power of two to preserve locality of reference within the logical node mesh. All arrays are decomposed into regular rectangular sub-blocks with sides as equal as possible given the previous constraints. Guard rings [3] for each sub-block are provided. The term "guard ring" tends to imply a 2-D problem decomposed in both dimensions, but the concept is extended in this implementation to multiple dimensions. This paper uses guard wrapper as a general term encompassing guard rings in 2-D decompositions, guard shells in 3-D, and so on. A guard wrapper could be one array element thick for a 2-D 5-point stencil, or two elements thick for a 2-D 9-point stencil, for example.
Each array element is stored on one or more nodes, though it is only owned by one node. Array assignments and references are automatically rewritten by the paracompiler (the hype command) so that any node can transparently read or write any element in the distributed virtual array, but communication costs make non-local reads and writes expensive. Application algorithms should exhibit good locality of reference to make hypertasking worthwhile. Hypertasking supports directives to decompose arrays in any or all dimensions, to limit loops to indices that are local for a given array, and to update guard wrappers with boundary values from neighboring nodes. The tool consists of a paracompiler and library routines. Figure 1 depicts the basic hypertasking usage model.

[Figure 1 labels: "Can be run on a single node as a reference for speedups"; "C Compiler and Linker".]

Redistributable arrays

The array redistribution features are implemented as two new directives for the paracompiler and two new run-time library routines.

The REDISTRIBUTE directive

The REDISTRIBUTE directive is similar to the original ARRAY directive except that it is executable instead of declarative. The arguments are the same, allowing the user to specify the thickness of the guard wrapper and whether or not to distribute each dimension of the array.