G. Elsesser, Viet N. Ngo, S. Bhattacharya, W. Tsai
Load balancing of DOALL loops in the Perfect Club
[1993] Proceedings Seventh International Parallel Processing Symposium, 1993-04-13
DOI: 10.1109/IPPS.1993.262868 (https://doi.org/10.1109/IPPS.1993.262868)
The speedup achieved by concurrent execution of loop iterations is determined by load balance and several other factors, so no single strategy provides maximum speedup for all classes of programs and all target architectures. Hence, the selection of a load balancing strategy must be guided by characteristics of both the application domain and the target machine architecture. The authors study loop load balance in the context of the well-known Perfect Club benchmark. Several static and dynamic characteristics of DOALL loops are observed and interpreted. Late arrival of processors is identified as a significant source of load imbalance. A scheme for processor preallocation is proposed, and its advantages and applicability are demonstrated by analytical estimates as well as experimental evaluation on a Cray YMP-8.
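To make the effect of late processor arrival concrete, the following is a minimal sketch of a toy model, not the authors' analytical estimate or their preallocation scheme. It assumes a self-scheduled DOALL loop in which every iteration has the same cost and each processor takes the next iteration as soon as it is free. The function names `doall_finish_time` and `speedup`, and the arrival times used below, are hypothetical and chosen only for illustration.

```python
import heapq


def doall_finish_time(n_iters, iter_cost, arrivals):
    """Finish time of a self-scheduled DOALL loop (toy model).

    arrivals[p] is the time at which processor p joins the loop.
    Every iteration is assumed to cost iter_cost time units.
    """
    # Min-heap of the times at which each processor is next free.
    free_at = list(arrivals)
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_iters):
        # The processor that becomes free earliest takes the next iteration.
        start = heapq.heappop(free_at)
        end = start + iter_cost
        finish = max(finish, end)
        heapq.heappush(free_at, end)
    return finish


def speedup(n_iters, iter_cost, arrivals):
    """Serial time divided by the modeled parallel finish time."""
    serial_time = n_iters * iter_cost
    return serial_time / doall_finish_time(n_iters, iter_cost, arrivals)


if __name__ == "__main__":
    # All four processors present at loop entry.
    print(speedup(8, 1.0, [0.0, 0.0, 0.0, 0.0]))  # 4.0
    # Three of the four processors arrive three time units late.
    print(speedup(8, 1.0, [0.0, 3.0, 3.0, 3.0]))  # 1.6
```

In this toy setting, having all processors available at loop entry, which is roughly what a preallocation scheme aims for, recovers the ideal speedup of 4, while late arrival of three processors reduces it to 1.6. Real DOALL loops have non-uniform iteration costs and scheduling overhead, which this sketch ignores.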