Parallel implementation of domain decomposition techniques on Intel's hypercube
M. Haghoo, W. Proskurowski
Conference on Hypercube Concurrent Computers and Applications, January 3, 1989. DOI: 10.1145/63047.63132
Parallel implementation of domain decomposition techniques for elliptic PDEs in rectangular regions is considered. This technique is well suited for parallel processing, since in the solution process the subproblems either are independent or can be easily converted into decoupled problems. More than 80% of execution time is spent on solving these independent and decoupled problems.
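The decoupling idea can be illustrated on a 1-D model problem. The sketch below is hypothetical and not the paper's code: once the values on the interfaces between subregions are fixed, `-u'' = f` on each subinterval becomes an independent Dirichlet problem, which is the kind of decoupled subproblem a hypercube node could solve concurrently. The function names and discretization choices are illustrative assumptions.

```python
# Hypothetical sketch (not the paper's implementation): with interface values
# fixed, the 1-D model problem -u'' = f splits into independent finite-
# difference subproblems, each a tridiagonal solve on one subinterval.

def thomas(a, b, c, d):
    """Thomas algorithm for a tridiagonal system (sub-, main-, super-diagonals a, b, c)."""
    n = len(b)
    cp, dp = [0.0] * n, [0.0] * n
    cp[0], dp[0] = c[0] / b[0], d[0] / b[0]
    for i in range(1, n):
        m = b[i] - a[i] * cp[i - 1]
        cp[i] = c[i] / m if i < n - 1 else 0.0
        dp[i] = (d[i] - a[i] * dp[i - 1]) / m
    x = [0.0] * n
    x[-1] = dp[-1]
    for i in range(n - 2, -1, -1):
        x[i] = dp[i] - cp[i] * x[i + 1]
    return x

def solve_subdomain(f, h, left, right):
    """Solve -u'' = f on one subinterval with Dirichlet data (left, right),
    using the standard second-order finite-difference stencil with mesh size h.
    Each such call is independent of the others, so all subdomains can run
    concurrently once the interface values are known."""
    n = len(f)                      # number of interior mesh nodes
    a = [-1.0] * n                  # sub-diagonal
    b = [2.0] * n                   # main diagonal
    c = [-1.0] * n                  # super-diagonal
    d = [h * h * fi for fi in f]    # right-hand side h^2 * f
    d[0] += left                    # fold the Dirichlet boundary values
    d[-1] += right                  # into the right-hand side
    return thomas(a, b, c, d)

# With f = 0 and boundary data 0 and 1 the exact solution is linear, u(x) = x:
u = solve_subdomain([0.0, 0.0, 0.0], 0.25, 0.0, 1.0)  # → [0.25, 0.5, 0.75]
```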
The hypercube architecture is used for concurrent execution. The performance of the parallel algorithm is compared against the sequential version. The speed-up, efficiency, and communication factors are studied as functions of the number of processors. Extensive tests are performed to find, for a given mesh size, the number of subregions and nodes that minimize the overall execution time.
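The speed-up and efficiency studied above follow the conventional definitions S(p) = T(1)/T(p) and E(p) = S(p)/p, where T(p) is the execution time on p processors. A minimal sketch with made-up timings (the numbers below are illustrative, not the paper's measurements):

```python
# Conventional parallel-performance metrics, as studied in the paper:
#   speed-up   S(p) = T(1) / T(p)
#   efficiency E(p) = S(p) / p
# The timing values used below are made up for illustration.

def speedup(t_serial, t_parallel):
    """Speed-up S(p) = T(1) / T(p)."""
    return t_serial / t_parallel

def efficiency(t_serial, t_parallel, p):
    """Efficiency E(p) = S(p) / p; 1.0 would be ideal (linear) scaling."""
    return speedup(t_serial, t_parallel) / p

# Hypothetical example: a 100 s serial run completes in 15 s on 8 nodes.
s = speedup(100.0, 15.0)        # ≈ 6.67
e = efficiency(100.0, 15.0, 8)  # ≈ 0.83
```

Communication overhead is what keeps E(p) below 1: as the number of subregions grows, each node does less arithmetic but exchanges interface data with more neighbors, which is why the paper searches for the subregion count that minimizes total execution time for a given mesh.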