The impact of synchronization and granularity on parallel systems
Authors: D. Chen, H. Su, P. Yew
Published in: [1990] Proceedings. The 17th Annual International Symposium on Computer Architecture
Publication date: 1990-05-01
DOI: 10.1145/325164.325150 (https://doi.org/10.1145/325164.325150)
Citations: 95
Abstract
This study examines the impact of synchronization and granularity on the performance of parallel systems using an execution-driven simulation technique. Even though a great deal of parallelism can be available at the fine-grain level, synchronization and scheduling strategies determine the ultimate performance of the system. Loop-iteration-level parallelism seems to be a more appropriate level when those factors are considered. Barrier synchronization and data synchronization at the loop-iteration level are also studied, and both schemes are found to be needed for better performance.