Fenglong Song, Zhiyong Liu, Dongrui Fan, He Huang, Nan Yuan, Lei-Ping Yu, Junchao Zhang
2009 IEEE International Symposium on Parallel and Distributed Processing with Applications. Published 2009-08-18. DOI: 10.1109/ISPA.2009.6 (https://doi.org/10.1109/ISPA.2009.6)
Evaluation Method of Synchronization for Shared-Memory On-Chip Many-Core Processor
On-chip many-core architecture is an emerging and promising computing platform. Its two outstanding advantages, high-speed on-chip communication and abundant on-chip resources, provide an opportunity to implement efficient synchronization schemes, and the practical execution efficiency of those schemes is critical to the platform. However, there has been little research on systematic methods for evaluating the choice of synchronization schemes for on-chip many-core processors, or on the effect of dedicated hardware support in this context. We therefore focus on the evaluation method and criteria for synchronization schemes on this platform. First, we present several criteria suited to on-chip many-core architecture: the absolute overhead of a synchronization operation, the transfer time between different synchronization operations, the overhead caused by load imbalance, and the network congestion caused by synchronization operations. Second, we illustrate how to design microbenchmarks, each dedicated to evaluating one performance criterion. Finally, we implement these microbenchmarks and synchronization schemes on an on-chip many-core processor with a shared level-two cache and on a commercial AMD Opteron chip multiprocessor, and we analyze the effect of dedicated hardware support. The results show that most synchronization overhead is caused by load imbalance and by serialization at the synchronization point. They also show that dedicated hardware support noticeably improves the performance of synchronization schemes on on-chip many-core processors.
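To make the load-imbalance criterion concrete, the following is a minimal sketch of a barrier microbenchmark. It is an illustration only, not the authors' microbenchmark: it assumes Python's standard `threading.Barrier` on a general-purpose operating system rather than the many-core hardware studied in the paper, and the function name `measure_barrier_wait` and the use of `time.sleep` as a stand-in for computation are hypothetical choices. Each thread records how long it stays blocked at the barrier, so a thread with little work accumulates a long wait when another thread arrives late.

```python
import threading
import time


def measure_barrier_wait(work_seconds):
    """Return, per thread, the time spent blocked at a barrier.

    work_seconds[i] is the simulated work (a sleep) that thread i performs
    before reaching the barrier; unequal values model load imbalance.
    """
    n = len(work_seconds)
    start = threading.Barrier(n)  # aligns the threads before timing begins
    sync = threading.Barrier(n)   # the synchronization point under test
    waits = [0.0] * n

    def worker(i):
        start.wait()
        time.sleep(work_seconds[i])        # stand-in for real computation
        arrived = time.perf_counter()
        sync.wait()                        # blocks until every thread arrives
        waits[i] = time.perf_counter() - arrived

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return waits


if __name__ == "__main__":
    # Equal work: the waits approximate the barrier's own cost plus scheduler noise.
    balanced = measure_barrier_wait([0.0, 0.0, 0.0, 0.0])
    # Unequal work: the early threads also wait for the slowest one.
    imbalanced = measure_barrier_wait([0.0, 0.05, 0.1, 0.2])
    print("balanced   :", [f"{w * 1e3:.2f} ms" for w in balanced])
    print("imbalanced :", [f"{w * 1e3:.2f} ms" for w in imbalanced])
```

Comparing the two runs gives a rough separation of the barrier's own cost from the waiting time caused by imbalance. The measured values depend on the operating system scheduler, so they should be read as qualitative rather than as hardware-level overheads.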