Evaluation Method of Synchronization for Shared-Memory On-Chip Many-Core Processor

Fenglong Song, Zhiyong Liu, Dongrui Fan, He Huang, Nan Yuan, Lei-Ping Yu, Junchao Zhang
{"title":"Evaluation Method of Synchronization for Shared-Memory On-Chip Many-Core Processor","authors":"Fenglong Song, Zhiyong Liu, Dongrui Fan, He Huang, Nan Yuan, Lei-Ping Yu, Junchao Zhang","doi":"10.1109/ISPA.2009.6","DOIUrl":null,"url":null,"abstract":"On-chip many-core architecture is an emerging and promising computation platform. High speed on-chip communication and abundant chipped resources are two outstanding advantages of this architecture, which provide an opportunity to implement efficient synchronization scheme. The practical execution efficiency of synchronization scheme is critical to this platform. However, there are few researches on systematic evaluation method of choice synchronization schemes for on-chip many-core processors, and effect of dedicated hardware support in this context. So we focus on the evaluation method and criterion of synchronization scheme on the platform. Firstly, we present several criterions proper to on-chip many-core architecture, that is, absolute overhead of synchronization operation, the transferring time between different synchronization operations, overhead caused by load imbalance, and the network congestion caused by synchronization operation. Secondly, we illustrate how to design microbenchmarks which one dedicated to evaluate a performance criterion respectively. Finally, we implement these microbenchmarks and synchronization schemes on an on-chip many-core processor with shared level-two cache and AMD Opteron commercial chip multi-processor, respectively. And we analyze effect of dedicated hardware support. Results show that the most overhead of synchronization is caused by load imbalance and serialization on synchronization point. It also shows that synchronization scheme supported with dedicated hardware can improve its performance obviously for chipped many-core processor.","PeriodicalId":346815,"journal":{"name":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","volume":"60 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2009-08-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2009 IEEE International Symposium on Parallel and Distributed Processing with Applications","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPA.2009.6","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

On-chip many-core architecture is an emerging and promising computation platform. High speed on-chip communication and abundant chipped resources are two outstanding advantages of this architecture, which provide an opportunity to implement efficient synchronization scheme. The practical execution efficiency of synchronization scheme is critical to this platform. However, there are few researches on systematic evaluation method of choice synchronization schemes for on-chip many-core processors, and effect of dedicated hardware support in this context. So we focus on the evaluation method and criterion of synchronization scheme on the platform. Firstly, we present several criterions proper to on-chip many-core architecture, that is, absolute overhead of synchronization operation, the transferring time between different synchronization operations, overhead caused by load imbalance, and the network congestion caused by synchronization operation. Secondly, we illustrate how to design microbenchmarks which one dedicated to evaluate a performance criterion respectively. Finally, we implement these microbenchmarks and synchronization schemes on an on-chip many-core processor with shared level-two cache and AMD Opteron commercial chip multi-processor, respectively. And we analyze effect of dedicated hardware support. Results show that the most overhead of synchronization is caused by load imbalance and serialization on synchronization point. It also shows that synchronization scheme supported with dedicated hardware can improve its performance obviously for chipped many-core processor.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
片上共享内存多核处理器同步性评价方法
片上多核架构是一个新兴的、有发展前途的计算平台。高速的片上通信和丰富的芯片资源是该架构的两个突出优点,这为实现高效的同步方案提供了机会。同步方案的实际执行效率对该平台至关重要。然而,对于片上多核处理器选择同步方案的系统评估方法,以及在此背景下专用硬件支持的影响,研究较少。因此,重点研究了平台同步方案的评价方法和评价标准。首先,我们提出了几个适合片上多核架构的标准,即同步操作的绝对开销、不同同步操作之间的传递时间、负载不平衡引起的开销以及同步操作引起的网络拥塞。其次,我们说明了如何设计用于分别评估性能标准的微基准。最后,我们分别在具有共享二级缓存的片上多核处理器和AMD Opteron商用芯片多处理器上实现了这些微基准测试和同步方案。并分析了专用硬件支持的效果。结果表明,同步开销最大的原因是同步点上的负载不平衡和串行化。在芯片化多核处理器上,采用专用硬件支持的同步方案可以明显提高其性能。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
自引率
0.00%
发文量
0
期刊最新文献
Completion Time Estimation for Instances of Generalized Well-Formed Workflow A Synchronization-Based Alternative to Directory Protocol Web Service Locating Unit in RFID-Centric Anti-counterfeit System Distributed Transfer Network Learning Based Intrusion Detection Multi-Source Traffic Data Fusion Method Based on Regulation and Reliability
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1