PCAsim: A parallel cycle accurate simulation platform for CMPs

Xiaodong Zhu, Junmin Wu, Xiufeng Sui, Wei Yin, Qingbo Wang, Zhe Gong
Published in: 2010 International Conference On Computer Design and Applications, 2010-08-05
DOI: 10.1109/ICCDA.2010.5540881 (https://doi.org/10.1109/ICCDA.2010.5540881)
Citations: 3

Abstract

With the approach of the multi-core era, chip multiprocessor (CMP) architectures present a challenge for efficient simulation, compounded by the need for a detailed simulator that runs realistic workloads. Parallelization, which exploits the inherent parallelism in CMP simulation, is a common method for reducing simulation time. We design and implement PCAsim, a parallel, cycle-accurate, user-level CMP simulator that runs on a shared-memory platform. The simulator is parallelized with POSIX threads according to the target system architecture. Each core thread and the manager thread are synchronized with the slack mechanism [11]. However, we find that the slack mechanism cannot protect the simulator against timing violations among events generated by network activity and the cache coherence protocol. To solve this problem, we propose an effective synchronization method called the pending barrier. This method augments the power of the traditional conservative parallel synchronization mechanism and improves simulation accuracy with negligible performance degradation. Besides synchronization, we also encountered many other troublesome issues in implementing PCAsim. This paper describes some common ones and illustrates how we address them. The evaluations show that PCAsim achieves reasonable speed-up and scalability.
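
To make the slack idea concrete, the following is a minimal sketch of bounded-slack synchronization as it is commonly described: each simulated core keeps a local clock and may run ahead of the slowest core by at most a fixed number of cycles. It is not PCAsim's code. The names `SlackClock`, `advance`, `run`, and `max_lead` are hypothetical, Python threads stand in for the POSIX threads mentioned in the abstract, and the manager thread's bookkeeping is folded into one shared object for brevity. The pending barrier is not shown, because the abstract does not describe how it works.

```python
import threading


class SlackClock:
    """Illustrative bounded-slack clock (hypothetical, not PCAsim's API).

    A core may complete a cycle only if doing so keeps it at most
    `slack` cycles ahead of the slowest core.
    """

    def __init__(self, num_cores, slack):
        if slack < 1:
            # With slack < 1 the slowest core could never advance in this sketch.
            raise ValueError("slack must be at least 1")
        self.local = [0] * num_cores   # completed cycles per core
        self.slack = slack
        self.cond = threading.Condition()
        self.max_lead = 0              # largest lead over the slowest core seen so far

    def advance(self, core_id):
        """Complete one cycle on `core_id`, blocking while outside the slack window."""
        with self.cond:
            # Global time is the minimum local clock; wait until the next
            # cycle fits inside [global, global + slack].
            while self.local[core_id] + 1 > min(self.local) + self.slack:
                self.cond.wait()
            self.local[core_id] += 1
            lead = self.local[core_id] - min(self.local)
            self.max_lead = max(self.max_lead, lead)
            # Global time may have moved forward, so wake any blocked cores.
            self.cond.notify_all()


def run(num_cores=4, cycles=200, slack=8):
    """Run `num_cores` core threads for `cycles` cycles each and return the clock."""
    clock = SlackClock(num_cores, slack)

    def core(core_id):
        for _ in range(cycles):
            clock.advance(core_id)  # a real simulator would model one cycle here

    threads = [threading.Thread(target=core, args=(i,)) for i in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clock
```

A larger `slack` lets cores block less often, at the cost of allowing events from different cores to be processed further out of timestamp order. That trade-off is the kind of timing violation the abstract says the slack mechanism alone cannot rule out.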