PCAsim: A parallel cycle accurate simulation platform for CMPs
Xiaodong Zhu, Junmin Wu, Xiufeng Sui, Wei Yin, Qingbo Wang, Zhe Gong
2010 International Conference On Computer Design and Applications, 2010-08-05
DOI: 10.1109/ICCDA.2010.5540881 (https://doi.org/10.1109/ICCDA.2010.5540881)
Citations: 3
Abstract
With the arrival of the multi-core era, chip multiprocessor (CMP) architectures pose a challenge for efficient simulation, compounded by the need for a detailed simulator that runs realistic workloads. Parallelization, which exploits the inherent parallelism in CMP simulation, is a common way to reduce simulation time. We design and implement PCAsim, a parallel, cycle-accurate, user-level CMP simulator that runs on a shared-memory platform. The simulator is parallelized with POSIX threads according to the target system architecture. Each core thread and the manager thread are synchronized with the slack mechanism [11]. However, we find that the slack mechanism cannot protect the simulator against timing violations among events generated by network activity and the cache coherence protocol. To solve this problem, we propose an effective synchronization method called pending barrier. This method strengthens the traditional conservative parallel synchronization mechanism and improves simulation accuracy with negligible performance degradation. Besides synchronization, we also encountered many other troublesome issues in implementing PCAsim. This paper describes some common ones and shows how we address them. The evaluations show that PCAsim achieves reasonable speed-up and scalability.
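The general idea of slack-bounded synchronization mentioned in the abstract can be illustrated with a minimal sketch. This is not PCAsim's implementation: the abstract gives no code, PCAsim uses POSIX threads with a dedicated manager thread, and the pending barrier method is not shown here. The names `SlackClock`, `tick`, `run_cores`, and `max_lead` are hypothetical, and the sketch assumes that the global time is simply the minimum of all local core clocks and that a core may run at most `slack` cycles ahead of it.

```python
import threading


class SlackClock:
    """Illustrative bounded-slack clock shared by simulated core threads.

    Assumption: global time is the minimum of the local clocks, and no core
    may be more than `slack` cycles ahead of it. Not taken from PCAsim.
    """

    def __init__(self, num_cores, slack):
        if slack < 1:
            raise ValueError("slack must be at least 1 in this sketch")
        self.local = [0] * num_cores   # local clock of each core, in cycles
        self.slack = slack
        self.cond = threading.Condition()
        self.max_lead = 0              # largest lead observed over the slowest core

    def tick(self, core_id):
        """Advance one core by one cycle, blocking while it is too far ahead."""
        with self.cond:
            # Wait while this core has already used up the whole slack window.
            while self.local[core_id] - min(self.local) >= self.slack:
                self.cond.wait()
            self.local[core_id] += 1
            lead = self.local[core_id] - min(self.local)
            self.max_lead = max(self.max_lead, lead)
            # The global time may have moved, so wake any blocked cores.
            self.cond.notify_all()


def run_cores(num_cores, cycles, slack):
    """Run one thread per simulated core for a fixed number of cycles."""
    clock = SlackClock(num_cores, slack)

    def core(core_id):
        for _ in range(cycles):
            clock.tick(core_id)

    threads = [threading.Thread(target=core, args=(i,)) for i in range(num_cores)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return clock
```

Calling `run_cores(4, 200, 8)` lets four core threads drift apart by up to eight cycles while all of them still reach cycle 200. A larger slack means less blocking but looser ordering between cores, which is the kind of timing violation the abstract says the slack mechanism alone cannot rule out.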