Siddharth Nilakantan, K. Sangaiah, A. More, G. Salvador, B. Taskin, Mark Hempstead
{"title":"Synchrotrace: synchronization-aware architecture-agnostic traces for light-weight multicore simulation","authors":"Siddharth Nilakantan, K. Sangaiah, A. More, G. Salvador, B. Taskin, Mark Hempstead","doi":"10.1109/ISPASS.2015.7095813","DOIUrl":null,"url":null,"abstract":"Trace-driven simulation of chip multiprocessor (CMP) systems offers many advantages over execution-driven simulation, such as reducing simulation time and complexity, and allowing portability, and scalability. However, trace-based simulation approaches have encountered difficulty capturing and accurately replaying multi-threaded traces due to the inherent non-determinism in the execution of multi-threaded programs. In this work, we present SynchroTrace, a scalable, flexible, and accurate trace-based multi-threaded simulation methodology. The methodology captures synchronization- and dependency-aware, architecture-agnostic, multi-threaded traces and uses a replay mechanism that plays back these traces correctly. By recording synchronization events and dependencies in the traces, independent of the host architecture, the methodology is able to accurately model the non-determinism of multi-threaded programs for different platforms. We validate the SynchroTrace simulation flow by successfully achieving the equivalent results of a constraint-based design space exploration with the Gem5 Full-System simulator. The results from simulating benchmarks from PARSEC 2.1 and Splash-2 show that our trace-based approach with trace filtering has a peak speedup of up to 18.4x over simulation in Gem5 Full-System with an average of about 7.5x speedup. We are also able to compress traces up to 74% of their original size with almost no impact on accuracy.","PeriodicalId":189378,"journal":{"name":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","volume":"94 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-03-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"22","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2015 IEEE International Symposium on Performance Analysis of Systems and Software (ISPASS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ISPASS.2015.7095813","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 22
Abstract
Trace-driven simulation of chip multiprocessor (CMP) systems offers many advantages over execution-driven simulation, such as reducing simulation time and complexity, and allowing portability, and scalability. However, trace-based simulation approaches have encountered difficulty capturing and accurately replaying multi-threaded traces due to the inherent non-determinism in the execution of multi-threaded programs. In this work, we present SynchroTrace, a scalable, flexible, and accurate trace-based multi-threaded simulation methodology. The methodology captures synchronization- and dependency-aware, architecture-agnostic, multi-threaded traces and uses a replay mechanism that plays back these traces correctly. By recording synchronization events and dependencies in the traces, independent of the host architecture, the methodology is able to accurately model the non-determinism of multi-threaded programs for different platforms. We validate the SynchroTrace simulation flow by successfully achieving the equivalent results of a constraint-based design space exploration with the Gem5 Full-System simulator. The results from simulating benchmarks from PARSEC 2.1 and Splash-2 show that our trace-based approach with trace filtering has a peak speedup of up to 18.4x over simulation in Gem5 Full-System with an average of about 7.5x speedup. We are also able to compress traces up to 74% of their original size with almost no impact on accuracy.