{"title":"Time Warp on the GPU: Design and Assessment","authors":"Xinhu Liu, Philipp Andelfinger","doi":"10.1145/3064911.3064912","DOIUrl":null,"url":null,"abstract":"The parallel execution of discrete-event simulations on commodity GPUs has been shown to achieve high event rates. Most previous proposals have focused on conservative synchronization, which typically extracts only limited parallelism in cases of low event density in simulated time. We present the design and implementation of an optimistic fully GPU-based parallel discrete-event simulator based on the Time Warp synchronization algorithm. The optimistic simulator implementation is compared with an otherwise identical implementation using conservative synchronization. Our evaluation shows that in most cases, the increase in parallelism when using optimistic synchronization significantly outweighs the increased overhead for state keeping and rollbacks. To reduce the cost of state keeping, we show how XORWOW, the default pseudo-random number generator in CUDA, can be reversed based solely on its current state. Since the optimal configuration of multiple performance-critical simulator parameters depends on the behavior of the simulation model, these parameters are adapted dynamically based on performance measurements and heuristic optimization at runtime. We evaluate the simulator using the PHOLD benchmark model and a simplified model of peer-to-peer networks using the Kademlia protocol. On a commodity GPU, the optimistic simulator achieves event rates of up to 81.4 million events per second and a speedup of up to 3.6 compared with conservative synchronization.","PeriodicalId":341026,"journal":{"name":"Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2017-05-16","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"12","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Proceedings of the 2017 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1145/3064911.3064912","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 12
Abstract
The parallel execution of discrete-event simulations on commodity GPUs has been shown to achieve high event rates. Most previous proposals have focused on conservative synchronization, which typically extracts only limited parallelism in cases of low event density in simulated time. We present the design and implementation of an optimistic fully GPU-based parallel discrete-event simulator based on the Time Warp synchronization algorithm. The optimistic simulator implementation is compared with an otherwise identical implementation using conservative synchronization. Our evaluation shows that in most cases, the increase in parallelism when using optimistic synchronization significantly outweighs the increased overhead for state keeping and rollbacks. To reduce the cost of state keeping, we show how XORWOW, the default pseudo-random number generator in CUDA, can be reversed based solely on its current state. Since the optimal configuration of multiple performance-critical simulator parameters depends on the behavior of the simulation model, these parameters are adapted dynamically based on performance measurements and heuristic optimization at runtime. We evaluate the simulator using the PHOLD benchmark model and a simplified model of peer-to-peer networks using the Kademlia protocol. On a commodity GPU, the optimistic simulator achieves event rates of up to 81.4 million events per second and a speedup of up to 3.6 compared with conservative synchronization.