
Latest publications in SIGSIM Principles of Advanced Discrete Simulation

Synchronisation for dynamic load balancing of decentralised conservative distributed simulation
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601386
Quentin Bragard, Anthony Ventresque, L. Murphy
Synchronisation mechanisms are essential in distributed simulation. Some systems rely on central units to control the simulation, but central units are known bottlenecks. If we avoid using a central unit in order to optimise simulation speed, we lose the capacity to act on the simulation at a global scale. Being able to act on the entire simulation is an important capability, as it allows a distributed simulation to be dynamically load-balanced. While some local partitioning algorithms exist, their lack of a global view reduces their efficiency. Running a global partitioning algorithm without a central unit requires synchronising all logical processes (LPs) at the same step. This paper presents two such synchronisation algorithms: the first requires knowledge of some topological properties of the network, while the second works without any such requirement. The algorithms are detailed and compared against each other. An evaluation shows the benefits of using global dynamic load-balancing for distributed simulations.
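The core difficulty the abstract describes is getting all LPs to agree on a global quantity (e.g. a common synchronisation step) with no central unit. A minimal illustration, assuming LPs arranged in a ring that repeatedly exchange their smallest known step with a neighbour; this is our own sketch, not either of the paper's two algorithms:

```python
# Hypothetical sketch: decentralised agreement on a global minimum step
# among logical processes (LPs) arranged in a ring -- no central unit.
def ring_min_step(local_steps):
    """Each LP repeatedly takes the min of its own estimate and its
    predecessor's; after n-1 exchange rounds every LP holds the
    global minimum."""
    n = len(local_steps)
    known = list(local_steps)                     # each LP's current estimate
    for _ in range(n - 1):                        # n-1 rounds of neighbour exchange
        known = [min(known[i], known[(i - 1) % n]) for i in range(n)]
    return known

print(ring_min_step([7, 3, 9, 5]))               # every LP converges to 3
```

A ring all-reduce like this needs O(n) rounds; tree-shaped exchanges would reduce that to O(log n), which is one reason knowledge of network topology (as in the paper's first algorithm) can help.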
Citations: 4
Hierarchical resource management for enhancing performance of large-scale simulations on data centers
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601390
Zengxiang Li, Xiaorong Li, Long Wang, Wentong Cai
Growing interest has been shown in moving large-scale simulations onto modern data centers composed of large numbers of virtualized multi-core computers. However, the simulation components (Federates) consolidated on the same computer may have imbalanced simulation workloads. Similarly, the computers involved in the same simulation execution (Federation) may also have imbalanced simulation workloads. Hence, federates may waste considerable computer resources on time synchronization with each other. In this paper, a hierarchical resource management system is proposed to enhance simulation execution performance. Federates in the federation are encapsulated in individual Virtual Machines (VMs), which are consolidated on a group of virtualized multi-core computers. At the computer level, multiple VMs share the resources of the computer according to the simulation workloads of their corresponding federates. At the federation level, some VMs are migrated for workload-balancing purposes. Therefore, computer resources are fully utilized to conduct useful simulation workloads, avoiding synchronization overheads. Experiments using synthetic and real simulation workloads have verified that the hierarchical resource management system enhances simulation performance significantly.
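The two management levels described above can be sketched in a few lines: proportional resource shares within a host, and a migration step across hosts. This is an illustrative toy under our own assumptions (scalar workloads, greedy heaviest-VM migration), not the paper's system:

```python
# Hypothetical two-level balancing sketch.
# Level 1: within a host, give each VM a CPU share proportional to the
#          simulation workload of its federate.
def host_shares(workloads):
    total = sum(workloads)
    return [w / total for w in workloads]

# Level 2: across hosts, migrate the heaviest VM from the busiest host
#          to the idlest host to reduce federation-level imbalance.
def migrate_one(hosts):
    """hosts: list of lists of per-VM workloads (mutated in place)."""
    loads = [sum(h) for h in hosts]
    src = loads.index(max(loads))
    dst = loads.index(min(loads))
    vm = max(hosts[src])
    hosts[src].remove(vm)
    hosts[dst].append(vm)
    return hosts

print(host_shares([2, 2, 4]))          # [0.25, 0.25, 0.5]
print(migrate_one([[5, 1], [1]]))      # [[1], [1, 5]]
```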
Citations: 5
Exploring many-core architecture design space for parallel discrete event simulation
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601392
Yi Zhang, Jingjing Wang, D. Ponomarev, N. Abu-Ghazaleh
As multicore and manycore processor architectures emerge and core counts per chip continue to increase, it is important to evaluate and understand the performance and scalability of Parallel Discrete Event Simulation (PDES) on these platforms. Most existing architectures are still limited to a modest number of cores, feature simple designs, and do not exhibit heterogeneity, making it impossible to perform comprehensive analysis and evaluation of PDES on these platforms. Instead, in this paper we evaluate PDES using a full-system cycle-accurate simulator of a multicore processor and memory subsystem. With this approach, it is possible to flexibly configure the simulator and explore the impact of architecture design choices on the performance of PDES. In particular, we answer the following four questions with respect to PDES performance and scalability: (1) For the same total chip area, what is the best design point in terms of the number of cores and the size of the on-chip cache? (2) What is the impact of using in-order vs. out-of-order cores? (3) What is the impact of a heterogeneous system with a mix of in-order and out-of-order cores? (4) What is the impact of object partitioning on PDES performance in heterogeneous systems? To answer these questions, we use the MARSSx86 simulator to evaluate performance, and rely on the Cacti and McPAT tools to derive area and latency estimates for cores and caches.
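Question (1) is a fixed-area trade-off: every core added leaves less area for cache. A toy sweep of that design space, with entirely made-up area figures (the paper derives real ones from Cacti/McPAT):

```python
# Hypothetical fixed-area design-space sweep: cores vs. on-chip cache.
# Area figures below are illustrative assumptions, not McPAT outputs.
CORE_AREA = 4.0    # mm^2 per core (assumed)
CACHE_AREA = 1.0   # mm^2 per MB of cache (assumed)
BUDGET = 64.0      # mm^2 total chip area (assumed)

def design_points(budget=BUDGET):
    """Enumerate (core_count, cache_MB) points that fit in the budget,
    spending all leftover area on cache."""
    pts = []
    cores = 1
    while cores * CORE_AREA <= budget:
        cache_mb = (budget - cores * CORE_AREA) / CACHE_AREA
        pts.append((cores, cache_mb))
        cores += 1
    return pts

print(design_points()[:3])   # [(1, 60.0), (2, 56.0), (3, 52.0)]
```

Each point would then be fed to the cycle-accurate simulator to measure PDES throughput; the sweep itself is the cheap part.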
Citations: 4
Power consumption of data distribution management for on-line simulations
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601409
Sabra A. Neal, Gaurav Kantikar, R. Fujimoto
With the growing use of mobile devices, power-aware algorithms have become essential. Data distribution management (DDM) is an approach to disseminating information that was proposed in the High Level Architecture (HLA) for modeling and simulation. This paper explores the power consumption of mobile devices used by pedestrians in an urban environment communicating through HLA DDM services operating over a mobile ad-hoc network (MANET). The computation and communication power requirements of Grid-Based and Region-Based implementation approaches to DDM are contrasted and quantitatively evaluated through experimentation and simulation.
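The two DDM matching strategies being compared differ in how publisher and subscriber regions are matched: region-based matching tests overlaps pairwise, while grid-based matching maps regions onto cells and matches anything sharing a cell. A 1-D sketch of both (our own simplified illustration, not the HLA API):

```python
# Illustrative 1-D DDM matching. Regions are half-open extents (lo, hi).
def region_based(pubs, subs):
    """Exact pairwise overlap test: O(|pubs| * |subs|) comparisons."""
    return {(p, s) for p, (p0, p1) in pubs.items()
                   for s, (s0, s1) in subs.items()
                   if p0 < s1 and s0 < p1}

def grid_based(pubs, subs, cell=10.0):
    """Map each region to grid cells and match via shared cells.
    Cheaper per update, but may over-match at cell granularity."""
    def cells(lo, hi):
        return range(int(lo // cell), int((hi - 1e-9) // cell) + 1)
    grid = {}
    for p, (p0, p1) in pubs.items():
        for c in cells(p0, p1):
            grid.setdefault(c, set()).add(p)
    matches = set()
    for s, (s0, s1) in subs.items():
        for c in cells(s0, s1):
            for p in grid.get(c, ()):
                matches.add((p, s))
    return matches
```

The power trade-off the paper measures follows the same shape: region-based spends computation on exact comparisons, grid-based spends communication on coarser (possibly superfluous) matches.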
Citations: 10
Modeling and simulation of data center networks
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601389
R. Alshahrani, H. Peyravi
Data centers are an integral part of cloud computing, supporting Web services, online social networking, data analysis, computation-intensive applications, and scientific computing. They require high-performance components for their inter-process communication, storage, and sub-communication systems. The performance bottleneck, which used to be processing power, has now shifted to communication speed within data centers. The performance of a data center, in terms of throughput and delay, is directly related to the performance of the underlying internal communication network. In this paper, we introduce an analytical model that can be used to evaluate the underlying network architecture in data centers. The model can further be used to develop simulation tools that extend the scope of performance evaluation beyond what can be achieved by the theoretical model, in terms of various network topologies, traffic distributions, scalability, and load balancing. While the model is generic, we focus on its implementation for fat-tree networks, which are widely used in data centers. The theoretical results are compared against and validated with simulation results for several network configurations. The results of this analysis provide a basis for data center network design and optimization.
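For readers unfamiliar with the fat-tree topology the paper focuses on, the standard k-ary fat-tree capacity figures (from the commonly used construction: k pods, each with k/2 edge and k/2 aggregation switches, plus (k/2)^2 core switches) can be computed directly:

```python
# Standard k-ary fat-tree capacity figures, as a quick reference for the
# topology discussed above (not the paper's analytical model).
def fat_tree_stats(k):
    """k is the switch port count and must be even."""
    assert k % 2 == 0, "fat-tree arity must be even"
    return {
        "hosts": k ** 3 // 4,        # k pods * (k/2 edge) * (k/2 hosts each)
        "edge": k * k // 2,          # k pods * k/2 edge switches
        "agg": k * k // 2,           # k pods * k/2 aggregation switches
        "core": (k // 2) ** 2,       # core layer
    }

print(fat_tree_stats(4))  # {'hosts': 16, 'edge': 8, 'agg': 8, 'core': 4}
```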
Citations: 7
GPU-assisted hybrid network traffic model
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601382
Jason Liu, Yuan Liu, Zhihui Du, Ting Li
Large-scale network simulation imposes extremely high computing demands. While parallel processing techniques allow network simulation to scale up and benefit from contemporary high-end computing platforms, multi-resolution modeling techniques, which differentiate network traffic representations within network models, can substantially reduce the computational requirement. In this paper, we present a novel method for offloading computationally intensive bulk traffic calculations onto the GPU in the background, while leaving the CPU to simulate detailed network transactions in the foreground. We present a hybrid traffic model that combines foreground packet-oriented discrete-event simulation on the CPU with background fluid-based numerical calculations on the GPU. In particular, we present several optimizations to efficiently integrate packet and fluid flows in simulation, with overlapping computations on the CPU and GPU. These optimizations exploit the lookahead inherent in the fluid equations, and take advantage of batch runs with fix-up computation and on-demand prefetching to reduce the frequency of interactions between CPU and GPU. Experiments show that our GPU-assisted hybrid traffic model can achieve substantial performance improvement over the CPU-only approach, while still maintaining good accuracy.
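The background fluid calculation amounts to numerically integrating queue-length dynamics rather than simulating individual packets. A minimal forward-Euler sketch of one fluid queue (dq/dt = arrival rate - service rate, clipped at zero); rates and step size are our own illustrative assumptions, and in the paper such updates run in bulk on the GPU:

```python
# Minimal fluid-queue integration sketch (CPU, sequential, illustrative).
def fluid_queue(arrival, service, dt=0.1, q0=0.0):
    """arrival: per-step arrival rates; service: constant service rate.
    Returns the queue-length trace under forward-Euler integration,
    clipping at zero since a queue cannot go negative."""
    q = q0
    trace = []
    for a in arrival:
        q = max(0.0, q + (a - service) * dt)
        trace.append(q)
    return trace

print(fluid_queue([2, 2, 0, 0], service=1, dt=1.0))  # [1.0, 2.0, 1.0, 0.0]
```

Because such equations evolve smoothly over a known step, each batch of steps can be computed ahead of time, which is the lookahead the optimizations above exploit.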
Citations: 10
Lock-free pending event set management in time warp
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601393
Sounak Gupta, P. Wilsey
The rapid growth in the parallelism of multi-core processors has opened up new opportunities and challenges for parallel discrete event simulation (PDES). PDES simulators attempt to find parallelism within the pending event set to achieve speedup. Typically, the pending event set is sorted to preserve the causal order of the contained events. Sorting is a key factor that amplifies contention for exclusive access to the shared event scheduler, as events are generally scheduled to follow the time-based order of the pending events. In this work we leverage a Ladder Queue data structure to partition the pending events into groups (called buckets) arranged by adjacent, short regions of time. We assume that the pending events within any one bucket are causally independent and schedule them for execution without sorting and without consideration of their total time-based order. We use the Time Warp mechanism to recover whenever actual dependencies arise. Since sorting is no longer needed, we further extend our pending event data structure so that it can be organized for lock-free access. Experimental results show consistent speedup for all studied configurations and simulation models. The speedups range from 1.1 to 1.49, with higher speedups occurring at higher thread counts, where contention for the shared event set becomes more problematic with a conventional mutex locking mechanism.
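The bucketing idea can be shown in a few lines: events are grouped into short time windows, windows are processed in order, and events inside a window are dispatched without sorting. This is a sequential toy of the scheduling discipline only; the paper's structure is additionally lock-free and relies on Time Warp rollback when intra-bucket independence fails to hold:

```python
# Sequential sketch of bucketed pending-event dispatch (illustrative).
def bucketize(events, width):
    """events: list of (timestamp, payload). Groups events into time
    windows of the given width, keyed by window index."""
    buckets = {}
    for ts, ev in events:
        buckets.setdefault(int(ts // width), []).append((ts, ev))
    return buckets

def dispatch(buckets):
    """Process windows in time order; events within a window are
    executed in arrival order, unsorted -- any real causal violation
    would be repaired by rollback in a Time Warp simulator."""
    order = []
    for key in sorted(buckets):
        order.extend(ev for _, ev in buckets[key])
    return order

print(dispatch(bucketize([(0.5, 'a'), (2.4, 'b'), (0.1, 'c')], 1.0)))
# ['a', 'c', 'b'] -- 'a' runs before the earlier 'c' within its bucket
```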
Citations: 29
A case study in using massively parallel simulation for extreme-scale torus network codesign
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601383
M. Mubarak, C. Carothers, R. Ross, P. Carns
A high-bandwidth, low-latency interconnect will be a critical component of future exascale systems. The torus network topology, which uses multidimensional network links to improve path diversity and exploit locality between nodes, is a potential candidate for exascale interconnects. The communication behavior of large-scale scientific applications running on future exascale networks is particularly important and analytical/algorithmic models alone cannot deduce it. Therefore, before building systems, it is important to explore the design space and performance of candidate exascale interconnects by using simulation. We improve upon previous work in this area and present a methodology for modeling and simulating a high-fidelity, validated, and scalable torus network topology at a packet-chunk level detail using the Rensselaer Optimistic Simulation System (ROSS). We execute various configurations of a 1.3 million node torus network model in order to examine the effect of torus dimensionality on network performance with relevant HPC traffic patterns. To the best of our knowledge, these are the largest torus network simulations that are carried out at such a detailed fidelity. In terms of simulation performance, a 1.3 million node, 9-D torus network model is shown to process a simulated exascale-class workload of nearest-neighbor traffic with 100 million message injections per second per node using 65,536 Blue Gene/Q cores in a simulation run-time of only 25 seconds. We also demonstrate that massive-scale simulations are a critical tool in exascale system design since small-scale torus simulations are not always indicative of the network behavior at an exascale size. The take-away message from this case study is that massively parallel simulation is a key enabler for effective extreme-scale network codesign.
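The torus properties the study varies (dimensionality, per-dimension size) directly determine path lengths: wrap-around links mean the minimal hop count per dimension is the shorter way around the ring. A small helper illustrating this (our own reference computation, unrelated to the ROSS model itself):

```python
# Minimal hop distance on a torus (k-ary n-cube) with wrap-around links.
def torus_hops(a, b, dims):
    """a, b: coordinate tuples; dims: per-dimension ring sizes.
    Per dimension, take the shorter of the direct and wrap-around path."""
    return sum(min(abs(x - y), k - abs(x - y))
               for x, y, k in zip(a, b, dims))

print(torus_hops((0, 0), (3, 1), (4, 4)))  # wrap in dim 0: 1 + 1 = 2
```

Higher dimensionality at a fixed node count shortens these distances, which is exactly the trade-off against per-node link cost that the 1.3 million node experiments explore.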
Citations: 29
Securing industrial control systems with a simulation-based verification system
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2601411
Dong Jin, Y. Ni
Today's quality of life is highly dependent on the successful operation of many large-scale industrial control systems. To enhance their protection against cyber-attacks and operational errors, we develop a simulation-based verification framework with cross-layer verification techniques that allow comprehensive analysis of the entire ICS-specific stack, including application, protocol, and network layers.
{"title":"Securing industrial control systems with a simulation-based verification system","authors":"Dong Jin, Y. Ni","doi":"10.1145/2601381.2601411","DOIUrl":"https://doi.org/10.1145/2601381.2601411","url":null,"abstract":"Today's quality of life is highly dependent on the successful operation of many large-scale industrial control systems. To enhance their protection against cyber-attacks and operational errors, we develop a simulation-based verification framework with cross-layer verification techniques that allow comprehensive analysis of the entire ICS-specific stack, including application, protocol, and network layers.","PeriodicalId":255272,"journal":{"name":"SIGSIM Principles of Advanced Discrete Simulation","volume":"677 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"123826914","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
The earth system modeling framework: interoperability infrastructure for high performance weather and climate models
Pub Date : 2014-05-18 DOI: 10.1145/2601381.2611130
C. DeLuca
Weather forecasting and climate modeling are grand challenge problems because of the complexity and diversity of the processes that must be simulated. The Earth system modeling community is driven to finer resolution grids and faster execution times by the need to provide accurate weather and seasonal forecasts, long term climate projections, and information about societal impacts such as droughts and floods. The models used in these simulations are generally written by teams of specialists, with each team focusing on a specific physical domain, such as the atmosphere, ocean, or sea ice. These specialized components are connected where their surfaces meet to form composite models that are largely self-consistent and allow for important cross-domain feedbacks. Since the components are often developed independently, there is a need for standard component interfaces and "coupling" software that transforms and transfers data so that outputs match expected inputs in the composite modeling system. The Earth System Modeling Framework (ESMF) project began in 2002 as a multi-agency effort to define a standard component interface and architecture, and to pool resources to develop shareable utilities for common functions such as grid remapping, time management and I/O. The ESMF development team was charged with making the infrastructure sufficiently general to accommodate many different numerical approaches and legacy modeling systems, as well as making it reliable, portable, well-documented, accurate, and high performance. To satisfy this charge, the development team needed to develop innovative numerical and computational methods, a formal and rigorous approach to interoperability, and distributed development and testing processes that promote software quality. ESMF has evolved to become the leading U.S. framework in the climate and weather communities, with users including the Navy, NASA, the National Weather Service, and community models supported by the National Science Foundation. In this talk, we will present ESMF's evolution, approach, and future plans.
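The coupling pattern the abstract describes — independently developed components behind a standard interface, with coupler code remapping fields between mismatched grids — can be shown in miniature. All class, field, and function names below are invented for illustration and are not ESMF's actual API, and the nearest-neighbor remap stands in for the conservative or bilinear regridding a real coupler would use:

```python
class Component:
    """Toy Earth-system component with a standard interface."""
    def __init__(self, name):
        self.name = name
        self.exports = {}   # fields this component produces
        self.imports = {}   # fields it expects from the coupler

    def run(self, dt):
        raise NotImplementedError

class Atmosphere(Component):
    def __init__(self):
        super().__init__("atm")
        self.exports["surface_temp"] = [290.0, 291.0]  # one value per grid cell

    def run(self, dt):
        # warm slightly each step; a stand-in for real physics
        self.exports["surface_temp"] = [t + 0.1 * dt for t in self.exports["surface_temp"]]

class Ocean(Component):
    def __init__(self):
        super().__init__("ocn")
        self.exports["sst"] = [288.0, 288.0, 288.0, 288.0]  # finer grid

    def run(self, dt):
        # relax SST toward the atmospheric forcing delivered by the coupler
        forcing = self.imports.get("surface_temp_on_ocn_grid", self.exports["sst"])
        self.exports["sst"] = [s + 0.5 * (f - s) for s, f in zip(self.exports["sst"], forcing)]

def regrid(field, target_len):
    """Crude nearest-neighbor remap between mismatched grids."""
    n = len(field)
    return [field[min(int(i * n / target_len), n - 1)] for i in range(target_len)]

atm, ocn = Atmosphere(), Ocean()
for step in range(3):
    atm.run(dt=1.0)
    # coupler: transform the atm export so it matches the ocean's expected input
    ocn.imports["surface_temp_on_ocn_grid"] = regrid(atm.exports["surface_temp"], len(ocn.exports["sst"]))
    ocn.run(dt=1.0)
```

The value of the standard interface is that the coupler loop never needs to know component internals — only which fields each side exports and imports.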
{"title":"The earth system modeling framework: interoperability infrastructure for high performance weather and climate models","authors":"C. DeLuca","doi":"10.1145/2601381.2611130","DOIUrl":"https://doi.org/10.1145/2601381.2611130","url":null,"abstract":"Weather forecasting and climate modeling are grand challenge problems because of the complexity and diversity of the processes that must be simulated. The Earth system modeling community is driven to finer resolution grids and faster execution times by the need to provide accurate weather and seasonal forecasts, long term climate projections, and information about societal impacts such as droughts and floods. The models used in these simulations are generally written by teams of specialists, with each team focusing on a specific physical domain, such as the atmosphere, ocean, or sea ice. These specialized components are connected where their surfaces meet to form composite models that are largely self-consistent and allow for important cross-domain feedbacks. Since the components are often developed independently, there is a need for standard component interfaces and \"coupling\" software that transforms and transfers data so that outputs match expected inputs in the composite modeling system. The Earth System Modeling Framework (ESMF) project began in 2002 as a multi-agency effort to define a standard component interface and architecture, and to pool resources to develop shareable utilities for common functions such as grid remapping, time management and I/O. The ESMF development team was charged with making the infrastructure sufficiently general to accommodate many different numerical approaches and legacy modeling systems, as well as making it reliable, portable, well-documented, accurate, and high performance. 
To satisfy this charge, the development team needed to develop innovative numerical and computational methods, a formal and rigorous approach to interoperability, and distributed development and testing processes that promote software quality.\u0000 ESMF has evolved to become the leading U.S. framework in the climate and weather communities, with users including the Navy, NASA, the National Weather Service, and community models supported by the National Science Foundation. In this talk, we will present ESMF's evolution, approach, and future plans.","PeriodicalId":255272,"journal":{"name":"SIGSIM Principles of Advanced Discrete Simulation","volume":"20 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2014-05-18","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115774244","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0