
2012 SC Companion: High Performance Computing, Networking Storage and Analysis: Latest Papers

End-User Driven Technology Benchmarks Based on Market-Risk Workloads
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.141
P. Lankford, L. Ericson, Andrey Nikolaev
Market risk management is a critical, resource-intensive task for financial trading firms. The industry relies heavily on innovation in technical infrastructure to increase the quality and quantity of risk management information and to reduce the cost of its production. However, until recently, the industry has lacked an independent standard for gauging the potential of new technologies to help. This changed when the STAC Benchmark™ Council developed STAC-A2™, a vendor-independent benchmark suite based on real-world market risk analysis workloads. It was specified by trading firms and made actionable by leading HPC vendors. Unlike vendor-developed benchmarks known to the authors, STAC-A2 satisfies all of the requirements important to end-user firms: relevance, neutrality, scalability, and completeness. Intel has demonstrated the utility of STAC-A2 for comparing successive generations of Intel® Xeon® processors.
Pages: 1171-1175
Citations: 6
Understanding Cloud Data Using Approximate String Matching and Edit Distance
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.149
Joseph Jupin, Justin Y. Shi, Z. Obradovic
For health and human services, fraud detection and other security services, identity resolution is a core requirement for understanding big data in the cloud. Due to the lack of a globally unique identifier and captured typographic differences for the same identity, identity resolution has high spatial and temporal complexities. We propose a filter and verify method to substantially increase the speed of approximate string matching using edit distance. This method has been found to be almost 80 times faster (130 times when combined with other optimizations) than Damerau-Levenshtein edit distance and preserves all approximate matches. Our method creates compressed signatures for data fields and uses Boolean operations and an enhanced bit counter to quickly compare the distance between the fields. This method is intended to be applied to data records whose fields contain relatively short-length strings, such as those found in most demographic data. Without loss of accuracy, the proposed Fast Bitwise Filter will provide substantial performance gain to approximate string comparison in database, record linkage and deduplication data processing systems.
Pages: 1234-1243
Citations: 7
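The filter-and-verify idea in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' Fast Bitwise Filter: the signature scheme (one bit per character bucket), the function names, and the sample names are all assumptions made for illustration. The filter relies on the fact that an insertion or deletion changes at most one signature bit, a substitution at most two, and a transposition none, so more than 2k differing bits implies a distance greater than k. Candidates that pass are verified with the optimal-string-alignment variant of Damerau-Levenshtein distance.

```python
def signature(s: str) -> int:
    """Bitmask with one bit set per character bucket present in s (hypothetical scheme)."""
    sig = 0
    for ch in s:
        sig |= 1 << (ord(ch) % 64)
    return sig


def may_match(a: str, b: str, k: int) -> bool:
    """Cheap filter: returns False only when the edit distance must exceed k."""
    differing_bits = bin(signature(a) ^ signature(b)).count("1")
    return differing_bits <= 2 * k and abs(len(a) - len(b)) <= k


def osa_distance(a: str, b: str) -> int:
    """Optimal-string-alignment distance (edits plus adjacent transpositions)."""
    prev2 = None
    prev = list(range(len(b) + 1))
    for i in range(1, len(a) + 1):
        cur = [i] + [0] * len(b)
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            cur[j] = min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + cost)
            # Adjacent transposition, e.g. "ht" <-> "th".
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                cur[j] = min(cur[j], prev2[j - 2] + 1)
        prev2, prev = prev, cur
    return prev[len(b)]


def matches(query: str, candidates: list, k: int) -> list:
    """Filter first with bit operations, then verify survivors with the full distance."""
    return [c for c in candidates
            if may_match(query, c, k) and osa_distance(query, c) <= k]


if __name__ == "__main__":
    print(matches("smith", ["smyth", "smiht", "jones", "smithe"], 1))
```

In this sketch "jones" is rejected by the bit filter alone, so the quadratic-time distance computation runs only for the three plausible candidates.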
New ASHRAE Thermal Guidelines for Air and Liquid Cooling
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.122
M. Ellsworth
This presentation provides a tutorial on ASHRAE thermal guidelines for both air and liquid cooling.
Pages: 942-961
Citations: 1
Array Databases
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.365
P. Baumann
Summary form only given. The paper presents Array Databases using the example of rasdaman, a fully implemented system that has been in operational service for years. We introduce an array query language that embeds seamlessly into standard SQL and show how this language can be supported by a streamlined architecture that allows for effective storage and query optimization and parallelization. In this context we emphasize that Array Database research can gain a lot from combining the knowledge of the database, supercomputing, and programming language domains.
Pages: 1329
Citations: 8
Poster: Bringing Task and Data Parallelism to Analysis of Climate Model Output
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.283
R. Jacob, Jayesh Krishna, Xiabing Xu, S. Mickelson, T. Tautges, M. Wilde, R. Latham, Ian T Foster, R. Ross, M. Hereld, J. Larson, P. Bochev, K. Peterson, M. Taylor, K. Schuchardt, Jain Yin, D. Middleton, Mary Haley, David Brown, Wei Huang, D. Shea, R. Brownrigg, M. Vertenstein, K. Ma, Jingrong Xie
Climate models are outputting larger and larger amounts of data, and are doing so on more sophisticated numerical grids. The tools climate scientists have used to analyze climate output, an essential component of climate modeling, are single-threaded and assume rectangular structured grids in their analysis algorithms. We are bringing both task- and data-parallelism to the analysis of climate model output. We have created a new data-parallel library, the Parallel Gridded Analysis Library (ParGAL), which can read in data using parallel I/O, store the data on a complete representation of the structured or unstructured mesh, and perform sophisticated analysis on the data in parallel. ParGAL has been used to create a parallel version of a script-based analysis and visualization package. Finally, we have also taken current workflows and employed task-based parallelism to decrease the total execution time.
Pages: 1495
Citations: 1
An Evolutionary Path to Object Storage Access
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.17
David Goodell, S. Kim, R. Latham, M. Kandemir, R. Ross
High-performance computing (HPC) storage systems typically consist of an object storage system that is accessed via the POSIX file interface. However, rapid increases in system scales and storage system complexity have uncovered a number of limitations in this model. In particular, applications and libraries are limited in their ability to partition data into units with independent concurrency control, and mapping complex science data models into the POSIX file model is inconvenient at best. In this paper we propose an alternative interface for use by applications and libraries that provides direct access to underlying storage objects. This model allows applications and libraries to organize storage access around these objects in order to avoid lock contention without needing to create many separate files. Additionally, complex data models are more readily organized into multiple object data streams, simplifying the storage of variable-length data and allowing a choice of degree of parallelism related to access needs. Our approach provides for datasets stored in this new model to coexist with POSIX files, allowing evolution to the new model over time. We apply these concepts in the PVFS, PLFS, and Parallel netCDF packages to prototype the model and describe our experiences.
Pages: 36-41
Citations: 13
Scalable Cyber-Security for Terabit Cloud Computing
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.338
Jordi Ros-Giralt, Péter Szilágyi, R. Lethin
This paper addresses the problem of scalable cyber-security using a cloud computing architecture. Scalability is treated in two contexts: (1) performance and power efficiency and (2) degree of cyber-security-relevant information detected by the cyber-security cloud (CSC). We provide a framework to construct CSCs, which derives from a set of fundamental building blocks (forwarders, analyzers and grounds) and the identification of the smallest functional units (atomic CSC cells or simply aCSC cells) capable of embedding the full functionality of the cyber-security cloud. aCSC cells are then studied and several high-performance algorithms are presented to optimize the system's performance and power efficiency. Among these, a new queuing policy, called tail early detection (TED), is introduced to proactively drop packets in such a way that the degree of detected information is maximized while power is saved by avoiding spending cycles on less relevant traffic components. We also show that it is possible to use aCSC cells as core building blocks to construct arbitrarily large cyber-security clouds by structuring the cells using a hierarchical architecture. To demonstrate the utility of our framework, we implement one cyber-security "mini-cloud" on a single-chip prototype based on Tilera's TILEPro64 processor, demonstrating performance of up to 10 Gbps.
Pages: 1607-1616
Citations: 2
Trace Driven Data Structure Transformations
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.65
T. Janjusic, K. Kavi, Christos Kartsaklis
As the complexity of scientific codes and computational hardware increases, it is increasingly important to study the effects of data-structure layouts on program memory behavior. Different program structure layouts affect memory performance differently; therefore we need the capability to study such transformations effectively without rewriting application codes. Trace-driven simulations are an effective and convenient mechanism to simulate program behavior at various granularities. During an application's execution, a tool known as a tracer or profiler collects program flow data and records program instructions. The trace file consists of tuples that associate each program instruction with program internal variables. In this paper we outline a proof-of-concept mechanism to apply data-structure transformations during trace simulation and observe effects on memory without the need to manually transform an application's code.
Pages: 456-464
Citations: 4
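The approach described in the abstract above, replaying a recorded trace under a different data layout instead of rewriting the program, can be sketched in a few lines of Python. Everything here is an assumption for illustration and not the authors' tool: the trace format (pairs of element index and field index), the two layout functions, and the tiny direct-mapped cache model are all hypothetical.

```python
FIELD_SIZE = 8      # bytes per field (assumed)
NUM_FIELDS = 8      # fields per record (assumed)
NUM_ELEMS = 1024    # records in the traced array (assumed)


def aos_address(elem: int, field: int) -> int:
    """Array-of-structs layout: all fields of one record are adjacent."""
    return (elem * NUM_FIELDS + field) * FIELD_SIZE


def soa_address(elem: int, field: int) -> int:
    """Struct-of-arrays layout: all values of one field are adjacent."""
    return (field * NUM_ELEMS + elem) * FIELD_SIZE


def count_misses(addresses, line_size: int = 64, num_lines: int = 64) -> int:
    """Misses in a direct-mapped cache with num_lines lines of line_size bytes."""
    tags = [None] * num_lines
    misses = 0
    for addr in addresses:
        line = addr // line_size
        slot = line % num_lines
        if tags[slot] != line:
            tags[slot] = line
            misses += 1
    return misses


def replay(trace, layout) -> int:
    """Re-map each traced (element, field) access through a layout and count misses."""
    return count_misses(layout(elem, field) for elem, field in trace)


# Hypothetical trace: a loop that reads only field 0 of every record.
trace = [(elem, 0) for elem in range(NUM_ELEMS)]
aos_misses = replay(trace, aos_address)
soa_misses = replay(trace, soa_address)

if __name__ == "__main__":
    print("AoS misses:", aos_misses)
    print("SoA misses:", soa_misses)
```

For this particular trace the sketch counts 1024 misses under the array-of-structs layout and 128 under struct-of-arrays, because each 64-byte record fills a whole cache line while eight consecutive field-0 values share one. The same trace is reused unchanged; only the address mapping differs.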
Performance Modeling of Algebraic Multigrid on Blue Gene/Q: Lessons Learned
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.57
Hormozd Gahvari, W. Gropp, K. E. Jordan, M. Schulz, U. Yang
The IBM Blue Gene/Q represents a large step in the evolution of massively parallel machines. It features 16-core compute nodes, with additional parallelism in the form of four simultaneous hardware threads per core, connected together by a five-dimensional torus network. Machines are being built with core counts in the hundreds of thousands, with the largest, Sequoia, featuring over 1.5 million cores. In this paper, we develop a performance model for the solve cycle of algebraic multigrid on Blue Gene/Q to help us understand the issues this popular linear solver for large, sparse linear systems faces on this architecture. We validate the model on a Blue Gene/Q at IBM, and conclude with a discussion of the implications of our results.
Pages: 377-385
Citations: 6
TDPSS: A Scalable Time Domain Power System Simulator for Dynamic Security Assessment
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.51
S. Khaitan, J. McCalley
Simulation plays a crucial role in modeling, studying, and experimenting with any design innovation proposed for power systems. Since mathematical modeling of power systems leads to tens of thousands of stiff DAEs (differential and algebraic equations), the design of power system simulators involves a trade-off between simulation speed and modeling accuracy. The lack of efficient and detailed simulators forces designers to test their techniques on small test systems, and hence the results obtained from such experiments may not be representative of the results obtained using real-life power systems. In this paper, we present TDPSS, a high-speed time domain power system simulator for dynamic security assessment. TDPSS has been designed using an object-oriented programming framework, and thus it is modular and extensible. By offering a variety of models of power system components and fast numerical algorithms, it gives the user the flexibility to experiment with different design options in an efficient manner. We discuss the design of TDPSS to give insights into the simulation infrastructure and also discuss the areas where TDPSS can be extended for parallel contingency analysis. We also validate it against the commercial power system simulators PSSE and DSA Tools. Further, we compare the simulation speed of TDPSS for different numerical algorithms. The results show that TDPSS is accurate and also outperforms the commonly used commercial simulator PSSE in terms of computational efficiency.
Pages: 323-332
Citations: 5
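The abstract above notes that stiffness drives the trade-off between simulation speed and accuracy. A minimal Python sketch, using the scalar test equation y' = -λy as a stand-in for a stiff system, shows why the choice of integrator matters. The abstract does not say which numerical algorithms TDPSS implements, so the two methods, the parameter values, and the function names below are assumptions chosen only to illustrate the general point.

```python
def explicit_euler(lam: float, y0: float, h: float, steps: int) -> float:
    """Forward Euler for y' = -lam * y: y_next = y + h * (-lam * y)."""
    y = y0
    for _ in range(steps):
        y = y + h * (-lam * y)
    return y


def backward_euler(lam: float, y0: float, h: float, steps: int) -> float:
    """Backward Euler for y' = -lam * y: solve y_next = y + h * (-lam * y_next)."""
    y = y0
    for _ in range(steps):
        y = y / (1.0 + h * lam)
    return y


if __name__ == "__main__":
    lam, y0, h, steps = 1000.0, 1.0, 0.01, 10   # step is far larger than 1/lam
    print("explicit:", explicit_euler(lam, y0, h, steps))
    print("implicit:", backward_euler(lam, y0, h, steps))
```

With these values the exact solution decays toward zero. The explicit update multiplies the state by -9 each step and diverges, while the implicit update multiplies it by 1/11 and stays stable at the same step size, at the cost of solving an equation per step (trivial here, a large linear or nonlinear solve in a real DAE system).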