
2012 SC Companion: High Performance Computing, Networking Storage and Analysis (Latest Publications)

Abstract: Towards Highly Accurate Large-Scale Ab Initio Calculations Using Fragment Molecular Orbital Method in GAMESS
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.170
Maricris L. Mayes, G. Fletcher, M. Gordon
Summary form only given. One of the major challenges of modern quantum chemistry (QC) is to apply it to large systems with thousands of correlated electrons and basis functions. Both the availability of supercomputers and the development of novel methods are necessary to meet this challenge. In particular, we employ the linear-scaling Fragment Molecular Orbital (FMO) method, which decomposes a large system into smaller, localized fragments that can each be treated with a high-level QC method such as MP2. FMO is inherently scalable, since the individual fragment calculations can be carried out simultaneously on separate processor groups. It is implemented in GAMESS, a popular ab initio QC program. We present the scalability and performance of FMO on the Intrepid (Blue Gene/P) and Blue Gene/Q systems at the ALCF.
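The scalability claim rests on the independence of the fragment calculations. Below is a minimal Python sketch of that pattern; fragment_energy() is a hypothetical placeholder for a real MP2 fragment calculation, this is not GAMESS code, and a full FMO energy additionally includes fragment-pair corrections.

```python
# Minimal illustrative sketch (not GAMESS code) of the FMO pattern:
# independent fragment calculations mapped onto separate worker
# processes, then combined.
from multiprocessing import Pool

def fragment_energy(fragment):
    # Placeholder energy model; a real FMO step would run an ab initio
    # calculation on this fragment's atoms and basis functions.
    return -0.5 * sum(atom["charge"] ** 2 for atom in fragment)

def fmo_total_energy(fragments, workers=4):
    # Fragments are independent, so the map parallelizes trivially,
    # mirroring FMO's assignment of fragments to processor groups.
    with Pool(workers) as pool:
        return sum(pool.map(fragment_energy, fragments))

if __name__ == "__main__":
    water_like = [[{"charge": 8.0}, {"charge": 1.0}, {"charge": 1.0}]] * 8
    print(fmo_total_energy(water_like))
```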
Citations: 0
Abstract: MemzNet: Memory-Mapped Zero-Copy Network Channel for Moving Large Datasets over 100Gbps Network
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.294
Mehmet Balman
High-bandwidth networks are poised to provide new opportunities for tackling large data challenges in today's scientific applications. However, increasing the bandwidth is not sufficient by itself; we need careful evaluation of future high-bandwidth networks from the applications' perspective. We have experimented with current state-of-the-art data movement tools and found that file-centric data transfer protocols do not perform well when managing the transfer of many small files over high-bandwidth networks, even when using parallel streams or concurrent transfers. Current middleware tools require enhancements to take advantage of future networking frameworks. To improve performance and efficiency, we developed an experimental prototype called MemzNet: Memory-mapped Zero-copy Network Channel, which uses a block-based data movement method for moving large scientific datasets. Our implementation of MemzNet aggregates files into blocks and provides dynamic data channel management. In this work, we present our initial results on a 100 Gbps network.
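A minimal sketch of the block-based, memory-mapped idea follows, assuming a connected TCP socket and a 4 MiB block size; MemzNet's actual dynamic channel management is more elaborate and is not reproduced here.

```python
# Sketch: aggregate many files into uniform blocks over one data
# channel, so the transfer layer never pays per-file protocol overhead.
import mmap
import os

BLOCK = 4 * 1024 * 1024  # assumed block size

def send_files_as_blocks(paths, sock):
    for path in paths:
        size = os.path.getsize(path)
        if size == 0:
            continue
        with open(path, "rb") as f:
            with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
                view = memoryview(mm)
                for off in range(0, size, BLOCK):
                    # Sending a memoryview slice avoids copying the block
                    # into an intermediate user-space buffer.
                    sock.sendall(view[off:off + BLOCK])
                view.release()
```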
Citations: 2
Poster: Portals 4 Network Programming Interface
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.264
Brian W. Barrett, R. Brightwell, K. Underwood, K. Hemmert
Portals 4 is an advanced network programming interface that allows for the development of a rich set of upper-layer protocols. By careful selection of interfaces and strong progress guarantees, Portals 4 is able to support multiple protocols without significant overhead. Recent developments with Portals 4, including the development of MPI, SHMEM, and GASNet protocols, are discussed.
Citations: 3
Integrating Policy with Scientific Workflow Management for Data-Intensive Applications
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.29
A. Chervenak, David E. Smith, Weiwei Chen, E. Deelman
As scientific applications generate and consume data at ever-increasing rates, scientific workflow systems that manage the growing complexity of analyses and data movement will increase in importance. The goal of our work is to improve the overall performance of scientific workflows by using policy to improve data staging into and out of computational resources. We developed a Policy Service that gives advice to the workflow system about how to stage data, including advice on the order of data transfers and on transfer parameters. The Policy Service bases this advice on its knowledge of ongoing transfers, recent transfer performance, and the current allocation of resources for data staging. The paper describes the architecture of the Policy Service and its integration with the Pegasus Workflow Management System. The service employs a range of policies for data staging, and the paper presents performance results for one policy that performs a greedy allocation of data transfer streams between source and destination sites. The results show performance improvements for a data-intensive workflow: the Montage astronomy workflow, augmented to perform additional large data staging operations.
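To make the greedy-allocation idea concrete, here is a toy sketch of such a policy; the site names, throughput numbers, and diminishing-returns benefit model are illustrative assumptions, not the paper's implementation.

```python
# Toy greedy stream-allocation policy: repeatedly grant the next
# transfer stream to the site pair whose next stream promises the
# largest marginal throughput gain.
import heapq

def greedy_streams(pairs, budget):
    """pairs: {(src, dst): recent throughput in MB/s};
    budget: total number of concurrent streams to hand out."""
    alloc = dict.fromkeys(pairs, 0)
    heap = [(-tp, p) for p, tp in pairs.items()]  # max-heap on benefit
    heapq.heapify(heap)
    for _ in range(budget):
        benefit, p = heapq.heappop(heap)
        alloc[p] += 1
        # Assumed model: the (n+1)-th stream is worth n/(n+1)
        # of the n-th one (diminishing returns per stream).
        n = alloc[p]
        heapq.heappush(heap, (benefit * n / (n + 1), p))
    return alloc

print(greedy_streams({("siteA", "siteB"): 90.0, ("siteA", "siteC"): 40.0}, 8))
```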
Citations: 9
Philosophy 301: But Can You "Handle the Truth"?
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.124
Nicolas Dubé
This presentation debunks three "truths" as seen from Plato's cave: the untold story of PUE, "clean coal," and the idea that water is free and available.
Citations: 0
Community Accessible Datastore of High-Throughput Calculations: Experiences from the Materials Project
Pub Date : 2012-11-10 DOI: 10.1109/SC.COMPANION.2012.150
D. Gunter, S. Cholia, Anubhav Jain, M. Kocher, K. Persson, L. Ramakrishnan, S. Ong, G. Ceder
Efforts such as the Human Genome Project provided a dramatic example of opening scientific datasets to the community. Making high-quality scientific data accessible through an online database allows scientists around the world to multiply the value of that data through scientific innovations. Similarly, the goal of the Materials Project is to calculate the physical properties of all known inorganic materials and make this data freely available, with the goal of accelerating the invention of better materials. However, the complexity of scientific data, and of the simulations needed to generate and analyze it, poses challenges to the current software ecosystem. In this paper, we describe the approach we used in the Materials Project to overcome these challenges and to create and disseminate a high-quality database of materials properties computed by solving the basic laws of physics. Our infrastructure requires a novel combination of high-throughput approaches with broadly applicable and scalable approaches to data storage and dissemination.
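A minimal sketch of the record-and-query pattern the paper describes, using a hypothetical schema, illustrative property values, and an in-memory list standing in for a real document database:

```python
# Sketch: each high-throughput calculation becomes one queryable record.
# IDs, formulas, and band-gap values below are illustrative only.
records = [
    {"material_id": "m-0001", "formula": "LiFePO4", "band_gap_eV": 3.4},
    {"material_id": "m-0002", "formula": "Si", "band_gap_eV": 0.6},
    {"material_id": "m-0003", "formula": "GaN", "band_gap_eV": 3.2},
]

def query(predicate):
    """Stand-in for a datastore query over calculation results."""
    return [r for r in records if predicate(r)]

# Example: screen for wide-band-gap candidates.
for r in query(lambda r: r["band_gap_eV"] > 3.0):
    print(r["material_id"], r["formula"])
```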
Citations: 10
Application performance characterization and analysis on Blue Gene/Q
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.358
B. Walkup
This article consists of a collection of slides from the author's conference presentation. The author concludes that the Blue Gene/Q design, with low-power simple cores and four hardware threads per core, results in high instruction throughput and thus exceptional power efficiency for applications: the hardware threads can effectively fill pipeline stalls and hide latencies in the memory subsystem. The consequence is low performance per thread, so a high degree of parallelization is required for high application performance. Traditional programming methods (MPI, OpenMP, Pthreads) hold up at very large scales. Memory costs can limit scaling when data structures grow linearly with the number of processes; threading helps by keeping the number of processes manageable. Detailed performance analysis is viable at more than 10^6 processes but requires care. On-the-fly performance data reduction has merits.
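A back-of-the-envelope illustration of the memory argument, with assumed numbers: if every rank keeps a table with one entry per rank in the job, replacing ranks with threads shrinks that table's per-node footprint quadratically.

```python
# Illustrative arithmetic only (numbers assumed, not from the talk):
# per-node memory for a table that has one entry per rank, held by
# every rank on the node.
def table_mem_per_node(total_cores, cores_per_node, threads_per_rank,
                       bytes_per_entry=64):
    ranks = total_cores // threads_per_rank
    ranks_per_node = cores_per_node // threads_per_rank
    return ranks_per_node * ranks * bytes_per_entry

# One million cores, 16 cores per node, varying threading depth.
for t in (1, 4, 16):
    mib = table_mem_per_node(2**20, 16, t) / 2**20
    print(f"{t:2d} threads/rank: {mib:8.1f} MiB/node for the table")
```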
Citations: 4
Abstract: Leveraging PEPPHER Technology for Performance Portable Supercomputing
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.212
C. Kessler, Usman Dastgeer, M. Majeed, N. Furmento, Samuel Thibault, R. Namyst, S. Benkner, Sabri Pllana, J. Träff, Martin Wimmer
PEPPHER is a 3-year EU FP7 project that develops a novel approach and framework to enhance the performance portability and programmability of heterogeneous multi-core systems. Its primary target is single-node heterogeneous systems, where several CPU cores are supported by accelerators such as GPUs. This poster briefly surveys the PEPPHER framework for single-node systems and elaborates on the prospects for leveraging the PEPPHER approach to generate performance-portable code for heterogeneous multi-node systems.
Citations: 0
Poster: Automatically Adapting Programs for Mixed-Precision Floating-Point Computation
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.232
Michael O. Lam, B. Supinski, M. LeGendre, J. Hollingsworth
As scientific computation continues to scale, efficient use of floating-point arithmetic processors is critical. Lower precision allows streaming architectures to perform more operations per second and can reduce memory bandwidth pressure on all architectures. However, using a precision that is too low for a given algorithm and data set leads to inaccurate results. We present a framework that uses binary instrumentation and modification to build mixed-precision configurations of existing binaries that were originally developed to use only double-precision. Initial results with the Algebraic MultiGrid kernel demonstrate a nearly 2x speedup.
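A small self-contained illustration of the accuracy risk that motivates the framework; this is not the poster's binary-instrumentation tool, just the same running sum computed in two precisions.

```python
# Compare a sequential sum in float32 vs. float64 to show how a
# too-low precision accumulates error for a given data set.
import numpy as np

x = np.full(10_000_000, 0.1)                 # float64 source data
lo = np.cumsum(x.astype(np.float32))[-1]     # single-precision accumulation
hi = np.cumsum(x)[-1]                        # double-precision baseline
print(f"float32: {lo:.2f}  float64: {hi:.2f}  "
      f"relative error: {abs(lo - hi) / hi:.1e}")
```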
Citations: 36
FRIEDA: Flexible Robust Intelligent Elastic Data Management in Cloud Environments
Pub Date : 2012-11-10 DOI: 10.1109/SC.Companion.2012.132
D. Ghoshal, L. Ramakrishnan
Scientific applications are increasingly using cloud resources for their data analysis workflows. However, managing data effectively and efficiently over these cloud resources is challenging due to the myriad storage choices with different performance-cost trade-offs, complex application choices, complexity associated with elasticity, and failure rates. The explosion in scientific data, coupled with the unique characteristics of cloud environments, requires a more flexible and robust distributed data management solution than the ones currently in existence. This paper describes the design and implementation of FRIEDA, a Flexible Robust Intelligent Elastic Data Management framework. FRIEDA coordinates data in a transient cloud environment, taking into account specific application characteristics. Additionally, we describe a range of data management strategies and show the benefit of flexible data management schemes in cloud environments. We study two distinct scientific applications, from bioinformatics and image analysis, to understand the effectiveness of such a framework.
Citations: 18