IPDPS 2019 Technical Program

Vinod E. F. Rebello, Lawrence Rauchwerger
{"title":"IPDPS 2019技术计划","authors":"Vinod E. F. Rebello, Lawrence Rauchwerger","doi":"10.1109/ipdps.2019.00008","DOIUrl":null,"url":null,"abstract":": In 2001, as early high-speed networks were deployed, George Gilder observed that “when the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances.” Two decades later, our networks are 1,000 times faster, our appliances are increasingly specialized, and our computer systems are indeed disintegrating. As hardware acceleration overcomes speed-of-light delays, time and space merge into a computing continuum. Familiar questions like “where should I compute,” “for what workloads should I design computers,” and \"where should I place my computers” seem to allow for a myriad of new answers that are exhilarating but also daunting. Are there concepts that can help guide us as we design applications and computer systems in a world that is untethered from familiar landmarks like center, cloud, edge? I propose some ideas and report on experiments in coding the continuum. Abstract: Parallel computers have come of age and need parallel software to justify their usefulness. There are two major avenues to get programs to run in parallel: parallelizing compilers and parallel languages and/or libraries. In this talk we present our latest results using both approaches and draw some conclusions about their relative effectiveness and potential. In the first part we introduce the Hybrid Analysis (HA) compiler framework that can seamlessly integrate static and run-time analysis of memory references into a single framework capable of full automatic loop level parallelization. Experimental results on 26 benchmarks show full program speedups superior to those obtained by the Intel Fortran compilers. In the second part of this talk we present the Standard Template Adaptive Parallel Library (STAPL) based approach to parallelizing code. STAPL is a collection of generic data structures and algorithms that provides a high productivity, parallel programming infrastructure analogous to the C++ Standard Template Library (STL). In this talk, we provide an overview of the major STAPL components with particular emphasis on graph algorithms. We then present scalability results of real codes using peta scale machines such as IBM BG/Q and Cray. Finally we present some of our ideas for future work in this area. Abstract: The trends in hardware architecture are paving the road towards Exascale. However, these trends are also increasing the complexity of design and development of the software developer environment that is deployed on modern supercomputers. Moreover, the scale and complexity of high-end systems creates a new set of challenges for application developers. Computational scientists are facing system characteristics that will significantly impact the programmability and scalability of applications. In order to address these issues, software architects need to take a holistic view of the entire system and deliver a high-level programming environment that can help maximize programmability, while not losing sight of performance portability. In this talk, I will discuss the current trends in computer architecture and their implications in application development and will present Cray’s high level parallel programming environment for performance and programmability on current and future supercomputers. 
I will also discuss some of the challenges and open research problems that need to be addressed in order to build a software developer environment for extreme-scale systems that helps users solve multi-disciplinary and multi-scale problems with high levels of performance, programmability, and scalability.","PeriodicalId":403406,"journal":{"name":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","volume":"27 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2019-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":"{\"title\":\"IPDPS 2019 Technical Program\",\"authors\":\"Vinod E. F. Rebello, Lawrence Rauchwerger\",\"doi\":\"10.1109/ipdps.2019.00008\",\"DOIUrl\":null,\"url\":null,\"abstract\":\": In 2001, as early high-speed networks were deployed, George Gilder observed that “when the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances.” Two decades later, our networks are 1,000 times faster, our appliances are increasingly specialized, and our computer systems are indeed disintegrating. As hardware acceleration overcomes speed-of-light delays, time and space merge into a computing continuum. Familiar questions like “where should I compute,” “for what workloads should I design computers,” and \\\"where should I place my computers” seem to allow for a myriad of new answers that are exhilarating but also daunting. Are there concepts that can help guide us as we design applications and computer systems in a world that is untethered from familiar landmarks like center, cloud, edge? I propose some ideas and report on experiments in coding the continuum. Abstract: Parallel computers have come of age and need parallel software to justify their usefulness. There are two major avenues to get programs to run in parallel: parallelizing compilers and parallel languages and/or libraries. In this talk we present our latest results using both approaches and draw some conclusions about their relative effectiveness and potential. In the first part we introduce the Hybrid Analysis (HA) compiler framework that can seamlessly integrate static and run-time analysis of memory references into a single framework capable of full automatic loop level parallelization. Experimental results on 26 benchmarks show full program speedups superior to those obtained by the Intel Fortran compilers. In the second part of this talk we present the Standard Template Adaptive Parallel Library (STAPL) based approach to parallelizing code. STAPL is a collection of generic data structures and algorithms that provides a high productivity, parallel programming infrastructure analogous to the C++ Standard Template Library (STL). In this talk, we provide an overview of the major STAPL components with particular emphasis on graph algorithms. We then present scalability results of real codes using peta scale machines such as IBM BG/Q and Cray. Finally we present some of our ideas for future work in this area. Abstract: The trends in hardware architecture are paving the road towards Exascale. However, these trends are also increasing the complexity of design and development of the software developer environment that is deployed on modern supercomputers. Moreover, the scale and complexity of high-end systems creates a new set of challenges for application developers. 
Computational scientists are facing system characteristics that will significantly impact the programmability and scalability of applications. In order to address these issues, software architects need to take a holistic view of the entire system and deliver a high-level programming environment that can help maximize programmability, while not losing sight of performance portability. In this talk, I will discuss the current trends in computer architecture and their implications in application development and will present Cray’s high level parallel programming environment for performance and programmability on current and future supercomputers. I will also discuss some of the challenges and open research problems that need to be addressed in order to build a software developer environment for extreme-scale systems that helps users solve multi-disciplinary and multi-scale problems with high levels of performance, programmability, and scalability.\",\"PeriodicalId\":403406,\"journal\":{\"name\":\"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)\",\"volume\":\"27 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2019-05-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"0\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/ipdps.2019.00008\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/ipdps.2019.00008","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

Abstract

Abstract: In 2001, as early high-speed networks were deployed, George Gilder observed that “when the network is as fast as the computer's internal links, the machine disintegrates across the net into a set of special purpose appliances.” Two decades later, our networks are 1,000 times faster, our appliances are increasingly specialized, and our computer systems are indeed disintegrating. As hardware acceleration overcomes speed-of-light delays, time and space merge into a computing continuum. Familiar questions such as “where should I compute,” “for what workloads should I design computers,” and “where should I place my computers” seem to allow for a myriad of new answers that are exhilarating but also daunting. Are there concepts that can help guide us as we design applications and computer systems in a world that is untethered from familiar landmarks like center, cloud, and edge? I propose some ideas and report on experiments in coding the continuum.

Abstract: Parallel computers have come of age and need parallel software to justify their usefulness. There are two major avenues for getting programs to run in parallel: parallelizing compilers, and parallel languages and/or libraries. In this talk we present our latest results using both approaches and draw some conclusions about their relative effectiveness and potential. In the first part we introduce the Hybrid Analysis (HA) compiler framework, which seamlessly integrates static and run-time analysis of memory references into a single framework capable of fully automatic loop-level parallelization. Experimental results on 26 benchmarks show full-program speedups superior to those obtained by the Intel Fortran compilers; a schematic sketch of the run-time checking idea follows this paragraph.
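The sketch below is a minimal illustration of the hybrid static/run-time idea, not the HA framework itself (which operates inside the compiler, on Fortran benchmarks). The kernel and its offset parameter k are invented for this example: when static analysis cannot decide whether iterations are independent, the compiler can emit a cheap run-time test on the memory references and dispatch to a parallel or a sequential version of the loop.

```cpp
// Minimal sketch of hybrid static/run-time parallelization.
// Hypothetical kernel: a[i + k] = a[i] + b[i] for i in [0, n).
// Independence depends on the run-time value of k, so it cannot
// be proven at compile time. Caller guarantees a.size() >= n + k
// and b.size() >= n.
#include <cstddef>
#include <vector>

void kernel(std::vector<double>& a, const std::vector<double>& b,
            std::size_t k, std::size_t n) {
    if (k >= n) {
        // Run-time test passed: the write range a[k, k + n) and the
        // read range a[0, n) are disjoint, so every iteration is
        // independent and the loop is safe to run in parallel.
        #pragma omp parallel for
        for (std::ptrdiff_t i = 0; i < static_cast<std::ptrdiff_t>(n); ++i)
            a[i + k] = a[i] + b[i];
    } else {
        // Possible loop-carried dependence (a write at index i + k
        // may feed a later read at that index): keep the original
        // sequential loop.
        for (std::size_t i = 0; i < n; ++i)
            a[i + k] = a[i] + b[i];
    }
}
```

The real framework generates such tests automatically and aggregates them over whole loop nests; this sketch only shows the dispatch pattern (compile with -fopenmp to enable the parallel branch).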
In the second part of this talk we present the Standard Template Adaptive Parallel Library (STAPL) approach to parallelizing code. STAPL is a collection of generic data structures and algorithms that provides a high-productivity parallel programming infrastructure analogous to the C++ Standard Template Library (STL); a sketch of this STL-style programming model follows this paragraph. We provide an overview of the major STAPL components, with particular emphasis on graph algorithms, and then present scalability results for real codes on petascale machines such as the IBM BG/Q and Cray systems. Finally, we present some of our ideas for future work in this area.
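To make the STL analogy concrete, the following sketch uses the standard C++17 parallel algorithms over an ordinary std::vector. It deliberately does not use STAPL's own API: STAPL follows the same generic container/iterator/algorithm style shown here but extends it with distributed containers and a runtime that handles communication, which is what lets the model scale to machines like the BG/Q.

```cpp
// Shared-memory illustration of the STL-style generic programming
// model that STAPL extends to parallel and distributed machines.
// Standard C++17; with libstdc++, link against TBB (-ltbb).
#include <algorithm>
#include <execution>
#include <iostream>
#include <numeric>
#include <vector>

int main() {
    std::vector<double> v(1'000'000);
    std::iota(v.begin(), v.end(), 0.0);   // fill with 0, 1, 2, ...

    // Same generic algorithms as sequential STL code; only the
    // execution policy changes to request parallelism.
    std::for_each(std::execution::par, v.begin(), v.end(),
                  [](double& x) { x = x * x; });          // parallel map

    double sum = std::reduce(std::execution::par,
                             v.begin(), v.end(), 0.0);    // parallel reduction

    std::cout << "sum of squares = " << sum << '\n';
}
```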
Abstract: The trends in hardware architecture are paving the road towards exascale. However, these trends are also increasing the complexity of designing and developing the software environment deployed on modern supercomputers, and the scale and complexity of high-end systems create a new set of challenges for application developers. Computational scientists are facing system characteristics that will significantly impact the programmability and scalability of applications. To address these issues, software architects need to take a holistic view of the entire system and deliver a high-level programming environment that helps maximize programmability without losing sight of performance portability. In this talk, I will discuss current trends in computer architecture and their implications for application development, and will present Cray's high-level parallel programming environment for performance and programmability on current and future supercomputers. I will also discuss some of the challenges and open research problems that must be addressed to build a software development environment for extreme-scale systems, one that helps users solve multi-disciplinary and multi-scale problems with high levels of performance, programmability, and scalability.