27th IEEE International Conference on High Performance Computing, Data, and Analytics (HiPC 2020) Technical program
Pub Date: 2020-12-01 | DOI: 10.1109/hipc50609.2020.00013
Understanding HPC Application I/O Behavior Using System Level Statistics
Pub Date: 2020-12-01 | DOI: 10.1109/HiPC50609.2020.00034
A. Paul, Olaf Faaland, A. Moody, Elsa Gonsiorowski, K. Mohror, A. Butt
The processor performance of high performance computing (HPC) systems is increasing at a much higher rate than storage performance. This imbalance leads to I/O performance bottlenecks in massively parallel HPC applications. Therefore, there is a need for improvements in storage and file system designs to meet the ever-growing I/O needs of HPC applications. Storage and file system designers require a deep understanding of how HPC application I/O behavior affects current storage system installations in order to improve them. In this work, we contribute to this understanding using application-agnostic file system statistics gathered on compute nodes as well as on metadata and object storage file system servers. We analyze file system statistics of more than 4 million jobs over a period of three years on two systems at Lawrence Livermore National Laboratory that include a 15 PiB Lustre file system for storage. The results of our study add to the state of the art in I/O understanding by providing insight into how general HPC workloads affect the performance of large-scale storage systems. Some key observations in our study show that reads and writes are evenly distributed across the storage system; applications that perform I/O spread it across ∼78% of the minutes of their runtime on average; less than 22% of HPC users who submit write-intensive jobs perform efficient writes to the file system; and I/O contention seriously impacts I/O performance.
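As an illustration of the kind of per-job metric behind the ∼78% figure, the following minimal Python sketch computes the fraction of a job's runtime minutes that contain any I/O from a hypothetical per-minute counter trace; the data layout and function name are assumptions for illustration, not the paper's collection tooling.

```python
# Hypothetical sketch: given per-minute read/write byte counters sampled for one
# job (the kind of application-agnostic statistics the study aggregates from
# compute-node and server-side Lustre counters), compute the fraction of the
# job's runtime minutes in which it performed any I/O. The data layout here is
# invented for illustration; it is not the paper's collection format.

def io_minute_fraction(per_minute_bytes):
    """per_minute_bytes: list of (read_bytes, write_bytes) tuples, one per
    minute of the job's runtime. Returns the fraction of minutes with I/O."""
    if not per_minute_bytes:
        return 0.0
    active = sum(1 for r, w in per_minute_bytes if r > 0 or w > 0)
    return active / len(per_minute_bytes)

# Toy 10-minute job with I/O in 8 of its 10 minutes -> 0.8, close to the ~78%
# average reported across applications that perform I/O.
trace = [(4096, 0), (0, 0), (1024, 512), (0, 2048), (8192, 0),
         (0, 0), (512, 512), (2048, 0), (0, 1024), (4096, 4096)]
print(io_minute_fraction(trace))  # 0.8
```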
GPU-FPtuner: Mixed-precision Auto-tuning for Floating-point Applications on GPU
Pub Date: 2020-12-01 | DOI: 10.1109/HiPC50609.2020.00043
Ruidong Gu, M. Becchi
GPUs have been extensively used to accelerate scientific applications from a variety of domains: computational fluid dynamics, astronomy and astrophysics, climate modeling, and numerical analysis, to name a few. Many of these applications rely on floating-point arithmetic, which is approximate in nature. High-precision libraries have been proposed to mitigate accuracy issues due to the use of floating-point arithmetic. However, these libraries offer increased accuracy at a significant performance cost. Previous work, primarily focusing on CPU code and on standard IEEE floating-point data types, has explored mixed precision as a compromise between performance and accuracy. In this work, we propose a mixed-precision autotuner for GPU applications that rely on floating-point arithmetic. Our tool supports standard 32- and 64-bit floating-point arithmetic, as well as high precision through the QD library. Our autotuner relies on compiler analysis to reduce the size of the tuning space. In particular, our tuning strategy takes into account code patterns prone to error propagation and GPU-specific considerations to generate a tuning plan that balances performance and accuracy. Our autotuner pipeline, implemented using the ROSE compiler and Python scripts, is fully automated, and the code is available as open source. Our experimental results, collected on benchmark applications with various code complexities, show performance-accuracy tradeoffs for these applications and the effectiveness of our tool in identifying representative tuning points.
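To make the performance-versus-accuracy search concrete, here is a minimal, hypothetical Python/NumPy sketch of the underlying idea: enumerate precision assignments for a couple of intermediates, compare each result against a float64 baseline, and filter by an error bound. The kernel, tolerance, and exhaustive search strategy are invented for illustration and are far simpler than the compiler-guided, GPU-aware tuning the paper describes.

```python
# Toy illustration of the mixed-precision tuning trade-off (an exhaustive toy
# search, not GPU-FPtuner's compiler-guided approach): try float32/float64 for
# two intermediates of a small kernel and check which configurations stay
# within a relative-error bound of the float64 baseline.
import itertools
import numpy as np

def kernel(x, dtype_sum, dtype_prod):
    # A small reduction prone to round-off: sum of squares computed with
    # user-chosen precisions for the products and for the accumulator.
    prod = x.astype(dtype_prod) * x.astype(dtype_prod)
    return prod.astype(dtype_sum).sum(dtype=dtype_sum)

rng = np.random.default_rng(0)
x = rng.standard_normal(1_000_000)
baseline = kernel(x, np.float64, np.float64)

tol = 1e-6  # acceptable relative error, an assumption for this toy example
for dtype_sum, dtype_prod in itertools.product([np.float32, np.float64], repeat=2):
    result = kernel(x, dtype_sum, dtype_prod)
    rel_err = abs(result - baseline) / abs(baseline)
    status = "ok" if rel_err <= tol else "too inaccurate"
    print(f"sum={np.dtype(dtype_sum).name:8s} prod={np.dtype(dtype_prod).name:8s} "
          f"rel_err={rel_err:.2e} ({status})")
```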
{"title":"GPU-FPtuner: Mixed-precision Auto-tuning for Floating-point Applications on GPU","authors":"Ruidong Gu, M. Becchi","doi":"10.1109/HiPC50609.2020.00043","DOIUrl":"https://doi.org/10.1109/HiPC50609.2020.00043","url":null,"abstract":"GPUs have been extensively used to accelerate scientific applications from a variety of domains: computational fluid dynamics, astronomy and astrophysics, climate modeling, numerical analysis, to name a few. Many of these applications rely on floating-point arithmetic, which is approximate in nature. High-precision libraries have been proposed to mitigate accuracy issues due to the use of floating-point arithmetic. However, these libraries offer increased accuracy at a significant performance cost. Previous work, primarily focusing on CPU code and on standard IEEE floating-point data types, has explored mixed precision as a compromise between performance and accuracy. In this work, we propose a mixed precision autotuner for GPU applications that rely on floating-point arithmetic. Our tool supports standard 32- and 64-bit floating-point arithmetic, as well as high precision through the QD library. Our autotuner relies on compiler analysis to reduce the size of the tuning space. In particular, our tuning strategy takes into account code patterns prone to error propagation and GPU-specific considerations to generate a tuning plan that balances performance and accuracy. Our autotuner pipeline, implemented using the ROSE compiler and Python scripts, is fully automated and the code is available in open source. Our experimental results collected on benchmark applications with various code complexities show performance-accuracy tradeoffs for these applications and the effectiveness of our tool in identifying representative tuning points.","PeriodicalId":375004,"journal":{"name":"2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)","volume":"33 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"122985739","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
On the Marriage of Asynchronous Many Task Runtimes and Big Data: A Glance
Pub Date: 2020-12-01 | DOI: 10.1109/HiPC50609.2020.00037
Joshua D. Suetterlein, J. Manzano, A. Márquez, G. Gao
The rise of accelerator-based architectures and reconfigurable computing has showcased the weakness of software stack toolchains that still maintain a static view of the hardware instead of relying on a symbiotic relationship between static tools (e.g., compilers) and dynamic tools (e.g., runtimes). In the past decades, this need has given rise to adaptive runtimes with increasingly finer computational tasks. These finer tasks help take advantage of the hardware by switching out when a long-latency operation is encountered (because of deeper memory hierarchies and new memory technologies that may target streaming rather than random access), thus trading idle time for unrelated work. Examples of these finer-task runtimes are Asynchronous Many Task (AMT) runtimes, in which highly efficient computational graphs run on a variety of hardware. Due to their inherent latency-tolerant characteristics, these runtimes can be used effectively by latency-sensitive applications such as Graph Analytics and Big Data. This paper aims to present an example of how the careful design of an AMT can exploit the hardware substrate when faced with high-latency applications such as those found in the Big Data domain. Moreover, we aim to show the power of these runtimes, with their introspection and adaptive capabilities, when facing the changing requirements of application workloads. We use the Performance Open Community Runtime (P-OCR) as our vehicle to demonstrate the concepts presented here.
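The idea of trading idle time for unrelated work can be sketched in a few lines of Python: when one task hits a long-latency operation, hand it off and keep other ready work running instead of stalling. This is only an analogy to the finer-grained task switching an AMT runtime performs; the names, timings, and thread-pool mechanism below are assumptions for illustration and have nothing to do with P-OCR's implementation.

```python
# Minimal analogy (not P-OCR): start a long-latency operation, then run
# unrelated ready work while it completes, so total time is roughly
# max(latency, compute) rather than their sum.
import time
from concurrent.futures import ThreadPoolExecutor

def slow_fetch():
    time.sleep(1.0)          # stands in for a long-latency memory/storage access
    return list(range(1000))

def unrelated_work():
    return sum(i * i for i in range(1_000_000))   # some other ready task

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_fetch)   # kick off the long-latency operation
    partial = unrelated_work()         # overlap it with unrelated computation
    data = future.result()             # resume once the slow operation completes
print(f"overlapped total: {time.perf_counter() - start:.2f}s")
```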
Pipelined Preconditioned Conjugate Gradient Methods for Distributed Memory Systems
Pub Date: 2020-12-01 | DOI: 10.1109/HiPC50609.2020.00029
Manas Tiwari, Sathish S. Vadhiyar
The Preconditioned Conjugate Gradient (PCG) method has been one of the most widely used methods for solving sparse linear systems of equations. Pipelined PCG (PIPECG) attempts to eliminate the dependencies in the computations of the PCG algorithm and overlap non-dependent computations by reorganizing the traditional PCG code and using non-blocking allreduces. We have developed a novel pipelined PCG algorithm called PIPECG-OATI (One Allreduce per Two Iterations) that provides large overlap of global communication and computation at higher core counts in distributed memory CPU systems. Our method achieves this overlap by using iteration combination and by introducing new non-recurrence computations. We compare our method with other pipelined CG methods on a variety of problems and demonstrate that our method always gives the lowest runtimes. Our method gives up to 3x speedup over the PCG method and 1.73x speedup over the PIPECG method at large core counts.
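The central idea, hiding the latency of the global allreduce behind independent local work, can be sketched with mpi4py's non-blocking collectives. The snippet below shows the overlap pattern for the two inner products of a CG-like iteration; the matrix, vectors, and structure are invented for illustration, and this is not the PIPECG-OATI algorithm from the paper.

```python
# Illustrative overlap pattern behind pipelined CG (not PIPECG-OATI itself):
# start the non-blocking allreduce for locally computed dot products, do
# independent local work (a sparse matrix-vector product) while the reduction
# is in flight, then wait only when the global values are needed.
# Run with e.g.: mpiexec -n 4 python pipelined_overlap.py
import numpy as np
import scipy.sparse as sp
from mpi4py import MPI

comm = MPI.COMM_WORLD
n_local = 10_000
rng = np.random.default_rng(comm.Get_rank())

A_local = sp.random(n_local, n_local, density=1e-3, format="csr", random_state=0)
p = rng.standard_normal(n_local)
r = rng.standard_normal(n_local)

# Local contributions to the two inner products a CG iteration needs.
dots = np.array([r @ r, p @ r])

# Start the global reduction without blocking...
req = comm.Iallreduce(MPI.IN_PLACE, dots, op=MPI.SUM)

# ...and overlap it with independent local computation (the SpMV).
w_local = A_local @ p

# Only block when the globally reduced values are actually needed.
req.Wait()
rr_global, pr_global = dots
if comm.Get_rank() == 0:
    print("global r.r =", rr_global, " global p.r =", pr_global)
```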
{"title":"Pipelined Preconditioned Conjugate Gradient Methods for Distributed Memory Systems","authors":"Manas Tiwari, Sathish S. Vadhiyar","doi":"10.1109/HiPC50609.2020.00029","DOIUrl":"https://doi.org/10.1109/HiPC50609.2020.00029","url":null,"abstract":"Preconditioned Conjugate Gradient (PCG) method has been one of the widely used methods for solving linear systems of equations for sparse problems. Pipelined PCG (PIPECG) attempts to eliminate the dependencies in the computations in the PCG algorithm and overlap non-dependent computations by reorganizing the traditional PCG code and using non-blocking allreduces. We have developed a novel pipelined PCG algorithm called PIPECG-OATI (One Allreduce per Two Iterations) that provides large overlap of global communication and computations at higher number of cores in distributed memory CPU systems. Our method achieves this overlapping by using iteration combination and by introducing new non-recurrence computations. We compare our method with other pipelined CG methods on a variety of problems and demonstrate that our method always gives the least runtimes. Our method gives up to 3x speedup over PCG method and 1.73x speedup over PIPECG method at large number of cores.","PeriodicalId":375004,"journal":{"name":"2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)","volume":"40 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128452475","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
[Title page]
Pub Date: 2020-12-01 | DOI: 10.1109/hipc50609.2020.00002
Blink: Towards Efficient RDMA-based Communication Coroutines for Parallel Python Applications
Pub Date: 2020-12-01 | DOI: 10.1109/HiPC50609.2020.00025
A. Shafi, J. Hashmi, H. Subramoni, D. Panda
Python is emerging as a popular language in the data science community due to its ease of use, vibrant community, and rich set of libraries. Dask is a popular Python-based distributed computing framework that allows users to process large amounts of data on parallel hardware. The Dask distributed package is a non-blocking, asynchronous, and concurrent library that offers support for distributed execution of tasks in datacenter and HPC environments. A key requirement in designing high-performance communication backends for Dask distributed is to provide scalable support for coroutines, which, unlike regular Python functions, can only be invoked from asynchronous applications. In this paper, we present Blink—a high-performance communication library for Dask on high-performance RDMA networks like InfiniBand. Blink offers a multi-layered architecture that matches the communication requirements of Dask and exploits high-performance interconnects using a Cython wrapper layer to the C backend. We evaluate the performance of Blink against other counterparts using various micro-benchmarks and application kernels on three different cluster testbeds with varying interconnect speeds. Our micro-benchmark evaluation reveals that Blink outperforms other communication backends by more than 3× for message sizes ranging from 1 Byte to 64 KByte, and by a factor of 2× for message sizes ranging from 128 KByte to 8 MByte. Using various application-level evaluations, we demonstrate that Dask achieves up to 7% improvement in application throughput (e.g., total worker throughput).
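The coroutine requirement can be illustrated with a minimal stand-in: send and receive are exposed as awaitable coroutines, so a single event loop can keep several transfers in flight concurrently. The sketch below uses plain asyncio TCP in place of RDMA/InfiniBand; the names and message framing are invented and are not Blink's actual API.

```python
# Minimal stand-in for a coroutine-style communication interface: send/recv
# are awaitable, so the event loop can schedule other work while a transfer is
# in flight. Plain asyncio TCP substitutes here for an RDMA transport.
import asyncio

async def handle(reader, writer):
    header = await reader.readexactly(4)                   # 4-byte length prefix
    payload = await reader.readexactly(int.from_bytes(header, "big"))
    writer.write(header + payload)                         # echo it back
    await writer.drain()
    writer.close()
    await writer.wait_closed()

async def send_recv(msg: bytes) -> bytes:
    reader, writer = await asyncio.open_connection("127.0.0.1", 8765)
    writer.write(len(msg).to_bytes(4, "big") + msg)
    await writer.drain()                                   # suspends, does not block
    length = int.from_bytes(await reader.readexactly(4), "big")
    reply = await reader.readexactly(length)
    writer.close()
    await writer.wait_closed()
    return reply

async def main():
    server = await asyncio.start_server(handle, "127.0.0.1", 8765)
    async with server:
        # Several transfers progress concurrently on one event loop.
        replies = await asyncio.gather(*(send_recv(f"msg{i}".encode()) for i in range(4)))
        print([r.decode() for r in replies])

asyncio.run(main())
```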
{"title":"Blink: Towards Efficient RDMA-based Communication Coroutines for Parallel Python Applications","authors":"A. Shafi, J. Hashmi, H. Subramoni, D. Panda","doi":"10.1109/HiPC50609.2020.00025","DOIUrl":"https://doi.org/10.1109/HiPC50609.2020.00025","url":null,"abstract":"Python is emerging as a popular language in the data science community due to its ease-of-use, vibrant community, and rich set of libraries. Dask is a popular Python-based distributed computing framework that allows users to process large amounts of data on parallel hardware. The Dask distributed package is a non-blocking, asynchronous, and concurrent library that offers support for distributed execution of tasks on datacenter and HPC environments. A few key requirements of designing high-performance communication backends for Dask distributed is to provide scalable support for coroutines that are unlike regular Python functions and can only be invoked from asynchronous applications. In this paper, we present Blink—a high-performance communication library for Dask on high-performance RDMA networks like InfiniBand. Blink offers a multi-layered architecture that matches the communication requirements of Dask and exploits high-performance interconnects using a Cython wrapper layer to the C backend. We evaluate the performance of Blink against other counterparts using various micro-benchmarks and application kernels on three different cluster testbeds with varying interconnect speeds. Our micro-benchmark evaluation reveals that Blink outperforms other communication backends by more than 3× for message sizes ranging from 1 Byte to 64 KByte, and by a factor of 2× for message sizes ranging from 128 KByte to 8 MByte. Using various application-level evaluations, we demonstrate that Dask achieves up to 7% improvement in application throughput (e.g., total worker throughput).","PeriodicalId":375004,"journal":{"name":"2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)","volume":"48 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"124138887","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Parallel Hierarchical Clustering using Rank-Two Nonnegative Matrix Factorization
Pub Date: 2020-12-01 | DOI: 10.1109/HiPC50609.2020.00028
Lawton Manning, Grey Ballard, R. Kannan, Haesun Park
Nonnegative Matrix Factorization (NMF) is an effective tool for clustering nonnegative data, either for computing a flat partitioning of a dataset or for determining a hierarchy of similarity. In this paper, we propose a parallel algorithm for hierarchical clustering that uses a divide-and-conquer approach based on rank-two NMF to split a data set into two cohesive parts. Not only does this approach uncover more structure in the data than a flat NMF clustering, but rank-two NMF can also be computed more quickly than NMF for general ranks, providing comparable overall time to solution. Our data distribution and parallelization strategies are designed to maintain computational load balance throughout the data-dependent hierarchy of computation while limiting interprocess communication, allowing the algorithm to scale to large dense and sparse data sets. We demonstrate the scalability of our parallel algorithm in terms of data size (up to 800 GB) and number of processors (up to 80 nodes of the Summit supercomputer), applying the hierarchical clustering approach to hyperspectral imaging and image classification data. Our rank-two NMF algorithm scales perfectly to thousands of cores, and the entire hierarchical clustering method achieves 5.9x speedup when scaling from 10 to 80 nodes on the 800 GB dataset.
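A serial sketch of the divide-and-conquer idea is shown below using scikit-learn's NMF with two components: each split assigns rows to whichever of the two factors dominates, and recursion yields a binary hierarchy of clusters. The dataset, recursion depth, and minimum leaf size are assumptions for illustration; this is not the paper's parallel algorithm or its data distribution strategy.

```python
# Serial sketch of hierarchical clustering via recursive rank-two NMF:
# factor the current block with two components, split rows by the dominant
# factor, and recurse on each half until a depth or size limit is reached.
import numpy as np
from sklearn.decomposition import NMF

def hier_rank2_nmf(X, indices, depth, min_size=20):
    """Return a list of leaf clusters (index arrays) from recursive rank-2 NMF."""
    if depth == 0 or len(indices) < 2 * min_size:
        return [indices]
    W = NMF(n_components=2, init="nndsvda", max_iter=400,
            random_state=0).fit_transform(X[indices])
    left = indices[W[:, 0] >= W[:, 1]]    # rows where factor 0 dominates
    right = indices[W[:, 0] < W[:, 1]]    # rows where factor 1 dominates
    if len(left) == 0 or len(right) == 0:  # degenerate split: stop here
        return [indices]
    return (hier_rank2_nmf(X, left, depth - 1, min_size)
            + hier_rank2_nmf(X, right, depth - 1, min_size))

# Toy nonnegative data (a stand-in for, e.g., flattened hyperspectral pixels).
rng = np.random.default_rng(0)
X = np.abs(rng.standard_normal((500, 64)))
leaves = hier_rank2_nmf(X, np.arange(500), depth=3)
print([len(leaf) for leaf in leaves])   # sizes of up to 2^3 leaf clusters
```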
{"title":"Parallel Hierarchical Clustering using Rank-Two Nonnegative Matrix Factorization","authors":"Lawton Manning, Grey Ballard, R. Kannan, Haesun Park","doi":"10.1109/HiPC50609.2020.00028","DOIUrl":"https://doi.org/10.1109/HiPC50609.2020.00028","url":null,"abstract":"Nonnegative Matrix Factorization (NMF) is an effective tool for clustering nonnegative data, either for computing a flat partitioning of a dataset or for determining a hierarchy of similarity. In this paper, we propose a parallel algorithm for hierarchical clustering that uses a divide-and-conquer approach based on rank-two NMF to split a data set into two cohesive parts. Not only does this approach uncover more structure in the data than a flat NMF clustering, but also rank-two NMF can be computed more quickly than for general ranks, providing comparable overall time to solution. Our data distribution and parallelization strategies are designed to maintain computational load balance throughout the data-dependent hierarchy of computation while limiting interprocess communication, allowing the algorithm to scale to large dense and sparse data sets. We demonstrate the scalability of our parallel algorithm in terms of data size (up to 800 GB) and number of processors (up to 80 nodes of the Summit supercomputer), applying the hierarchical clustering approach to hyperspectral imaging and image classification data. Our algorithm for Rank-2 NMF scales perfectly on up to 1000s of cores and the entire hierarchical clustering method achieves 5.9x speedup scaling from 10 to 80 nodes on the 800 GB dataset.","PeriodicalId":375004,"journal":{"name":"2020 IEEE 27th International Conference on High Performance Computing, Data, and Analytics (HiPC)","volume":"1 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2020-12-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"130860892","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Message from the General Co-Chairs
Pub Date: 2020-12-01 | DOI: 10.1109/micro.2006.35
A. Luque, Yousef Ibrahim, J. J. Rodríguez
The main conference spans three days, from January 8 through January 10, and is adjoined by two days of workshops before and after the main conference days. The first conference day will begin with two keynote speeches from two research leaders from academia: Carla-Fabiana Chiasserini from Politecnico di Torino, Italy, and Gerhard P. Fettweis from TU-Dresden, Germany. The second day, January 9, will start with a newly introduced fireside chat with the two COMSNETS lifetime achievement awardees! On the second day, we will also have a distinguished banquet speaker in the evening: Rahul Mangharam from the University of Pennsylvania, USA. On the third day of the main conference, we will have two distinguished keynote speakers from industry: Sriram Rajamani from Microsoft Research, India, and Saravanan Radhakrishnan, CISCO, India.
HiPC 2020 Technical Program Committee
Pub Date: 2020-12-01 | DOI: 10.1109/hipc50609.2020.00008