
Latest publications: 2016 6th Workshop on Python for High-Performance and Scientific Computing (PyHPC)

ePython: An Implementation of Python for the Many-Core Epiphany Co-processor
Nick Brown
The Epiphany is a many-core, low-power, low on-chip-memory architecture, and one can very cheaply gain access to a number of parallel cores, which is beneficial for HPC education and prototyping. The very low power nature of these architectures also means that there is potential for their use in future HPC machines; however, there is a high barrier to entry in programming them due to the associated complexities and immaturity of supporting tools. In this paper we present our work on ePython, a subset of Python for the Epiphany and similar many-core co-processors. Due to the limited on-chip memory per core we have developed a new Python interpreter, and this, combined with additional support for parallelism, means that novices can take advantage of Python to very quickly write parallel codes on the Epiphany and explore concepts of HPC using a smaller-scale parallel machine. The high-level nature of Python opens up new possibilities on the Epiphany: we examine a computationally intensive Gauss-Seidel code from the programmability and performance perspective, discuss running Python hybrid on both the host CPU and the Epiphany, and cover interoperability between a full Python interpreter on the CPU and ePython on the Epiphany. The result of this work is support for developing Python on the Epiphany, applicable to other similar architectures, which the community has already started to adopt and use to explore concepts of parallelism and HPC.
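The abstract highlights a Gauss-Seidel code as its programmability and performance case study. As a minimal sketch of that kernel (deliberately plain Python, matching the paper's emphasis on novice-friendly code; this is not ePython's actual parallel API, which distributes such work across Epiphany cores):

```python
# Serial Gauss-Seidel iteration for solving Ax = b.
# Each sweep updates x[i] in place using the latest values of the
# other unknowns; for diagonally dominant A this converges.
def gauss_seidel(A, b, iterations=100):
    n = len(b)
    x = [0.0] * n
    for _ in range(iterations):
        for i in range(n):
            s = sum(A[i][j] * x[j] for j in range(n) if j != i)
            x[i] = (b[i] - s) / A[i][i]
    return x
```

In a parallel version each core would own a block of rows and exchange boundary values between sweeps, which is the pattern the paper explores on the Epiphany.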
Citations: 6
High-Performance Python-C++ Bindings with PyPy and Cling
W. Lavrijsen, Aditi Dutta
The use of Python as a high-level productivity language on top of high-performance libraries written in C++ requires efficient, highly functional, and easy-to-use cross-language bindings. C++ was standardized in 1998 and up until 2011 it saw only one minor revision. Since then, the pace of revisions has increased considerably, with a lot of improvements made to expressing semantic intent in interface definitions. For automatic Python-C++ bindings generators it is both the worst of times, as parsers need to keep up, and the best of times, as important information such as object ownership and thread safety can now be expressed. We present cppyy, which uses Cling, the Clang/LLVM-based C++ interpreter, to automatically generate Python-C++ bindings for PyPy. Cling provides dynamic access to a modern C++ parser and PyPy brings a full toolbox of dynamic optimizations for high performance. The use of Cling for parsing provides up-to-date C++ support now and in the foreseeable future. We show that with PyPy the overhead of calls to C++ functions from Python can be reduced by an order of magnitude compared to the equivalent in CPython, making it sufficiently low to be unmeasurable for all but the shortest C++ functions. Similarly, the cost of access to data in C++ is reduced by two orders of magnitude over access from CPython. Our approach requires no intermediate language, and more Pythonic presentations of the C++ libraries can be written in Python itself, with little performance cost due to inlining by PyPy. This allows for future dynamic optimizations to be fully transparent.
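The call overhead being measured here is the cost of crossing the Python/native boundary. cppyy generates such bindings automatically from C++ headers via Cling; as a stdlib-only illustration of the same boundary using hand-declared signatures (a sketch, assuming a standard C math library is loadable; the `libm.so.6` fallback is a Linux/glibc assumption), `ctypes` can be used:

```python
import ctypes
import ctypes.util
import math

# Load the C math library and declare sqrt's signature by hand.
# cppyy derives this information automatically from parsed headers,
# which is what makes its bindings both complete and low-overhead.
libm = ctypes.CDLL(ctypes.util.find_library("m") or "libm.so.6")
libm.sqrt.restype = ctypes.c_double
libm.sqrt.argtypes = [ctypes.c_double]

result = libm.sqrt(2.0)  # same value as math.sqrt(2.0)
```

Each such call pays boxing/unboxing and dispatch costs; the paper's result is that PyPy's JIT can inline the cppyy-generated equivalent until that overhead becomes unmeasurable.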
Citations: 21
Boosting Python Performance on Intel Processors: A Case Study of Optimizing Music Recognition
Yuanzhe Li, L. Schwiebert
We present a case study of optimizing a Python-based music recognition application on an Intel Haswell Xeon processor. With support from NumPy and SciPy, Python addresses the requirements of the music recognition problem with math library utilization and special structures for data access. However, a generically optimized Python application cannot fully utilize the latest high-performance multicore processors. In this study, we survey an existing open-source music recognition application, written in Python, to explore the effect of applying changes to the SciPy and NumPy libraries to achieve full processor resource occupancy and reduce code latency. Instead of comparing across many different architectures, we focus on Intel high-performance processors that have multiple cores and vector registers, and we attempt to preserve both user-friendliness and code scalability so that the revised library functions can be ported to other platforms and require no additional code changes.
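The starting point for this kind of optimization is expressing inner loops as single NumPy/SciPy calls so they run in vectorized compiled code. A hedged sketch (not the paper's actual application) of the pattern, using a cross-correlation-style matching score as the example kernel:

```python
import numpy as np

def match_score_loop(signal, template):
    # Naive Python loops: one dot product per alignment offset,
    # every multiply-add crossing the interpreter.
    n = len(signal) - len(template) + 1
    return [sum(signal[i + j] * template[j] for j in range(len(template)))
            for i in range(n)]

def match_score_vec(signal, template):
    # The same computation expressed as a single NumPy library call,
    # which runs in compiled, vectorized code.
    return np.correlate(signal, template, mode="valid")
```

Library-level changes like those studied in the paper then determine how well such calls exploit the processor's cores and vector registers.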
Citations: 1
Performance of MPI Codes Written in Python with NumPy and mpi4py
Ross Smith
Python is an interpreted language that has become more commonly used within HPC applications. Python benefits from the ability to write extension modules in C, which can in turn use optimized libraries written in other compiled languages. For HPC users, two of the most common extensions are NumPy and mpi4py. It is possible to write a full computational kernel in a compiled language and then build that kernel into an extension module. However, this process requires not only that the kernel be written in the compiled language, but also that the interface between the kernel and Python be implemented. If possible, it would be preferable to achieve similar performance by writing the code directly in Python using readily available performant modules. In this work the performance differences between compiled codes and codes written using Python3 and commonly available modules, most notably NumPy and mpi4py, are investigated. Additionally, the performance of an open source Python stack is compared to the recently announced Intel Python3 distribution.
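The core of the argument is that NumPy's compiled kernels can stand in for a hand-written C extension. A minimal sketch of the gap being measured (the MPI side is indicated only in a comment, since running it requires an MPI launcher; `comm` and the `Allreduce` pattern below follow mpi4py's documented API but are not executed here):

```python
import numpy as np

data = np.arange(100_000, dtype=np.float64)

# Pure-Python reduction: every element crosses the interpreter loop.
py_total = 0.0
for v in data:
    py_total += v

# NumPy reduction: one call into a compiled C kernel.
np_total = data.sum()

# With mpi4py, the per-rank result would then be combined across the
# communicator using the buffer-based (NumPy-aware) call, e.g.:
#   from mpi4py import MPI
#   comm = MPI.COMM_WORLD
#   comm.Allreduce(MPI.IN_PLACE, local_array, op=MPI.SUM)
```

The paper quantifies how close this NumPy + mpi4py style gets to a fully compiled implementation.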
Citations: 11
Dynamic Provisioning and Execution of HPC Workflows Using Python
Chris Harris, P. O’leary, M. Grauer, Aashish Chaudhary, Chris Kotfila, Robert M. O'Bara
High-performance computing (HPC) workflows over the last several decades have proven to assist in the understanding of scientific phenomena and the production of better products, more quickly, and at reduced cost. However, HPC workflows are difficult to implement and use for a variety of reasons. In this paper, we describe the development of the Python-based cumulus, which addresses many of these barriers. cumulus is a platform for the dynamic provisioning and execution of HPC workflows. cumulus provides the infrastructure needed to build applications that leverage traditional or Cloud-based HPC resources in their workflows. Finally, we demonstrate the use of cumulus in both web and desktop simulation applications, as well as in an Apache Spark-based analysis application.
Citations: 3
Mrs: High Performance MapReduce for Iterative and Asynchronous Algorithms in Python
Jeffrey Lund, C. Ashcraft, Andrew W. McNabb, Kevin Seppi
Mrs [1] is a lightweight Python-based MapReduce implementation designed to make MapReduce programs easy to write and quick to run, which is particularly useful for research and academia. A common class of algorithms that would benefit from Mrs is iterative algorithms, like those frequently found in machine learning; however, iterative algorithms typically perform poorly in the MapReduce framework, meaning potentially poor performance in Mrs as well. Therefore, we propose four modifications to the original Mrs with the intent of improving its ability to perform iterative algorithms. First, we use direct task-to-task communication for most iterations and only occasionally write to a distributed file system to preserve fault tolerance. Second, we combine the reduce and map tasks which span successive iterations to eliminate unnecessary communication and scheduling latency. Third, we propose a generator-callback programming model to allow for greater flexibility in the scheduling of tasks. Finally, some iterative algorithms are naturally expressed in terms of asynchronous message passing, so we propose a fully asynchronous variant of MapReduce. We then demonstrate Mrs' enhanced performance in the context of two iterative applications: particle swarm optimization (PSO) and expectation maximization (EM).
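The iterative pattern being optimized is a loop of map-group-reduce rounds whose output feeds the next round. A toy single-process sketch of that structure (illustrative only; it is not Mrs' actual API and elides distribution, fault tolerance, and the generator-callback scheduling the paper proposes):

```python
from collections import defaultdict

def mapreduce(records, mapper, reducer):
    # Group intermediate (key, value) pairs by key, then reduce each group.
    groups = defaultdict(list)
    for rec in records:
        for key, val in mapper(rec):
            groups[key].append(val)
    return {key: reducer(key, vals) for key, vals in groups.items()}

# One round of word counting; an iterative algorithm such as PSO or EM
# would feed each round's output back into the next round's mapper,
# which is where per-iteration scheduling latency becomes the bottleneck.
counts = mapreduce(
    ["to be or not to be"],
    mapper=lambda line: [(w, 1) for w in line.split()],
    reducer=lambda key, vals: sum(vals),
)
```

Mrs' modifications, such as fusing the reduce of one iteration with the map of the next, target exactly the hand-off between successive such rounds.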
Citations: 0
PALLADIO: A Parallel Framework for Robust Variable Selection in High-Dimensional Data
Matteo Barbieri, Samuele Fiorini, Federico Tomasi, A. Barla
The main goal of supervised data analytics is to model a target phenomenon given a limited amount of samples, each represented by an arbitrarily large number of variables. Especially when the number of variables is much larger than the number of available samples, variable selection is a key step, as it allows one to identify a possibly reduced subset of relevant variables describing the observed phenomenon. Obtaining interpretable and reliable results in this highly indeterminate scenario is often a non-trivial task. In this work we present PALLADIO, a framework designed for HPC cluster architectures that is able to provide robust variable selection in high-dimensional problems. PALLADIO is developed in Python and it integrates CUDA kernels to decrease the computational time needed for several independent element-wise operations. The scalability of the proposed framework is assessed on synthetic data of different sizes, which represent realistic scenarios.
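Robust selection of this kind typically means repeating selection over many random subsamples and keeping only variables chosen consistently, with the independent splits providing the parallelism. A hedged sketch of that idea (not PALLADIO's API; simple correlation screening stands in for the framework's learners, and the function name and parameters are illustrative):

```python
import numpy as np

def stable_selection(X, y, n_splits=50, top_k=2, threshold=0.8, seed=0):
    # On each random half of the samples, select the top_k features most
    # correlated with y; keep features chosen in >= threshold of splits.
    # The n_splits runs are independent, so a framework can farm them
    # out across cluster nodes.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    hits = np.zeros(p)
    for _ in range(n_splits):
        idx = rng.choice(n, size=n // 2, replace=False)
        corr = np.abs([np.corrcoef(X[idx, j], y[idx])[0, 1]
                       for j in range(p)])
        hits[np.argsort(corr)[-top_k:]] += 1
    return np.where(hits / n_splits >= threshold)[0]
```

Aggregating selection frequencies across splits is what makes the result robust in the n « p regime the abstract describes.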
Citations: 7
Migrating Legacy Fortran to Python While Retaining Fortran-Level Performance through Transpilation and Type Hints
Mateusz Bysiek, Aleksandr Drozd, S. Matsuoka
We propose a method of accelerating Python code by just-in-time compilation, leveraging the type hints mechanism introduced in Python 3.5. In our approach, performance-critical kernels are expected to be written as if Python were a strictly typed language, but without the need to extend Python syntax. This approach can be applied to any Python application; however, we focus on the special case in which legacy Fortran applications are automatically translated into Python for easier maintenance. We developed a framework implementing two-way transpilation and achieved performance equivalent to that of Python manually translated to Fortran, and better than other currently available JIT alternatives (up to 5x faster than Numba in some experiments).
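The key enabler is that Python 3.5 type hints are ordinary annotations available at runtime, so a transpiler or JIT can inspect them without any syntax extension. A stdlib-only sketch of that inspection step (the kernel name and the idea of emitting a typed version from the hints are illustrative; the paper's framework itself is not shown):

```python
from typing import get_type_hints

def saxpy(a: float, x: list, y: list) -> list:
    # Annotated kernel written "as if Python were strictly typed":
    # a transpiler can read these hints and emit an equivalent
    # statically typed (e.g. Fortran) loop.
    return [a * xi + yi for xi, yi in zip(x, y)]

# Runtime access to the annotations that drive specialization.
hints = get_type_hints(saxpy)
```

Because the hints are plain Python, the same file runs unmodified under the standard interpreter when no accelerator is present.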
Citations: 10
A New Architecture for Optimization Modeling Frameworks
Matt Wytock, Steven Diamond, Felix Heide, Stephen P. Boyd
We propose a new architecture for optimization modeling frameworks in which solvers are expressed as computation graphs in a framework like TensorFlow rather than as standalone programs built on a low-level linear algebra interface. Our new architecture makes it easy for modeling frameworks to support high performance computational platforms like GPUs and distributed clusters, as well as to generate solvers specialized to individual problems. Our approach is particularly well adapted to first-order and indirect optimization algorithms. We introduce cvxflow, an open-source convex optimization modeling framework in Python based on the ideas in this paper, and show that it outperforms the state of the art.
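A first-order solver of the kind this architecture targets is just one small computation graph (matmul, subtract, matmul, scaled update) applied repeatedly, which is why expressing it in a graph framework is natural. A minimal NumPy sketch of the idea for least squares (illustrative only; cvxflow's actual TensorFlow-based API is not shown, and the function name and step-size rule are assumptions):

```python
import numpy as np

def lstsq_gd(A, b, steps=500):
    # Gradient descent on ||Ax - b||^2. Each iteration re-applies the
    # same fixed dataflow, exactly the structure a computation-graph
    # framework can place on a GPU or distribute.
    x = np.zeros(A.shape[1])
    lr = 1.0 / np.linalg.norm(A, 2) ** 2  # step size from spectral norm
    for _ in range(steps):
        grad = A.T @ (A @ x - b)
        x = x - lr * grad
    return x
```

Swapping the graph's execution backend (CPU, GPU, cluster) then changes where the solver runs without changing how it is specified.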
Citations: 7
Devito: Towards a Generic Finite Difference DSL Using Symbolic Python
Michael Lange, Navjot Kukreja, M. Louboutin, F. Luporini, Felippe Vieira, Vincenzo Pandolfo, Paulius Velesko, Paulius Kazakas, G. Gorman
Domain-specific languages (DSLs) have been used in a variety of fields to express complex scientific problems in a concise manner and provide automated performance optimization for a range of computational architectures. As such, DSLs provide a powerful mechanism to speed up scientific Python computation that goes beyond traditional vectorization and pre-compilation approaches, while allowing domain scientists to build applications within the comforts of the Python software ecosystem. In this paper we present Devito, a new finite difference DSL that provides optimized stencil computation from high-level problem specifications based on symbolic Python expressions. We demonstrate Devito's symbolic API and performance advantages over traditional Python acceleration methods before highlighting its use in the scientific context of seismic inversion problems.
DOI: 10.1109/PYHPC.2016.9
Citations: 32
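The Devito abstract describes generating optimized stencil code from symbolic Python expressions. As a rough illustration of that underlying idea — this is a SymPy sketch, not Devito's actual API — the following builds a symbolic explicit-Euler update for the 1-D heat equation and lambdifies it into a vectorized NumPy kernel applied over a grid:

```python
import numpy as np
import sympy as sp

# Symbolic 1-D heat equation u_t = c * u_xx, discretized with a forward
# difference in time and a central difference in space.
u, u_left, u_right, c, dt, dx = sp.symbols("u u_left u_right c dt dx")
u_xx = (u_left - 2 * u + u_right) / dx**2
update = sp.simplify(u + dt * c * u_xx)   # explicit Euler update for one grid point

# "Compile" the symbolic stencil into a fast vectorized numeric kernel.
kernel = sp.lambdify((u, u_left, u_right, c, dt, dx), update, "numpy")

# Apply the generated kernel over the interior of a grid with fixed
# zero boundaries, starting from a unit heat spike in the middle.
grid = np.zeros(11)
grid[5] = 1.0
for _ in range(10):
    interior = kernel(grid[1:-1], grid[:-2], grid[2:], 1.0, 0.1, 1.0)
    grid = np.concatenate(([grid[0]], interior, [grid[-1]]))
```

After a few steps the spike diffuses symmetrically outward while total heat never increases (the chosen `dt*c/dx**2 = 0.1` satisfies the explicit-scheme stability bound of 0.5). A real DSL like Devito would additionally lower such symbolic updates to optimized, loop-tiled C kernels.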
Journal: 2016 6th Workshop on Python for High-Performance and Scientific Computing (PyHPC)