
Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications: Latest Papers

A Performance Optimization Framework for the Simultaneous Heterogeneous Computing Platforms
S. Li
Heterogeneous computing platforms with a multicore host system and many-core accelerator devices have taken a major step forward in the mainstream HPC market this year with the announcement that the ProLiant XL250a server of the HP Apollo 6000 System features Intel® Xeon Phi™ coprocessors. Although many application developers attempt to use such platforms in the same way as GPGPU acceleration platforms, doing so forfeits the processing capability of the multicore host processors and introduces power inefficiency in business operations. In this paper, we propose an application optimization framework that turns sequential legacy applications into highly parallel applications that use the hardware resources of both the host CPU and the accelerator devices, enabling simultaneous heterogeneous computing. As a case study, we look at how to apply this framework and adopt a structured methodology to develop option pricing applications that take advantage of a heterogeneous computing environment.
DOI: 10.1145/2916026.2916029 (https://doi.org/10.1145/2916026.2916029). Published: 2016-05-31.
Citations: 0
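The abstract above describes running host and accelerator at the same time rather than offloading everything to the accelerator. A minimal sketch of that idea for a Monte Carlo European call price follows. It is not the paper's framework: the two threads only stand in for the host CPU and the coprocessor, the throughput figures `host_rate` and `accel_rate` are hypothetical inputs, and all function names are invented for illustration. Python threads do not run this arithmetic in parallel, so the sketch shows the work split and the merge step, not a real speedup.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor


def split_paths(total, host_rate, accel_rate):
    """Divide `total` paths in proportion to each device's (assumed) throughput."""
    host = round(total * host_rate / (host_rate + accel_rate))
    return host, total - host


def payoff_sum(n_paths, seed, s0, k, r, sigma, t):
    """Sum of call payoffs over n_paths simulated terminal prices."""
    rng = random.Random(seed)  # one generator per worker, so no shared state
    drift = (r - 0.5 * sigma ** 2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        st = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += max(st - k, 0.0)
    return total


def price_call(total_paths, host_rate, accel_rate,
               s0=100.0, k=100.0, r=0.05, sigma=0.2, t=1.0):
    """Price a European call with both 'devices' working on disjoint path sets."""
    host_n, accel_n = split_paths(total_paths, host_rate, accel_rate)
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Stand-ins: one task per device; a real port would dispatch the
        # second task to the coprocessor instead of another thread.
        host = pool.submit(payoff_sum, host_n, 1, s0, k, r, sigma, t)
        accel = pool.submit(payoff_sum, accel_n, 2, s0, k, r, sigma, t)
        total = host.result() + accel.result()
    return math.exp(-r * t) * total / total_paths
```

Splitting in proportion to measured throughput is one simple way to keep both devices busy until roughly the same moment; a production scheme would likely rebalance dynamically.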
Proceedings of the ACM Workshop on Software Engineering Methods for Parallel and High Performance Applications
Atul Kumar, S. Sarkar, M. Gerndt
It is our great pleasure to welcome you to the Workshop on Software Engineering Methods for Parallel and High Performance Applications (SEM4HPC 2016).

The workshop aims to discuss parallel computing beyond traditional scientific computing and its use in developing enterprise and industrial applications. Compared to the traditional sequential computing paradigm, the software development, analysis and migration tools for parallel and high performance applications are far less mature, which makes it difficult for the IT industry to shift towards the new computing paradigm. The mission of this workshop is to bring together global industry and academic experts in this area to identify the research challenges that exist in software engineering methods for parallel and high performance application development, maintenance and migration. The workshop also aims to bring out the current state of the art and practice of these software engineering methods through case studies, novel research ideas, and keynote and invited talks.

The call for papers attracted submissions from Germany, India, Spain, and the United States. We received eleven full technical papers, of which five were selected, an acceptance ratio of 45%.

We also encourage attendees to attend the keynote and invited talk presentations. These valuable and insightful talks can and will guide us to a better understanding of the challenges in this area:
Keynote: Challenges in Transition, Kazuaki Ishizaki (IBM Research -- Tokyo, Japan)
Invited Talk: The READEX project for Dynamic Energy Efficiency Tuning, Michael Gerndt (Technical University of Munich, Germany)
Invited Talk: Developer Productivity in HPC Application Development: An Overview of Recent Techniques, Santonu Sarkar (BITS Pilani -- Goa Campus, India)
DOI: 10.1145/2916026 (https://doi.org/10.1145/2916026). Published: 2016-05-31.
Citations: 0
LUT Optimization In Implementation Of Combinational Karatsuba Ofman On Virtex-6 FPGA
D. Kapoor, Rahul Yamasani, S. Saurav, Abhishek Bajpai
This paper discusses different approaches to optimizing the combinational logic used in multipliers for generic ECC (Elliptic Curve Cryptography) implementations in the Galois field GF(2^n). First, a combinational multiplier using Karatsuba Ofman logic with a 2×2 base multiplier is studied. Proper utilization of the Look Up Table (LUT) at the base level results in effective optimization of hardware resources. Hence, in order to optimize LUT utilization, designs for combinational logic with a 3×3 base and a 2×3 base are explored, keeping the LUT structure of the Virtex-6 FPGA in mind. Comparisons show that 3×3 base multipliers designed using the Karatsuba Ofman algorithm outperform 2×2 and 2×3 base multipliers in terms of resource utilization. To further maximize utilization of hardware resources, the exploration is extended to the Shift and Add Algorithm (SAA), which is found to remain the better choice for short operands. Algorithmic and platform-oriented optimization results in efficient hardware implementations. The final proposed design is a Hybrid Karatsuba Algorithm, which uses SAA at the lower level and Karatsuba Ofman logic at the higher level. Here again, a 3×3-bit multiplier in the SAA configuration is better than the other two. This approach is a step towards efficient implementations of fast algorithms in hardware-based applications, as this hybrid multiplier is found to use the fewest FPGA resources. All the operations in this paper have been performed on a Virtex-6 ML605 using XILINX 12.1 as the ESD tool.
DOI: 10.1145/2916026.2916030 (https://doi.org/10.1145/2916026.2916030). Published: 2016-05-31.
Citations: 1
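The hybrid scheme in the abstract above (Karatsuba Ofman splitting at the higher level, shift-and-add at the lower level) can be modelled in software to check the arithmetic. The sketch below is a behavioural model only, not the authors' hardware design: polynomials over GF(2) are stored as Python integers (bit i is the coefficient of x^i), addition is XOR, and the threshold `base_bits` (3 by default, echoing the 3×3 base) and both function names are assumptions made for illustration. Reduction modulo the field polynomial, which a GF(2^n) multiplier also needs, is left out.

```python
def shift_and_add_mul(a: int, b: int) -> int:
    """Carry-less product of two GF(2) polynomials via shift-and-add."""
    result = 0
    while b:
        if b & 1:          # coefficient of the current power of x is 1
            result ^= a    # add (XOR) the shifted multiplicand
        a <<= 1
        b >>= 1
    return result


def hybrid_karatsuba_mul(a: int, b: int, n: int, base_bits: int = 3) -> int:
    """Carry-less product of two n-bit polynomials.

    Operands of at most `base_bits` bits go to shift-and-add; larger ones
    are split in two and combined with three recursive products.
    """
    if n <= base_bits:
        return shift_and_add_mul(a, b)
    half = (n + 1) // 2
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    lo = hybrid_karatsuba_mul(a_lo, b_lo, half, base_bits)
    hi = hybrid_karatsuba_mul(a_hi, b_hi, n - half, base_bits)
    mid = hybrid_karatsuba_mul(a_lo ^ a_hi, b_lo ^ b_hi, half, base_bits)
    mid ^= lo ^ hi         # over GF(2), subtraction is also XOR
    return lo ^ (mid << half) ^ (hi << (2 * half))
```

For example, `hybrid_karatsuba_mul(0b11, 0b11, 2)` computes (x + 1)(x + 1) = x^2 + 1 over GF(2). Each split replaces four half-size products with three, which is where the saving in base multipliers, and hence LUTs, would come from.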
Session details: Afternoon Session 1
S. Sarkar
DOI: 10.1145/3248634 (https://doi.org/10.1145/3248634). Published: 2016-05-31.
Citations: 0
The READEX Project for Dynamic Energy Efficiency Tuning
M. Gerndt
High Performance Computing (HPC) systems consume a lot of energy. The overall energy consumption is one of the biggest challenges on the way towards exascale computers. Therefore, energy reduction techniques have to be applied on all levels from the basic chip technology up to the data center infrastructure. The READEX project explores the potential of dynamically switching application and system parameters, such as the clock frequency of the cores, to reduce the overall energy consumption of applications. An analysis is performed during application design time to precompute a tuning model that is then input to the runtime tuning library. This library switches the application and system configuration at runtime to adapt to varying application characteristics.
DOI: 10.1145/2916026.2916033 (https://doi.org/10.1145/2916026.2916033). Published: 2016-05-31.
Citations: 0
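The two phases named in the abstract above, a design-time analysis that precomputes a tuning model and a runtime library that switches configurations, can be pictured with a small sketch. This does not use or reproduce the READEX tools: the region names, the function `build_tuning_model`, the class `RuntimeTuner`, and the energy figures in the usage example are all hypothetical, and the only tuning parameter modelled is core clock frequency.

```python
def build_tuning_model(measurements):
    """Design-time step: pick the lowest-energy frequency for each region.

    `measurements` maps region name -> {frequency_mhz: energy_joules}.
    """
    return {
        region: min(by_freq, key=by_freq.get)
        for region, by_freq in measurements.items()
    }


class RuntimeTuner:
    """Runtime step: look up the precomputed setting when a region starts."""

    def __init__(self, model, default_freq):
        self.model = model
        self.default_freq = default_freq
        self.current_freq = default_freq

    def enter_region(self, region):
        # A real library would program the hardware here; this sketch only
        # records which frequency would be requested.
        self.current_freq = self.model.get(region, self.default_freq)
        return self.current_freq


# Illustrative (made-up) measurements: a compute-bound and an I/O-bound region.
example = {
    "solver": {1200: 50.0, 2400: 42.0},
    "io": {1200: 8.0, 2400: 11.0},
}
tuner = RuntimeTuner(build_tuning_model(example), default_freq=2400)
```

Calling `tuner.enter_region("io")` would select 1200 MHz under these made-up numbers, while an unknown region falls back to the default.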
Session details: Afternoon Session 2
M. Gerndt
DOI: 10.1145/3248635 (https://doi.org/10.1145/3248635). Published: 2016-05-31.
Citations: 0
Adaptive GPU Array Layout Auto-Tuning
Nicolas Weber, M. Goesele
Optimal performance is an important goal in compute-intensive applications. For GPU applications, this requires a lot of experience and knowledge about the algorithms and the underlying hardware, making them an ideal target for auto-tuning approaches. We present an auto-tuner that optimizes array layouts in CUDA applications. Depending on the data and program parameters, kernels can have varying optimal configurations. We thus adjust array layouts adaptively at runtime and achieve or even exceed the performance of hand-optimized code. We automatically detect data characteristics to identify different performance scenarios without user input or additional programming. We perform an empirical analysis of the application in order to construct our decision models. In principle, our adaptive optimization requires profiling data for an extremely large number of scenarios, which cannot be evaluated exhaustively for complex applications. We solve this by extending a previously published method that efficiently profiles single kernel calls, enhancing it to find application-wide optimal solutions. Our method is able to optimize applications in a few minutes, reaching speedups of up to 20% compared to hand-optimized code.
DOI: 10.1145/2916026.2916031 (https://doi.org/10.1145/2916026.2916031). Published: 2016-05-31.
Citations: 7
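To make "array layout" in the abstract above concrete, the sketch below contrasts two common layouts for an array of records, array-of-structures (AoS) and structure-of-arrays (SoA), and picks whichever profiled faster. It is an illustration under assumptions, not the authors' tuner: the abstract does not say which layouts are considered, the function names are invented, and the timings passed to `choose_layout` are hypothetical profiling results.

```python
def aos_offset(elem, field, n_elems, n_fields):
    """Flat index when all fields of one element are stored together."""
    return elem * n_fields + field


def soa_offset(elem, field, n_elems, n_fields):
    """Flat index when all elements of one field are stored together."""
    return field * n_elems + elem


def aos_to_soa(flat_aos, n_elems, n_fields):
    """Repack a flat AoS buffer into SoA order."""
    out = [None] * (n_elems * n_fields)
    for e in range(n_elems):
        for f in range(n_fields):
            src = aos_offset(e, f, n_elems, n_fields)
            out[soa_offset(e, f, n_elems, n_fields)] = flat_aos[src]
    return out


def choose_layout(timings_ms):
    """Return the layout name with the smallest measured kernel time."""
    return min(timings_ms, key=timings_ms.get)
```

With three fields per element, neighbouring elements of the same field are 3 slots apart in AoS but adjacent in SoA, which is why the preferred layout can change with the access pattern and why measuring per scenario is attractive.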
Implementing an Efficient Path Based Equivalence Checker for Parallel Programs
S. Bandyopadhyay, K. Banerjee
User-written programs that are transformed by optimizing and parallelizing compilers may become incorrect if the compiler is not trusted. Establishing the validity of these transformations is therefore a crucial and challenging task. For program verification, PRES+ (Petri net Representation of Embedded Systems) is now well accepted as a model for capturing the data and control flow of a program. This paper proposes an efficient path-based equivalence checking method that uses a simple PRES+ model (which is easier to generate from a program) to validate several optimizing and parallelizing transformations. The experimental results demonstrate the efficiency of the method.
DOI: 10.1145/2916026.2916027 (https://doi.org/10.1145/2916026.2916027). Published: 2016-05-31.
Citations: 2
Session details: Morning Session
Atul Kumar
DOI: 10.1145/3248633 (https://doi.org/10.1145/3248633). Published: 2016-05-31.
Citations: 0
Developer Productivity in HPC Application Development: An Overview of Recent Techniques
S. Sarkar
Increasing computing power and evolving hardware architectures have led to a change in programming paradigm from serial to parallel. Unlike its sequential counterpart, application building for High Performance Computing (HPC) is extremely challenging for developers. In order to improve programmer productivity, it is necessary to address challenges such as: i) How can the hardware and low-level complexities be abstracted to make programming easier? ii) What features should a design assistance tool have to simplify application development? iii) How should programming languages be enhanced for HPC? iv) What sort of prediction techniques can be developed to help programmers predict potential speedup? v) Can refactoring techniques solve the issue of parallelizing existing serial code? In this talk we attempt to present a landscape of the existing approaches that assist the software building process in HPC from a developer's point of view, and we highlight some important research questions. We also discuss the state of practice in the industry and some of the application-specific tools developed for HPC.
DOI: 10.1145/2916026.2916034 (https://doi.org/10.1145/2916026.2916034). Published: 2016-05-31.
Citations: 0