
Latest publications: 2015 International Conference on High Performance Computing & Simulation (HPCS)

Speeding-up the fault-tolerance analysis of interconnection networks
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237035
Diego F. Bermúdez Garzón, Crispín Gómez Requena, P. López, M. E. Gómez
Analyzing the fault-tolerance of interconnection networks implies checking the connectivity of each source-destination pair. The size of the exploration space of such an operation skyrockets with the network size and with the number of link faults. However, the problem is highly parallelizable, since the exploration of each path between a source-destination pair is independent of the other paths. This paper presents an approach to analyzing the fault-tolerance degree of multistage interconnection networks on GPUs in order to speed it up. The approach uses CUDA as the parallel programming tool in order to take advantage of all available GPU cores. Results show that the execution time of the fault-tolerance exploration can be significantly reduced.
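The abstract does not detail the GPU mapping; a minimal CUDA sketch of the decomposition it describes could assign one thread to each source-destination pair and run an independent reachability check over the surviving links. The toy ring topology and all names below are illustrative assumptions, not the authors' code.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

#define N 8  // toy network; real runs would use far larger multistage topologies

// Breadth-first reachability from src to dst over links that exist and are not faulty.
__device__ bool reachable(const char* adj, const char* faulty, int src, int dst) {
    bool visited[N] = {false};
    int queue[N], head = 0, tail = 0;
    queue[tail++] = src;
    visited[src] = true;
    while (head < tail) {
        int u = queue[head++];
        if (u == dst) return true;
        for (int v = 0; v < N; ++v)
            if (adj[u * N + v] && !faulty[u * N + v] && !visited[v]) {
                visited[v] = true;
                queue[tail++] = v;
            }
    }
    return false;
}

// One thread per source-destination pair: the independence the paper exploits.
__global__ void checkAllPairs(const char* adj, const char* faulty, int* broken) {
    int p = blockIdx.x * blockDim.x + threadIdx.x;
    if (p >= N * N) return;
    int src = p / N, dst = p % N;
    if (src != dst && !reachable(adj, faulty, src, dst))
        atomicAdd(broken, 1);  // this fault set disconnects at least one pair
}

int main() {
    char h_adj[N * N] = {0}, h_faulty[N * N] = {0};
    for (int i = 0; i < N; ++i) {              // bidirectional ring as a stand-in network
        h_adj[i * N + (i + 1) % N] = 1;
        h_adj[((i + 1) % N) * N + i] = 1;
    }
    h_faulty[0 * N + 1] = h_faulty[1 * N + 0] = 1;  // inject one link fault

    char *d_adj, *d_faulty; int *d_broken, h_broken = 0;
    cudaMalloc(&d_adj, N * N); cudaMalloc(&d_faulty, N * N); cudaMalloc(&d_broken, sizeof(int));
    cudaMemcpy(d_adj, h_adj, N * N, cudaMemcpyHostToDevice);
    cudaMemcpy(d_faulty, h_faulty, N * N, cudaMemcpyHostToDevice);
    cudaMemcpy(d_broken, &h_broken, sizeof(int), cudaMemcpyHostToDevice);

    checkAllPairs<<<(N * N + 127) / 128, 128>>>(d_adj, d_faulty, d_broken);
    cudaMemcpy(&h_broken, d_broken, sizeof(int), cudaMemcpyDeviceToHost);
    printf("disconnected pairs: %d\n", h_broken);  // 0 here: a ring survives one fault
    return 0;
}
```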
Citations: 0
A Reduced Complexity Instruction Set architecture for low cost embedded processors
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237068
Hanni B. Lozano, M. Ito
Implementing advanced DSP applications in software on low-power, low-cost embedded RISC processors is a challenging task because of ISA shortcomings that inhibit performance. An embedded CISC processor can potentially deliver higher performance, but not enough to meet the demands of complex DSP applications. We present a novel ISA that eliminates unnecessary overheads and speeds up embedded DSP applications on resource-constrained processors. Implementing the novel mixed ISA requires only minor modifications to the base architecture, which translate to less than a 5% increase in total power consumption. The novel ISA reduces the number of instructions used to implement a complex Fast Fourier Transform to less than half and speeds up processing threefold, leading to a substantial improvement in energy efficiency. Simulation results for a number of embedded benchmark programs show an average twofold increase in performance compared to a RISC processor.
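The abstract benchmarks the ISA on an FFT. For reference only (a generic textbook stage in plain C++ host code, not the authors' benchmark), a radix-2 decimation-in-time stage shows the butterfly pattern whose load/multiply/add/store sequences a conventional RISC ISA must spell out instruction by instruction, and which fused DSP-style instructions can compress:

```cuda
#include <complex>
#include <vector>

// One radix-2 decimation-in-time FFT stage. Each butterfly is a complex multiply
// followed by an add/subtract pair; the per-butterfly instruction count on a plain
// RISC ISA is what fused multiply-accumulate style extensions reduce.
void fftStage(std::vector<std::complex<float>>& x, int half) {
    const float pi = 3.14159265358979f;
    for (std::size_t base = 0; base + 2 * half <= x.size(); base += 2 * half)
        for (int k = 0; k < half; ++k) {
            std::complex<float> w = std::polar(1.0f, -pi * k / half);  // twiddle factor
            std::complex<float> t = w * x[base + k + half];
            x[base + k + half] = x[base + k] - t;   // lower butterfly output
            x[base + k]       += t;                 // upper butterfly output
        }
}

int main() {
    std::vector<std::complex<float>> x = {{1, 0}, {2, 0}, {3, 0}, {4, 0}};
    fftStage(x, 2);  // one stage over a single block of 4 samples
    return 0;
}
```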
Citations: 1
The user support programme and the training infrastructure of the EGI Federated Cloud
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237016
E. Fernández, Diego Scardaci, G. Sipos, D. Wallom, Yin Chen
The EGI Federated Cloud is a standards-based, open cloud system, together with its enabling technologies, that federates institutional clouds to offer a scalable computing platform for data- and/or compute-driven applications and services. It is based on open standards and open-source Cloud Management Frameworks and offers its users IaaS, PaaS and SaaS capabilities and interfaces tuned to the needs of users in research and education. The federation enables scientific data, workloads, simulations and services to span multiple administrative locations, allowing researchers and educators to access and exploit the distributed resources as an integrated system. The EGI Federated Cloud collaboration established a user support model and a training infrastructure to raise the visibility of this service within European scientific communities, with the overarching goal of increasing adoption and, ultimately, the usage of e-infrastructures for the benefit of the whole European Research Area. This paper describes these scalable user support and training infrastructure models. The training infrastructure is built on top of the production sites to reduce costs and increase its sustainability. Appropriate design solutions were implemented to reduce the security risks arising from the cohabitation of production and training resources on the same sites. The EGI Federated Cloud educational programme foresees different kinds of training events, from basic tutorials that spread knowledge of this new infrastructure to events devoted to specific scientific disciplines, teaching how to use tools already integrated in the infrastructure with the assistance of experts identified in the EGI community. The main success metric of this educational programme is the number of researchers willing to try the Federated Cloud, who are steered into the EGI world by the EGI Federated Cloud Support Team through a formal process that takes them from initial tests to fully exploiting the production resources.
Citations: 6
An efficient implementation of fuzzy edge detection using GPU in MATLAB
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237100
F. Hoseini, A. Shahbahrami
Edge detection is one of the most important concepts in image processing: it is used as an indicator for processing and extracting border characteristics at low levels, and for detecting and finding objects at high levels. Due to the inherently parallel nature of edge detection algorithms, they are well suited to implementation on a Graphics Processing Unit (GPU). The first part of this paper aims to detect and retouch image edges using a fuzzy inference system. In the first step, RGB images are converted to grayscale images. In the second step, the input images are converted from the uint8 class to the double class. In the third step, a fuzzy inference system with two inputs is defined, and its rules and membership functions are applied to these two inputs. Black output pixels indicate edge areas, and white output pixels indicate non-edge areas. In the second part of this paper, the performance of the fuzzy edge detection algorithm is improved on a GPU platform by exploiting data-level parallelism and the scatter/gather parallel communication pattern in the MATLAB environment. The experimental results show that performance is improved by up to 11.8x for different image sizes.
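The abstract outlines the pipeline (grayscale conversion, uint8-to-double conversion, a two-input fuzzy inference system, black = edge / white = no edge) without giving the rule base. One plausible reading — assumed here, not taken from the paper — uses the two image gradients as inputs, Gaussian "gradient is zero" membership functions, min as the fuzzy AND, and outputs white only where both gradients are near zero. A per-pixel CUDA kernel of that simplified system:

```cuda
#include <cuda_runtime.h>
#include <math.h>

// One thread per pixel: fuzzify the two gradient inputs and apply the rule
// "IF Ix is zero AND Iy is zero THEN white (no edge), ELSE black (edge)".
// gray holds the already-converted floating-point grayscale image in [0,1].
__global__ void fuzzyEdge(const float* gray, float* out, int w, int h, float sigma) {
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;

    // forward-difference gradients, clamped at the image border
    float ix = gray[y * w + min(x + 1, w - 1)] - gray[y * w + x];
    float iy = gray[min(y + 1, h - 1) * w + x] - gray[y * w + x];

    // Gaussian membership of each gradient in the fuzzy set "zero"
    float muZx = expf(-ix * ix / (2.0f * sigma * sigma));
    float muZy = expf(-iy * iy / (2.0f * sigma * sigma));

    // min() as fuzzy AND: 1.0 -> white (flat region), 0.0 -> black (edge)
    out[y * w + x] = fminf(muZx, muZy);
}
```

A 16x16 thread-block grid covering the image is a typical launch configuration; the abstract's uint8-to-double conversion corresponds to the normalized floating-point input assumed here.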
Citations: 5
Large Java arrays and their applications
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237077
Piotr Wendykier, B. Borucki, K. Nowinski
All current implementations of the Java Virtual Machine allow the creation of one-dimensional arrays of length smaller than 2^31 elements. In addition, since Java lacks true multidimensional arrays, most numerical libraries use one-dimensional arrays to store multidimensional data. With the current limitation, it is not possible to store volumes larger than 1290^3. On the other hand, the data from scientific simulations or medical scanners continuously grows in size, and it is not uncommon to go beyond that limit. This work addresses the problem of the maximal size of one-dimensional Java arrays. JLargeArrays is a Java library of one-dimensional arrays that can store up to 2^63 elements. A performance comparison with native Java arrays and the Fastutil library shows that JLargeArrays is the fastest solution overall. Possible applications in Java collections as well as numerical and visualization frameworks are also discussed.
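The sketch below (host-side C++, hypothetical names, not the JLargeArrays API or internals) illustrates the standard chunking trick behind any 64-bit-indexable array built on a 32-bit-limited backend: split the index into a chunk number and an offset so each backing allocation stays below the 2^31 limit.

```cuda
#include <cstdint>
#include <cstdio>
#include <vector>

// 64-bit-indexable array of doubles, stored as fixed-size chunks.
class LargeDoubleArray {
    static const int64_t kChunkBits = 20;  // small for the demo; ~27 would suit production
    static const int64_t kChunkSize = int64_t(1) << kChunkBits;
    std::vector<std::vector<double>> chunks;
    int64_t len;
public:
    explicit LargeDoubleArray(int64_t n) : len(n) {
        for (int64_t left = n; left > 0; left -= kChunkSize)
            chunks.emplace_back((std::size_t)(left < kChunkSize ? left : kChunkSize));
    }
    double& operator[](int64_t i) {        // split the 64-bit index: chunk, then offset
        return chunks[i >> kChunkBits][i & (kChunkSize - 1)];
    }
    int64_t length() const { return len; }
};

int main() {
    LargeDoubleArray a((int64_t(3) << 20) + 5);  // spans four chunks
    a[a.length() - 1] = 42.0;                    // lands in the short last chunk
    printf("last element: %f\n", a[a.length() - 1]);
    return 0;
}
```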
Citations: 1
Optimizing communications in multi-GPU Lattice Boltzmann simulations
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237021
E. Calore, D. Marchi, S. Schifano, R. Tripiccione
An increasingly large number of scientific applications run on large clusters based on GPU systems. In most cases, the large-scale parallelism of these applications uses MPI, widely recognized as the de facto standard for building parallel applications, while several programming languages are used to express the parallelism available within the application and map it onto the parallel resources of the GPUs. Regular grids and stencil codes are used in a subset of these applications, often corresponding to computational "Grand Challenges". One such class of applications is Lattice Boltzmann Methods (LB), used in computational fluid dynamics. The regular structure of LB algorithms makes them suitable for processor architectures with a large degree of parallelism, such as GPUs. Scalability of these applications on large clusters requires a careful design of processor-to-processor data communications, exploiting all possibilities to overlap communication and computation. This paper looks at these issues, taking as a use case a state-of-the-art two-dimensional LB model that accurately reproduces the thermo-hydrodynamics of a 2D fluid obeying the equation of state of a perfect gas. We study in detail the interplay between data organization, data layout, data-communication options, and the overlapping of communication and computation. We derive partial models of some performance features and compare them with experimental results for production-grade codes that we run on a large cluster of GPUs.
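The abstract names the key optimization — overlapping halo communication with bulk computation — without showing code. A generic sketch of that pattern follows (assumed structure and names, a one-dimensional domain split, a dummy lattice kernel; not the authors' production code): two CUDA streams let the MPI exchange of boundary sites proceed while the bulk kernel is still running.

```cuda
#include <mpi.h>
#include <cuda_runtime.h>

// Stand-in for a lattice update over sites [first, last); a real LB code would
// apply propagate/collide to a multi-population 2D lattice here.
__global__ void lbUpdate(float* f, int first, int last) {
    int i = first + blockIdx.x * blockDim.x + threadIdx.x;
    if (i < last) f[i] = 0.9f * f[i] + 0.1f;
}

static int blocksFor(int n) { return (n + 255) / 256; }

// One timestep on one rank: boundary sites on their own stream, bulk on another,
// halo exchange with neighbour ranks 'left'/'right' overlapped with the bulk kernel.
void step(float* d_f, float* h_send, float* h_recv, int nsites, int halo,
          int left, int right, cudaStream_t sBound, cudaStream_t sBulk) {
    // 1. update boundary sites first so their data is ready to ship early
    lbUpdate<<<blocksFor(halo), 256, 0, sBound>>>(d_f, 0, halo);
    lbUpdate<<<blocksFor(halo), 256, 0, sBound>>>(d_f, nsites - halo, nsites);
    // 2. bulk update runs concurrently on the second stream
    lbUpdate<<<blocksFor(nsites - 2 * halo), 256, 0, sBulk>>>(d_f, halo, nsites - halo);

    // 3. stage boundaries into (ideally pinned) host buffers; a CUDA-aware MPI
    //    could instead hand device pointers straight to MPI and skip this step
    size_t bytes = halo * sizeof(float);
    cudaMemcpyAsync(h_send, d_f, bytes, cudaMemcpyDeviceToHost, sBound);
    cudaMemcpyAsync(h_send + halo, d_f + (nsites - halo), bytes,
                    cudaMemcpyDeviceToHost, sBound);
    cudaStreamSynchronize(sBound);

    // 4. halo exchange overlaps with the still-running bulk kernel
    MPI_Request req[4];
    MPI_Irecv(h_recv,        halo, MPI_FLOAT, left,  0, MPI_COMM_WORLD, &req[0]);
    MPI_Irecv(h_recv + halo, halo, MPI_FLOAT, right, 1, MPI_COMM_WORLD, &req[1]);
    MPI_Isend(h_send + halo, halo, MPI_FLOAT, right, 0, MPI_COMM_WORLD, &req[2]);
    MPI_Isend(h_send,        halo, MPI_FLOAT, left,  1, MPI_COMM_WORLD, &req[3]);
    MPI_Waitall(4, req, MPI_STATUSES_IGNORE);

    cudaStreamSynchronize(sBulk);  // join before the next timestep uses the halos
}
```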
Citations: 9
A workflow-enabled big data analytics software stack for escience
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237088
Cosimo Palazzo, Andrea Mariello, S. Fiore, Alessandro D'Anca, D. Elia, Dean N. Williams, G. Aloisio
The availability of systems able to process and analyse large amounts of data has boosted scientific advances in several fields. Workflows provide an effective tool to define and manage large sets of processing tasks. In the big data analytics area, the Ophidia project provides a cross-domain big data analytics framework for the analysis of scientific, multi-dimensional datasets. The framework exploits a server-side, declarative, parallel approach for data analysis and mining. It also features a complete workflow management system to support the execution of complex scientific data analyses, schedule task submission, manage operator dependencies and monitor job execution. The workflow management engine allows users to perform a coordinated execution of multiple data analytics operators (both single and massive parameter sweeps) in an effective manner. For the definition of big data analytics workflows, a JSON schema has been designed and implemented. To aid the definition of workflows, a visual design language consisting of several symbols, named the Data Analytics Workflow Modelling Language (DAWML), has also been defined.
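As an illustration of what the engine's dependency handling amounts to (a generic host-side C++ sketch, not Ophidia's engine or its JSON schema; the operator names are hypothetical), a task becomes ready only once every task it depends on has completed — Kahn's topological ordering:

```cuda
#include <cstdio>
#include <map>
#include <queue>
#include <string>
#include <vector>

int main() {
    // task -> tasks that must complete first (hypothetical operator names)
    std::map<std::string, std::vector<std::string>> deps = {
        {"import", {}}, {"subset", {"import"}},
        {"reduce", {"subset"}}, {"export", {"reduce"}}};

    std::map<std::string, int> pending;  // unmet-dependency count per task
    std::map<std::string, std::vector<std::string>> dependents;
    std::queue<std::string> ready;
    for (const auto& kv : deps) {
        pending[kv.first] = (int)kv.second.size();
        if (kv.second.empty()) ready.push(kv.first);
        for (const auto& r : kv.second) dependents[r].push_back(kv.first);
    }
    while (!ready.empty()) {                        // submit in dependency order
        std::string t = ready.front(); ready.pop();
        std::printf("submitting %s\n", t.c_str());  // a real engine launches jobs here
        for (const auto& d : dependents[t])         // completion releases dependents
            if (--pending[d] == 0) ready.push(d);
    }
    return 0;
}
```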
Citations: 12
Identifying patterns towards Algorithm Based Fault Tolerance
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237083
U. Kabir, D. Goswami
The checkpoint and recovery cost imposed by coordinated checkpoint/restart (CCP/R) is a crucial performance issue for high performance computing (HPC) applications. In comparison, Algorithm Based Fault Tolerance (ABFT) is a promising fault tolerance method with low recovery overhead, but it suffers from a lack of universal applicability and of user transparency. In this paper we address the overhead problem of CCP/R and some of the limitations of ABFT, and propose a solution for ABFT based on algorithmic patterns. The proposed solution is a generic fault tolerance strategy for a group of applications that exhibit similar algorithmic (structural and behavioral) features. These features, together with the minimal fault recovery data (critical data), determine the fault tolerance strategy for the group of applications. We call this strategy a fault tolerance pattern (FTP). We demonstrate the idea of FTP with parallel iterative deepening A* (PIDA*) search, a generic search algorithm used to solve a wide range of discrete optimization problems (DOP). Theoretical analysis shows that our proposed solution performs better than CCP/R in terms of checkpoint and recovery time overhead. Furthermore, using FTP helps in the separation of concerns, which facilitates user transparency.
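The FTP patterns themselves are beyond the abstract, but the general ABFT mechanism they build on is easy to show: encode redundancy into the data before computing, then use the invariant to detect, locate, and repair a corrupted value without rolling back to a checkpoint. A minimal host-side C++ sketch of the classic weighted-checksum scheme (an illustration of ABFT in general, not the paper's patterns):

```cuda
#include <cmath>
#include <cstdio>
#include <vector>

int main() {
    std::vector<double> v = {3, 1, 4, 1, 5, 9, 2, 6};

    // encode: plain and index-weighted checksums carried alongside the data
    double s = 0, ws = 0;
    for (std::size_t i = 0; i < v.size(); ++i) { s += v[i]; ws += (i + 1) * v[i]; }

    v[5] += 7.0;  // simulated soft error during an unreliable computation phase

    // verify the invariants after the computation
    double s2 = 0, ws2 = 0;
    for (std::size_t i = 0; i < v.size(); ++i) { s2 += v[i]; ws2 += (i + 1) * v[i]; }

    double d = s2 - s, wd = ws2 - ws;
    if (std::fabs(d) > 1e-9) {
        std::size_t faulty = (std::size_t)std::llround(wd / d) - 1;  // locate the error
        v[faulty] -= d;                                              // correct in place
        std::printf("repaired element %zu without rollback\n", faulty);
    }
    return 0;
}
```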
Citations: 7
Performance evaluation and improvement in cloud computing environment
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237109
Omar Khedher, M. Jarraya
Cloud computing covers a wide range of applications, including online services for the end user. It has become the new trend for most organizations in handling their business IT units. The services provided are becoming flexible, because the resources and processing power available to each can be adjusted on the fly to meet changes in need [6]. However, infrastructures deployed in a cloud computing environment may incur significant performance penalties for demanding computing workloads. In our doctoral research, we aim to study, analyze, evaluate and improve performance in cloud computing environments according to different criteria. To achieve the thesis objectives, the research is based on quantitative analysis of repeatable empirical experiments.
Citations: 3
A resilient routing approach for Mobile Ad Hoc Networks
Pub Date : 2015-07-20 DOI: 10.1109/HPCSim.2015.7237102
Ming-Yang Su, Chih-Wei Yang
This paper presents a resilient routing algorithm that is better suited than traditional routing algorithms, such as Ad-hoc On-demand Distance Vector (AODV) routing, to a Mobile Ad Hoc Network (MANET) with fast-moving or sparsely distributed nodes. Since AODV routing in MANETs is known for its advantageous properties, the proposed routing algorithm is based on AODV and is called RAODV (Resilient AODV). It differs from AODV in the route discovery phase: AODV establishes only one routing path from a source node to the destination node, whereas RAODV establishes as many routes as possible. Thus, when the primary route breaks, a node can immediately adopt an alternative route without a further route search. If no alternative route exists, the node transmits the route-break information backward to instruct the previous node on the reverse route to select an alternative, and so on. The proposed RAODV can reduce the number of route rediscovery procedures, and thus improve the packet loss rate and transmission delay, especially in sparse MANETs. The ns2 simulator was used to evaluate the performance of RAODV. In some cases, the proposed RAODV was able to reduce the packet loss rate by 72.61% compared to traditional AODV.
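A host-side C++ sketch of the behaviour the abstract describes (illustrative names and structure, not the authors' ns2 implementation): keep every next hop learned during route discovery and fall back locally when the primary link breaks, reporting upstream only when no alternative is left.

```cuda
#include <cstdio>
#include <map>
#include <vector>

struct RouteTable {
    std::map<int, std::vector<int>> nextHops;  // destination -> candidate next hops

    void learn(int dst, int hop) { nextHops[dst].push_back(hop); }

    // Returns a usable next hop for dst, discarding broken ones. Returns -1 when no
    // alternative is left: the node then notifies the previous node on the reverse
    // route, which repeats the same local fallback.
    int forward(int dst, bool (*up)(int)) {
        std::vector<int>& hops = nextHops[dst];
        while (!hops.empty() && !up(hops.front()))
            hops.erase(hops.begin());          // primary broke: promote an alternative
        return hops.empty() ? -1 : hops.front();
    }
};

static bool linkUp(int hop) { return hop != 7; }  // pretend the link to node 7 is down

int main() {
    RouteTable rt;
    rt.learn(42, 7);  // primary next hop toward destination 42 (now unreachable)
    rt.learn(42, 3);  // alternative found during the same route discovery
    std::printf("next hop to 42: %d\n", rt.forward(42, linkUp));  // falls back to 3
    return 0;
}
```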
Citations: 7