
2014 International Conference on High Performance Computing & Simulation (HPCS): latest articles

A fault-tolerant acoustic sensor network for monitoring underwater pipelines
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903782
N. Mohamed, Latifa Al-Muhairi, J. Al-Jaroodi, I. Jawhar
Underwater Acoustic Sensor Networks (UASNs) can be used to monitor long underwater pipeline structures for oil, gas, and water. In this case, a special type of UASN, the UASN-P (UASN for long pipelines), is used. One of the main challenges of using a UASN-P is the reliability of the connections among the nodes. Faults in a few contiguous nodes may create holes that divide the network into multiple disconnected segments. As a result, sensor nodes located between holes may not be able to deliver their sensed information, which negatively affects the network's sensing coverage. This paper provides an analysis of the different types of faults in UASN-P and studies their negative impact on the sensing coverage. We utilize Autonomous Underwater Vehicles (AUVs) and develop two models to overcome these faults and enhance coverage. The first model uses AUVs as mobile sensor nodes to cover the network holes, while the second model uses the AUVs to deliver and deploy fixed sensor nodes in the network holes to replace faulty nodes. In both models, the placed nodes can provide additional sensing coverage as well as enable connectivity among disconnected segments in the UASN-P. A strategy for the best allocation of a limited number of sensors or sensing vehicles is developed. In addition, an evaluation and comparison of the two models are provided.
Pages: 877-884
Citations: 7
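The UASN-P abstract above describes how contiguous faulty nodes open holes that cut a linear network into segments. The following is a minimal sketch of that idea, not the authors' model. It assumes a single sink placed just before node 0, equally spaced nodes, and a hypothetical `hop_reach` parameter giving how many node positions one acoustic hop can span; none of these details come from the paper.

```python
from typing import List, Tuple


def find_holes(alive: List[bool]) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) index ranges of contiguous faulty nodes."""
    holes = []
    start = None
    for i, ok in enumerate(alive):
        if not ok and start is None:
            start = i                      # a hole begins here
        elif ok and start is not None:
            holes.append((start, i - 1))   # the hole ended at the previous node
            start = None
    if start is not None:
        holes.append((start, len(alive) - 1))
    return holes


def disconnecting_holes(alive: List[bool], hop_reach: int) -> List[Tuple[int, int]]:
    """Holes too wide to be bridged by one hop (assumed rule: width >= hop_reach)."""
    return [(s, e) for s, e in find_holes(alive) if e - s + 1 >= hop_reach]


def nodes_reaching_sink(alive: List[bool], hop_reach: int) -> List[int]:
    """Indices of live nodes that can still relay data to the assumed sink."""
    last = -1  # position of the sink, just before node 0
    delivered = []
    for i, ok in enumerate(alive):
        if i - last > hop_reach:
            break                          # everything beyond this point is cut off
        if ok:
            delivered.append(i)
            last = i
    return delivered


if __name__ == "__main__":
    chain = [True, True, False, False, False, True, True]
    print(find_holes(chain))
    print(disconnecting_holes(chain, hop_reach=2))
    print(nodes_reaching_sink(chain, hop_reach=2))
```

Under these assumptions, the nodes that are alive but absent from `nodes_reaching_sink` are the ones whose sensed data is lost, which is the coverage loss the abstract proposes to repair with AUVs.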
A context-aware system for personalized and accessible pedestrian paths
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903776
S. Mirri, Catia Prandi, P. Salomoni
This work presents mPASS (mobile Pervasive Accessibility Social Sensing), a social and ubiquitous context-aware system that provides users with personalized and accessible pedestrian paths and maps. In order to collect a complete data set, our system gathers information from different sources: sensing, crowdsourcing and data produced by local authors and disability organizations. The gathered information is tailored to the user's needs and preferences on the basis of his/her context, defined by his/her location, his/her profile and the quality of data about the personalized path. To support the effectiveness of our approach, we have developed a prototype, which is described in this paper, together with some results of the context-based adaptation.
Pages: 833-840
Citations: 36
Analysis of classic algorithms on GPUs
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903670
Lin Ma, R. Chamberlain, Kunal Agrawal
The recently developed Threaded Many-core Memory (TMM) model provides a framework for analyzing algorithms for highly-threaded many-core machines such as GPUs. In particular, it tries to capture the fact that these machines hide memory latencies via the use of a large number of threads and large memory bandwidth. The TMM model analysis contains two components: computational complexity and memory complexity. A model is only useful if it can explain and predict empirical data. In this work, we investigate the effectiveness of the TMM model. We analyze algorithms for 5 classic problems (suffix tree/array for string matching, fast Fourier transform, merge sort, list ranking, and all-pairs shortest paths) under this model, and compare the results of the analysis with our own experimental findings and those of other researchers who have implemented and measured the performance of these algorithms on a spectrum of diverse GPUs. We find that the TMM model is able to predict important and sometimes previously unexplained trends and artifacts in the experimental data.
Pages: 65-73
Citations: 14
Efficient analysis methodology for huge application traces
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903791
D. Dosimont, Generoso Pagano, Guillaume Huard, Vania Marangozova-Martin, J. Vincent
The growing complexity of computer system hardware and software makes their behavior analysis a challenging task. In this context, tracing appears to be a promising solution as it provides relevant information about the system execution. However, trace analysis techniques and tools fail to give the analyst an efficient analysis flow, for several reasons. First, traces contain a huge volume of data that is difficult to store, load into memory, and work with. Second, the analysis flow is hindered by the varied and often incompatible result formats that different analysis techniques produce. Last, analysis frameworks lack an entry point for understanding the general behavior of the traced application. Indeed, traditional visualization techniques suffer from time and space scalability issues due to screen size, and are not able to represent the full trace. In this article, we present how to do an efficient analysis by using Shneiderman's mantra: “Overview first, zoom and filter, then details on demand”. Our methodology is based on FrameSoC, a trace management infrastructure that provides solutions for trace storage, data access, analysis flow, and the management of analysis results and tools. Ocelotl, a visualization tool, takes advantage of FrameSoC and shows a synthetic representation of a trace by using a time aggregation. This visualization solves scalability issues and provides an entry point for the analysis by showing phases and behavior disruptions, with the objective of getting more details by focusing on the interesting trace parts.
Pages: 951-958
Citations: 6
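The trace-analysis abstract above mentions a time aggregation that gives an overview of a trace too large to draw event by event. The abstract does not say how Ocelotl computes it, so the sketch below swaps in the simplest possible scheme: uniform time slices, each summarized by the fraction of the slice spent in every state. The `Event` tuple layout and the `aggregate` function are hypothetical names chosen for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Assumed event layout: (start_time, end_time, state_name)
Event = Tuple[float, float, str]


def aggregate(events: Iterable[Event], t0: float, t1: float,
              n_slices: int) -> List[Dict[str, float]]:
    """Summarize events over [t0, t1) as per-slice state occupancy fractions."""
    width = (t1 - t0) / n_slices
    slices = [defaultdict(float) for _ in range(n_slices)]
    for start, end, state in events:
        # Clip the event to the observed window.
        start, end = max(start, t0), min(end, t1)
        if end <= start:
            continue
        first = min(int((start - t0) // width), n_slices - 1)
        last = min(int((end - t0) // width), n_slices - 1)
        for k in range(first, last + 1):
            lo = t0 + k * width
            hi = lo + width
            overlap = min(end, hi) - max(start, lo)
            if overlap > 0:
                slices[k][state] += overlap / width
    return [dict(s) for s in slices]


if __name__ == "__main__":
    trace = [(0.0, 4.0, "compute"), (4.0, 5.0, "wait"), (5.0, 10.0, "compute")]
    for k, summary in enumerate(aggregate(trace, 0.0, 10.0, n_slices=2)):
        print(k, summary)
```

The number of slices can be tied to the available screen width, so the overview stays the same size however long the trace is; zooming then amounts to calling `aggregate` again on a narrower `[t0, t1)` window.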
A benchmark-based performance model for memory-bound HPC applications
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903790
B. Putigny, Brice Goglin, Denis Barthou
The increasing computation capability of servers comes with a dramatic increase in their complexity, through many cores, multiple levels of caches and NUMA architectures. Exploiting the computing power is increasingly hard, and programmers need ways to understand the performance behavior. We present an innovative approach for predicting the performance of memory-bound multi-threaded applications. It relies on micro-benchmarks and a compositional model that combines micro-benchmark measurements in order to model larger codes. Our memory model takes into account cache sizes and cache coherence protocols, which have a large impact on the performance of multi-threaded codes. Applying this model to real-world HPC kernels shows that it can predict their performance with good accuracy, helping to make optimization decisions that increase an application's performance.
Pages: 943-950
Citations: 15
An approach for scalable parallel execution of ant algorithms
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903683
F. Cicirelli, Agostino Forestiero, Andrea Giordano, C. Mastroianni
This paper presents an approach for the efficient parallel/distributed execution of ant algorithms, based on multi-agent systems. A very popular clustering problem, i.e., the spatial sorting of items belonging to a number of predefined classes, is taken as a use case. The approach consists of partitioning the problem space across a number of parallel nodes. Data consistency and conflict issues, which may arise when multiple agents concurrently access shared data, are transparently handled using a purposely developed notion of logical time. The developer remains in charge only of defining the behavior of the agents modeling the ants, without coping with issues related to parallel/distributed programming and performance optimization. Experimental results show that the approach is scalable and can be adopted to speed up the ant algorithm execution when the problem size is large, as may be the case in massive data analysis and clustering.
Pages: 170-177
Citations: 4
EMS@CNR: An Energy monitoring sensor network infrastructure for in-building location-based services
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903779
P. Barsocchi, E. Ferro, L. Fortunati, Fabio Mavilia, Filippo Palumbo
The increasing demand for building services and comfort levels, together with the rise in time spent inside buildings, ensures an upward trend in future energy demand. In this paper we present a long-term energy monitoring system called EMS@CNR that is able to measure the energy consumed by end users in office environments. The system has been tested by monitoring the power consumption of a testbed room in the CNR research area in Pisa. The proposed infrastructure stands as an enabling technology for future in-building location-based services. Preliminary results show the potential of EMS@CNR for long-term monitoring of users' working behavior.
Pages: 857-862
Citations: 20
Insertion of PETSc in the NEMO stack software driving NEMO towards exascale computing
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903761
L. D’Amore, A. Murli, V. Boccia, L. Carracciuolo
This paper addresses the scientific challenges related to high-level implementation strategies which steer the NEMO (Nucleus for European Modelling of the Ocean) code toward the effective exploitation of the opportunities offered by exascale systems. We consider, as case studies, two components of the NEMO ocean model (OPA-Ocean PArallelization): the Sea Surface Height equation solver and the Variational Data Assimilation module. The advantages arising from the insertion of consolidated scientific libraries in the NEMO code are highlighted: such advantages concern both the improvement of “software quality” (see software quality parameters such as robustness, portability, resilience, etc.) and the reduction of time spent on software development and maintenance. Finally, we consider the Shallow Water equations as a toy model for the NEMO ocean model to show how the use of PETSc objects predisposes the application to gain a good level of scalability and efficiency when the most suitable level of abstraction is used.
Pages: 724-731
Citations: 5
Forensic disk image indexing and search in an HPC environment
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903735
M. Bernaschi, Marco Cianfriglia, Antonio Di Marco, A. Sabellico, G. Me, Giancarlo Carbone, G. Totaro
We describe a solution for fast indexing and searching within large heterogeneous data sets whose main purpose is to support investigators who need to analyze forensic disk images originating from seizures or created from bodies of evidence. Our approach is based on a combination of techniques aimed at improving the efficiency and reliability of the indexing process. We do not rely on existing frameworks like Hadoop but borrow concepts from different contexts including High Performance Computing and Database management.
Pages: 558-565
Citations: 0
Managing the topology of heterogeneous cluster nodes with hardware locality (hwloc)
Pub Date : 2014-07-21 DOI: 10.1109/HPCSim.2014.6903671
Brice Goglin
Modern computing platforms are increasingly complex, with multiple cores, shared caches, and NUMA architectures. Parallel application developers have to take locality into account before they can expect good efficiency on these platforms. Thus, there is a strong need for a portable tool that gathers and exposes this information. The Hardware Locality project (hwloc) offers a tree representation of the hardware based on the inclusion and localities of the CPU and memory resources. It is already widely used for affinity-based task placement in high performance computing. In this article we present how hwloc is extended to describe more than computing and memory resources. Indeed, I/O device locality is becoming another important aspect of locality, since high-performance GPUs and network or InfiniBand interfaces have privileged access to some of the cores and memory banks. hwloc integrates this knowledge into its topology representation and offers an interoperability API to extend existing libraries such as CUDA with locality information. We also describe how hwloc now helps process managers and batch schedulers to deal with the topology of multiple cluster nodes, together with compression for better scalability up to thousands of nodes.
Citations: 36