Proceedings. Symposium on Computer Architecture and High Performance Computing: Latest Articles

A Parallel Algorithm for the Facility Location Problem Applied to Oil and Gas Logistics

Proceedings. Symposium on Computer Architecture and High Performance Computing

Pub Date : 2015-10-18 DOI: 10.1109/SBAC-PADW.2015.9

T. Pinheiro, M. D. Castro

One of the most relevant problems at large organizations is the choice of locations for establishing facilities, distribution centers or retail stores. This logistics issue involves a strategic decision that can significantly affect the effective cost of the product. Several papers tackle this issue, known as the Facility Location Problem. The objective of this paper is to analyze applicable heuristics previously developed by other authors and to define a mathematical formulation for the fuel distribution industry in Brazil. The work starts from an analysis of the upstream and downstream flows in practice in this segment and of how the respective transportation costs are formed, including taxes. We then propose the use of parallel programming techniques based on the Message Passing Interface (MPI) with the objective of reducing transportation costs within a reasonable execution time. Results show that this approach provides performance gains compared to serial execution.

Citations: 0

Efficient irregular wavefront propagation algorithms on Intel® Xeon Phi™

Proceedings. Symposium on Computer Architecture and High Performance Computing

Pub Date : 2015-10-01 DOI: 10.1109/SBAC-PAD.2015.13

Jeremias M Gomes, George Teodoro, Alba de Melo, Jun Kong, Tahsin Kurc, Joel H Saltz

We investigate the execution of the Irregular Wavefront Propagation Pattern (IWPP), a fundamental computing structure used in several image analysis operations, on the Intel® Xeon Phi™ co-processor. An efficient implementation of IWPP on the Xeon Phi is a challenging problem because of IWPP's irregularity and the use of atomic instructions in the original IWPP algorithm to resolve race conditions. On the Xeon Phi, the use of SIMD and vectorization instructions is critical to attain high performance. However, SIMD atomic instructions are not supported. Therefore, we propose a new IWPP algorithm that can take advantage of the supported SIMD instruction set. We also evaluate an alternate storage container (priority queue) to track active elements in the wavefront in an effort to improve the parallel algorithm efficiency. The new IWPP algorithm is evaluated with Morphological Reconstruction and Imfill operations as use cases. Our results show performance improvements of up to 5.63× on top of the original IWPP due to vectorization. Moreover, the new IWPP achieves speedups of 45.7× and 1.62×, respectively, as compared to efficient CPU and GPU implementations.
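
To make the wavefront pattern in this abstract concrete, the following is a minimal sequential sketch of queue-driven morphological reconstruction, one of the two use cases named above. It is not the authors' vectorized Xeon Phi algorithm: the function name `morph_reconstruct`, the list-of-lists image layout, and the 4-connected neighborhood are assumptions chosen for illustration.

```python
from collections import deque

def morph_reconstruct(marker, mask):
    """Grayscale reconstruction by dilation of `marker` under `mask`.

    Illustrative sequential sketch (4-connected grid); not the paper's code.
    """
    rows, cols = len(mask), len(mask[0])
    # Clamp the marker so it never exceeds the mask.
    out = [[min(marker[r][c], mask[r][c]) for c in range(cols)]
           for r in range(rows)]
    # The wavefront: every pixel starts as a potential propagation source.
    queue = deque((r, c) for r in range(rows) for c in range(cols))
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                # A neighbor can be raised to the source value, capped by the mask.
                candidate = min(out[r][c], mask[nr][nc])
                if out[nr][nc] < candidate:
                    # In a parallel version, this update is where races arise.
                    out[nr][nc] = candidate
                    queue.append((nr, nc))
    return out

print(morph_reconstruct([[3, 0, 0, 0]], [[3, 3, 1, 4]]))
```

The set of active pixels depends on the data and changes as propagation proceeds, which is the irregularity the abstract refers to. Swapping the `deque` for a priority queue would correspond to the alternate container the authors mention evaluating.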

Citations: 10

Fast LH

Proceedings. Symposium on Computer Architecture and High Performance Computing

Pub Date : 2013-10-23 DOI: 10.1109/SBAC-PAD.2013.15

Juan Chabkinian, Thomas J. E. Schwarz

Linear Hashing is a widely used and efficient version of extensible hashing. A distributed version of Linear Hashing is LH*, which stores key-indexed records on up to hundreds of thousands of sites in a distributed system. LH* implements the dictionary data structure efficiently since it does not use a central component for the key-based operations of insertion, deletion, actualization, and retrieval and for the scan operation. LH* allows a client or a server to commit an addressing error by sending a request to the wrong server. In this case, the server forwards the request to the correct server directly or in one more forward operation. We discuss here methods to avoid the double forward, which is rare but might breach quality of service guarantees. We compare our methods with LH* P2P, which pushes information about changes in the file structure to clients, whether they are active or not.
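
As background for the addressing errors described in this abstract, here is a minimal sketch of the classic Linear Hashing address computation and of how an outdated view of the file state yields a wrong address. The function name `lh_address`, its parameter names, and the integer keys are assumptions for illustration. The server-side forwarding rule and the paper's methods for avoiding double forwards are not shown.

```python
def lh_address(key: int, level: int, split_pointer: int,
               initial_buckets: int = 1) -> int:
    """Bucket address of `key` for a Linear Hashing file in state
    (level, split_pointer). Illustrative sketch, not the paper's code."""
    # h_level(key) = key mod (initial_buckets * 2^level)
    address = key % (initial_buckets * 2 ** level)
    if address < split_pointer:
        # This bucket was already split in the current round,
        # so the next-level hash function applies.
        address = key % (initial_buckets * 2 ** (level + 1))
    return address

# Actual file state: level 2, split pointer 1 (buckets 0..4 exist).
correct = lh_address(12, level=2, split_pointer=1)
# A client whose image of the file is outdated (level 0, pointer 0).
stale = lh_address(12, level=0, split_pointer=0)
print(correct, stale)  # the two differ, so the request must be forwarded
```

With the stale image the client sends key 12 to bucket 0, while the record actually lives in bucket 4. This is the kind of addressing error that a receiving server resolves by forwarding.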

Citations: 1

Mapping Pipelined Applications with Replication to Increase Throughput and Reliability

Proceedings. Symposium on Computer Architecture and High Performance Computing

Pub Date : 2010-10-27 DOI: 10.1109/SBAC-PAD.2010.16

A. Benoit, L. Marchal, Y. Robert, O. Sinnen

Mapping and scheduling an application onto the processors of a parallel system is a difficult problem. This is true when performance is the only objective, but becomes worse when a second optimization criterion like reliability is involved. In this paper we investigate the problem of mapping an application consisting of several consecutive stages, i.e., a pipeline, onto heterogeneous processors, while considering both the performance, measured as throughput, and the reliability. The mechanism of replication, which refers to the mapping of an application stage onto more than one processor, can be used to increase throughput but also to increase reliability. Finding the right replication trade-off plays a pivotal role for this bi-criteria optimization problem. Our formal model includes heterogeneous processors, in terms of both execution speed and reliability. We study the complexity of the various subproblems and show how a solution can be obtained for the polynomial cases. For the general NP-hard problem, heuristics are presented and experimentally evaluated. We further propose the design of an exact algorithm based on A* state space search which allows us to evaluate the performance of our heuristics for small problem instances.
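
The replication trade-off mentioned in this abstract can be illustrated with a toy calculation for a single stage. Everything in this sketch is an assumption made for illustration and is not taken from the paper's formal model: the function names, round-robin distribution paced by the slowest replica, independent processor failures, and waiting for the slowest replica in the reliability case.

```python
from math import prod

def replicate_for_throughput(work, speeds, fail_probs):
    """Replicas take data sets in round-robin order (assumed model).

    Returns (period, failure probability) for one stage.
    """
    # The slowest replica paces the group; k replicas divide the period by k.
    period = work / (len(speeds) * min(speeds))
    # Each replica handles some data sets, so any single failure breaks the stage.
    failure = 1 - prod(1 - f for f in fail_probs)
    return period, failure

def replicate_for_reliability(work, speeds, fail_probs):
    """Every replica processes every data set (assumed model)."""
    # Conservative assumption: the stage waits for the slowest replica.
    period = work / min(speeds)
    # The stage fails only if all replicas fail.
    failure = prod(fail_probs)
    return period, failure

print(replicate_for_throughput(12, [2, 3], [0.1, 0.2]))
print(replicate_for_reliability(12, [2, 3], [0.1, 0.2]))
```

Under these assumptions the same two processors either halve the period while raising the failure probability, or leave the period unchanged while lowering the failure probability sharply, which is the tension a bi-criteria mapping has to balance.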

Citations: 5

memu: Unifying Application Modeling and Cluster Exploitation

Proceedings. Symposium on Computer Architecture and High Performance Computing

Pub Date : 2004-01-01 DOI: 10.1109/CAHPC.2004.23

A. Alves, A. Pina, J. Exposto, J. Rufino

Citations: 1

On the Combined Scheduling of Malleable and Rigid Jobs

Proceedings. Symposium on Computer Architecture and High Performance Computing

Pub Date : 2004-01-01 DOI: 10.1109/CAHPC.2004.27

Jan Hungershöfer

Citations: 28
