Introduction to RAW 2021
Pub Date: 2021-06-01  DOI: 10.1109/ipdpsw52791.2021.00020
2021 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)
Performance Study of Multi-tenant Cloud FPGAs
Joel Mandebi Mbongue, S. Saha, C. Bobda
Pub Date: 2021-06-01  DOI: 10.1109/IPDPSW52791.2021.00032
Cloud deployments increasingly provision FPGA accelerators as part of virtual instances. While commercial clouds still essentially expose single-tenant FPGAs to users, the growing demand for hardware acceleration calls for architectures that support FPGA multi-tenancy. In this work, we explore the trade-off between hardware consolidation and performance. Experiments show that FPGA multi-tenancy increases hardware utilization while degrading I/O performance on the order of microseconds compared to the single-tenant model. They also demonstrate that implementing on-chip communication between the hardware workloads of a cloud user significantly reduces the overall communication overhead.
Evaluating the Performance of Integer Sum Reduction on an Intel GPU
Zheming Jin, J. Vetter
Pub Date: 2021-06-01  DOI: 10.1109/IPDPSW52791.2021.00099
Sum reduction is a primitive operation in parallel computing, and SYCL is a promising heterogeneous programming language. In this paper, we describe SYCL implementations of integer sum reduction using atomic functions, shared local memory, vectorized memory accesses, and parameterized workload sizes. Evaluating the reduction kernels shows that we can achieve a 1.4X speedup over open-source implementations of sum reduction for a sufficiently large number of integers on an Intel integrated GPU.
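The kernel structure this abstract describes can be sketched in plain Python as a hypothetical simulation (not the authors' SYCL code): each simulated work-group performs a tree reduction over its chunk, as it would in shared local memory, and then contributes a single atomic-style add to the global accumulator.

```python
def reduce_sum(data, group_size=256):
    """Simulate a two-stage GPU sum reduction: per-group tree
    reduction, then one global add per group (an atomic_fetch_add
    in a real SYCL kernel)."""
    total = 0  # global accumulator (atomic in a real kernel)
    for start in range(0, len(data), group_size):
        local = list(data[start:start + group_size])
        stride = len(local)
        # Tree reduction: fold the upper half onto the lower half
        # until one partial sum remains for this "work-group".
        while stride > 1:
            half = (stride + 1) // 2
            for i in range(stride - half):
                local[i] += local[i + half]
            stride = half
        total += local[0] if local else 0
    return total
```

The tree shape matters on a GPU because the `stride - half` adds within one step are independent and can run in parallel across work-items; the sequential Python loop only models the data flow.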
RRNS Base Extension Error-Correcting Code for Performance Optimization of Scalable Reliable Distributed Cloud Data Storage
M. Babenko, A. Tchernykh, Luis Bernardo Pulido-Gaytan, J. M. Cortés-Mendoza, Egor Shiryaev, E. Golimblevskaia, A. Avetisyan, S. Nesmachnow
Pub Date: 2021-06-01  DOI: 10.1109/IPDPSW52791.2021.00087
Ensuring reliable data storage in a cloud environment is a challenging problem. One of the efficient mechanisms used to solve it is the Redundant Residue Number System (RRNS) with the projection method, a commonly used mechanism for detecting errors. However, error correction based on the projection method has exponential complexity in the number of control and working moduli. In this paper, we propose an optimization mechanism that uses a base extension and the Hamming distance to reduce the number of calculated projections. We show that this number can be reduced by up to a factor of three compared to the classical projection method, and hence so can the time complexity of data recovery in distributed cloud data storage.
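The classical projection method that the paper optimizes can be illustrated with a small Python sketch (a baseline illustration under assumed toy moduli, not the authors' optimized scheme): a value is stored as residues modulo working plus control moduli, and a corrupted residue is located by dropping one residue at a time and reconstructing from the rest.

```python
from math import prod

def crt(residues, moduli):
    # Chinese Remainder Theorem: reconstruct x from its residues
    # modulo pairwise-coprime moduli.
    M = prod(moduli)
    x = sum(r * (M // m) * pow(M // m, -1, m)
            for r, m in zip(residues, moduli))
    return x % M

def correct_single_error(residues, moduli, n_working):
    # Legitimate values lie below the product of the working moduli;
    # the control moduli only add redundancy.
    legit = prod(moduli[:n_working])
    x = crt(residues, moduli)
    if x < legit:
        return x  # consistent: no error detected
    # Classical projection method: drop one residue at a time and
    # reconstruct from the rest; the projection that lands back in
    # the legitimate range pinpoints and corrects the bad residue.
    for i in range(len(moduli)):
        proj = crt(residues[:i] + residues[i + 1:],
                   moduli[:i] + moduli[i + 1:])
        if proj < legit:
            return proj
    raise ValueError("more errors than the code can correct")
```

The exponential cost the abstract mentions comes from generalizing this: correcting multiple errors means reconstructing projections for every subset of dropped moduli, which the proposed base-extension and Hamming-distance mechanism prunes.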
Checkpointing vs. Supervision Resilience Approaches for Dynamic Independent Tasks
Jonas Posner, Lukas Reitz, Claudia Fohry
Pub Date: 2021-06-01  DOI: 10.1109/IPDPSW52791.2021.00089
With the advent of exascale computing, issues such as application irregularity and permanent hardware failure are growing in importance. Irregularity is often addressed by task-based parallel programming coupled with work stealing. At the task level, resilience can be provided by two principal approaches: checkpointing and supervision. Particular algorithms have recently been worked out for both; they perform local recovery and continue program execution on a reduced set of resources. The checkpointing algorithms regularly save task descriptors explicitly, while the supervision algorithms exploit the natural duplication of task descriptors during work stealing and may be coupled with steal tracking to minimize the number of task re-executions. Thus far, the two groups of algorithms have targeted different task models: checkpointing algorithms address dynamic independent tasks, and supervision algorithms address nested fork-join programs. This paper transfers the most advanced supervision algorithm to the dynamic independent tasks model, thus enabling a comparison between checkpointing and supervision. Our comparison includes experiments and running time predictions. Results consistently show typical resilience overheads below 1% for both approaches. The overheads are lower for supervision in practically relevant cases, but checkpointing takes the lead at on the order of millions of processes.
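The checkpointing side of this comparison can be sketched as a toy Python model (all names here are hypothetical, and real systems checkpoint per worker and merge results during recovery): the descriptors of open tasks are saved explicitly at intervals, so after a simulated failure the run resumes from the last snapshot on the surviving resources instead of restarting.

```python
import pickle

class CheckpointedPool:
    """Toy model of checkpointing for dynamic independent tasks:
    open task descriptors and the partial result are snapshotted
    periodically; recovery reloads the last snapshot."""
    def __init__(self, tasks, every=2):
        self.tasks = list(tasks)   # open task descriptors
        self.result = 0            # reduced partial result
        self.every = every
        self.snapshot = None

    def checkpoint(self):
        self.snapshot = pickle.dumps((self.tasks, self.result))

    def run(self, fn, fail_after=None):
        done = 0
        self.checkpoint()
        while self.tasks:
            if fail_after is not None and done == fail_after:
                raise RuntimeError("simulated permanent node failure")
            task = self.tasks.pop()
            self.result += fn(task)
            done += 1
            if done % self.every == 0:
                self.checkpoint()
        return self.result

    def recover(self):
        # Local recovery: reload the last snapshot and continue on
        # the remaining resources; only tasks processed since the
        # snapshot are re-executed.
        self.tasks, self.result = pickle.loads(self.snapshot)
```

The supervision approach avoids these explicit saves by relying on the copies of task descriptors that work stealing leaves behind on the thief and victim, which is why its overhead is lower until coordination costs dominate at very large process counts.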
Introduction to PAISE 2021
Pub Date: 2021-06-01  DOI: 10.1109/ipdpsw52791.2021.00125
An Auto-tuning with Adaptation of A64 Scalable Vector Extension for SPIRAL
Naruya Kitai, D. Takahashi, F. Franchetti, T. Katagiri, S. Ohshima, Toru Nagai
Pub Date: 2021-06-01  DOI: 10.1109/IPDPSW52791.2021.00117
In this paper, we propose an auto-tuning (AT) system that adapts the A64 Scalable Vector Extension (SVE) for SPIRAL to generate discrete Fourier transform (DFT) implementations. The performance of our method is evaluated on the supercomputer "Flow" at Nagoya University. The SVE-enabled DFT codes are up to 1.98 times faster than scalar DFT codes and achieve an up to 3.63 times higher SIMD instruction rate. In addition, we obtain a maximum speedup factor of 2.32 by applying the proposed AT system to loop unrolling.
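Auto-tuning of a parameter such as the unroll factor follows a simple empirical loop, sketched below in Python (a hypothetical illustration of the general AT pattern, not the SPIRAL-based system itself, which generates and compiles code variants): build each candidate variant, time it, keep the fastest.

```python
import time

def make_unrolled_sum(unroll):
    # Build a summation "variant" whose inner loop consumes
    # `unroll` elements per iteration, with a remainder loop for
    # the leftover elements. A real AT system emits source code
    # with the body literally replicated instead.
    def kernel(data):
        acc, i = 0, 0
        limit = len(data) - len(data) % unroll
        while i < limit:
            for j in range(unroll):      # stands in for the unrolled body
                acc += data[i + j]
            i += unroll
        for k in range(limit, len(data)):  # remainder loop
            acc += data[k]
        return acc
    return kernel

def _time_once(kernel, data):
    start = time.perf_counter()
    kernel(data)
    return time.perf_counter() - start

def autotune(data, factors=(1, 2, 4, 8), trials=3):
    # Empirical auto-tuning: benchmark every candidate variant on
    # representative input and keep the fastest factor.
    best_factor, best_time = factors[0], float("inf")
    for f in factors:
        kernel = make_unrolled_sum(f)
        elapsed = min(_time_once(kernel, data) for _ in range(trials))
        if elapsed < best_time:
            best_factor, best_time = f, elapsed
    return best_factor
```

Taking the minimum over several trials reduces timing noise; production tuners additionally prune the search space rather than timing every variant exhaustively.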
IPDPS 2021 PhD Forum Welcome and Abstracts
Pub Date: 2021-06-01  DOI: 10.1109/ipdpsw52791.2021.00160
Introduction to PDCO 2021
Pub Date: 2021-06-01  DOI: 10.1109/ipdpsw52791.2021.00080
Efficient Memory Management in Likelihood-based Phylogenetic Placement
P. Barbera, A. Stamatakis
Pub Date: 2021-06-01  DOI: 10.1109/IPDPSW52791.2021.00041
Maximum likelihood based phylogenetic methods score phylogenetic tree topologies, each comprising a set of molecular sequences of the species under study, using statistical models of evolution. The scoring procedure relies on storing intermediate results at inner nodes of the tree during the tree traversal. This induces comparatively high memory requirements relative to less compute-intensive methods such as parsimony. The memory requirements are particularly large for maximum likelihood phylogenetic placement, as additional intermediate results must be stored at all branches of the tree to maximize runtime performance. This has hindered numerous users of our phylogenetic placement tool EPA-NG from performing placement on large phylogenetic trees. Here, we present an approach to reduce the memory footprint of EPA-NG. Furthermore, we have generalized our implementation and integrated it into our phylogenetic likelihood library, libpll-2, so that it can be used by other tools for phylogenetic inference. On an empirical dataset, we were able to reduce the memory requirements by up to 96%, at the cost of increasing execution times by a factor of 23. Hence, there exists a trade-off between decreasing memory requirements and increasing execution times, which we investigate. When the amount of memory available for placement is increased to a certain level, execution times grow by only approximately a factor of 4 for the most challenging dataset we have tested. This now allows maximum likelihood based placement to be conducted on substantially larger trees within reasonable times. Finally, we show that the active memory management approach introduces new challenges for parallelization, and we outline possible solutions.
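The memory-for-runtime trade-off at the heart of this approach can be sketched with a toy Python cache (a hypothetical illustration, not the libpll-2 implementation): only a fixed number of per-node intermediate results are kept, and anything evicted is recomputed on demand when a traversal needs it again.

```python
from collections import OrderedDict

class RecomputeCache:
    """Keep at most `slots` intermediate results; recompute evicted
    entries on demand. Toy model of active memory management for
    per-node likelihood vectors."""
    def __init__(self, slots, compute):
        self.slots = slots
        self.compute = compute        # node -> value; may recurse via get()
        self.cache = OrderedDict()    # access order doubles as LRU order
        self.recomputations = 0

    def get(self, node):
        if node in self.cache:
            self.cache.move_to_end(node)    # refresh LRU position
            return self.cache[node]
        self.recomputations += 1
        value = self.compute(node)          # recompute (may recurse)
        if len(self.cache) >= self.slots:
            self.cache.popitem(last=False)  # evict least recently used
        self.cache[node] = value
        return value
```

With generous slots every node is computed once; with few slots the results are identical but repeated queries trigger recomputation, mirroring the 96% memory saving versus up-to-23x slowdown reported above.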