
Latest publications in Parallel Computing

ESA: An efficient sequence alignment algorithm for biological database search on Sunway TaihuLight
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103043
Hao Zhang, Zhiyi Huang, Yawen Chen, Jianguo Liang, Xiran Gao

In computational biology, biological database search plays a very important role. Since the COVID-19 outbreak, it has provided significant help in identifying common characteristics of viruses and in developing vaccines and drugs. Sequence alignment, a method for finding similarity, homology, and other relationships between gene/protein sequences, is the usual tool in database search. With the explosive growth of biological databases, the search process has become extremely time-consuming. However, existing parallel sequence alignment algorithms cannot deliver efficient database search due to low utilization of resources such as cache memory and performance issues such as load imbalance and high communication overhead. In this paper, we propose an efficient sequence alignment algorithm on Sunway TaihuLight, called ESA, for biological database search. ESA adopts a novel hybrid alignment algorithm combining local and global alignments, which achieves higher accuracy than other sequence alignment algorithms. Further, ESA includes several optimizations: cache-aware sequence alignment, capacity-aware load balancing, and bandwidth-aware data transfer. They are implemented on the heterogeneous SW26010 processor used in the world's 6th fastest supercomputer, Sunway TaihuLight. The implementation of ESA is evaluated with the Swiss-Prot database on Sunway TaihuLight and other platforms. Our experimental results show that ESA achieves a speedup of 34.5 on a single core group (with 65 cores) of Sunway TaihuLight. The strong and weak scalability of ESA is tested with 1 to 1024 core groups of Sunway TaihuLight. The results show that ESA has linear weak scalability and very impressive strong scalability; for strong scalability, ESA achieves a speedup of 338.04 with 1024 core groups compared with a single core group. We also show that our proposed optimizations are applicable to GPUs, Intel multicore processors, and heterogeneous computing platforms.
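The abstract does not spell out ESA's hybrid recurrence, so as a point of reference the sketch below shows only the classic Smith-Waterman local-alignment score that hybrid local/global aligners build on; the scoring parameters (match, mismatch, gap) and the function name are illustrative assumptions, not values or code from the paper.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Classic Smith-Waterman local-alignment score with a linear gap penalty.
// This is NOT the ESA algorithm itself, only the textbook local-alignment
// recurrence that hybrid local/global aligners are built on.
int smithWatermanScore(const std::string& a, const std::string& b,
                       int match = 2, int mismatch = -1, int gap = -2) {
    const std::size_t n = a.size(), m = b.size();
    std::vector<std::vector<int>> H(n + 1, std::vector<int>(m + 1, 0));
    int best = 0;
    for (std::size_t i = 1; i <= n; ++i) {
        for (std::size_t j = 1; j <= m; ++j) {
            const int diag = H[i - 1][j - 1] + (a[i - 1] == b[j - 1] ? match : mismatch);
            const int up   = H[i - 1][j] + gap;
            const int left = H[i][j - 1] + gap;
            H[i][j] = std::max({0, diag, up, left});  // the 0 restarts the local alignment
            best = std::max(best, H[i][j]);
        }
    }
    return best;  // score of the best local alignment between a and b
}
```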

Citations: 0
A flexible sparse matrix data format and parallel algorithms for the assembly of finite element matrices on shared memory systems
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103039
Adam Sky, César Polindara, Ingo Muench, Carolin Birk

Finite element methods require the composition of the global stiffness matrix from local finite element contributions. The composition process combines the computation of element stiffness matrices and their assembly into the global stiffness matrix, which is commonly sparse. In this paper we focus on the assembly process of the global stiffness matrix and explore different algorithms and their efficiency on shared memory systems using C++. A key aspect of our investigation is the use of atomic synchronization primitives to derive data-race-free algorithms and data structures. Furthermore, we propose a new flexible storage format for sparse matrices and compare its performance with the compressed row storage format using abstract benchmarks based on common characteristics of finite element problems.
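As a rough illustration of the atomic-synchronization idea (not the flexible storage format proposed in the paper), the sketch below scatters one dense element matrix into the shared value array of a sparse matrix whose sparsity pattern is already fixed; the helper names and the precomputed entry-to-slot map are assumptions for this example.

```cpp
#include <atomic>
#include <cstddef>
#include <vector>

// Data-race-free accumulation with std::atomic: a compare-exchange loop is
// used because fetch_add on std::atomic<double> is only guaranteed in C++20.
inline void atomicAdd(std::atomic<double>& target, double value) {
    double expected = target.load(std::memory_order_relaxed);
    while (!target.compare_exchange_weak(expected, expected + value,
                                         std::memory_order_relaxed)) {
        // expected is refreshed with the current value on failure; retry
    }
}

// Scatter one dense element matrix into the shared value array of a sparse
// matrix whose sparsity pattern (and hence the entry-to-slot map) is fixed.
// Several threads may call this for different elements concurrently.
void scatterElementMatrix(const std::vector<double>& ke,        // flattened element matrix
                          const std::vector<std::size_t>& map,  // entry k -> global nnz slot
                          std::vector<std::atomic<double>>& values) {
    for (std::size_t k = 0; k < ke.size(); ++k) {
        atomicAdd(values[map[k]], ke[k]);  // safe even when elements share slots
    }
}
```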

Citations: 0
New YARN sharing GPU based on graphics memory granularity scheduling
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103038
Jinliang Shi, Dewu Chen, Jiabi Liang, Lin Li, Yue Lin, Jianjiang Li

As one of the most widely used cluster scheduling frameworks, Hadoop YARN supported only CPU and memory scheduling in the past. Furthermore, due to the widespread use of AI, demand for GPUs is also increasing. Hadoop YARN V3.0 therefore adds GPU scheduling, but the granularity is still the whole card rather than finer-grained graphics-memory scheduling. However, in daily training, although the graphics memory required by a task may be much smaller than a whole GPU card, the task still occupies the whole card, which wastes resources. To address this issue, Tensorflow provides an API for graphics-memory control. Therefore, we propose to introduce this feature into Hadoop YARN so that it can support heterogeneous scheduling of CPU, memory, and graphics memory. We then take the Hadoop V2.7 source code as the underlying architecture and design a new scheduler, GSHARE. Compared with previous scheduling strategies, with 3 nodes, 3 GPU cards per node, and 12 GB of graphics memory per card, GSHARE improves efficiency by up to 74% for Tensorflow tasks using 2 GB of graphics memory. Meanwhile, it minimizes the graphics-memory waste caused by Tensorflow's inability to control graphics memory proportionally across multiple cards through its API.
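GSHARE itself lives inside Hadoop YARN (Java) and the abstract gives no code-level detail, so the fragment below is only a hypothetical C++ sketch of scheduling at graphics-memory granularity: a task is placed on the first card with enough free memory instead of being given a whole card. The names (Gpu, placeTask) and the first-fit policy are invented for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical first-fit placement at graphics-memory granularity: the
// scheduler tracks used memory per card and reserves only what a task asks
// for, instead of handing out whole cards.
struct Gpu {
    std::uint64_t totalMiB;
    std::uint64_t usedMiB = 0;
};

std::optional<std::size_t> placeTask(std::vector<Gpu>& gpus, std::uint64_t requestMiB) {
    for (std::size_t i = 0; i < gpus.size(); ++i) {
        if (gpus[i].totalMiB - gpus[i].usedMiB >= requestMiB) {
            gpus[i].usedMiB += requestMiB;  // reserve the requested graphics memory
            return i;                       // index of the card the task landed on
        }
    }
    return std::nullopt;  // no card has enough free graphics memory left
}
```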

Citations: 0
Using heterogeneous GPU nodes with a Cabana-based implementation of MPCD
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103033
Rene Halver, Christoph Junghans, Godehard Sutmann

The Kokkos-based library Cabana, which has been developed in the Co-design Center for Particle Applications (CoPA), is used for the implementation of Multi-Particle Collision Dynamics (MPCD), a particle-based description of hydrodynamic interactions. Cabana allows for a function-portable implementation, which has been used to study the interplay between CPU and GPU usage on a multi-node system, as well as to analyze that interplay with performance analysis tools. As a result, we see the most advantages in homogeneous GPU usage, but we also discuss the extent to which heterogeneous applications that use CPU and GPU concurrently might be more performant.
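To make the portability idea concrete, here is a minimal Kokkos sketch (not the paper's MPCD code) in which the same kernel body is dispatched either to the host execution space or to the default device execution space; the kernel, the view name, and the problem size are placeholders assumed for the example.

```cpp
#include <Kokkos_Core.hpp>

// The same kernel body dispatched to either the host (CPU) or the default
// device (e.g. GPU) simply by choosing the execution space. This is only the
// dispatch pattern a Cabana/Kokkos code relies on, not the MPCD implementation.
template <class ExecSpace>
void scaleParticles(int n, double factor) {
    Kokkos::View<double*, typename ExecSpace::memory_space> v("velocities", n);
    Kokkos::parallel_for(
        "scale", Kokkos::RangePolicy<ExecSpace>(0, n),
        KOKKOS_LAMBDA(const int i) { v(i) *= factor; });
    Kokkos::fence();
}

int main(int argc, char* argv[]) {
    Kokkos::initialize(argc, argv);
    {
        scaleParticles<Kokkos::DefaultHostExecutionSpace>(1 << 20, 0.5);  // CPU path
        scaleParticles<Kokkos::DefaultExecutionSpace>(1 << 20, 0.5);      // device path, if enabled
    }
    Kokkos::finalize();
    return 0;
}
```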

Citations: 1
Editorial on Advances in High Performance Programming
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103037
Ami Marowka, Przemysław Stpiczyński
{"title":"Editorial on Advances in High Performance Programming","authors":"Ami Marowka ,&nbsp;Przemysław Stpiczyński","doi":"10.1016/j.parco.2023.103037","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103037","url":null,"abstract":"","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"117 ","pages":"Article 103037"},"PeriodicalIF":1.4,"publicationDate":"2023-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"49877858","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Adaptively parallel runtime verification based on distributed network for temporal properties
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103034
Bin Yu, Xu Lu, Cong Tian, Meng Wang, Chu Chen, Ming Lei, Zhenhua Duan

Runtime verification is a lightweight verification technique that checks whether a monitored program execution satisfies a desired property. Online runtime verification faces challenges regarding efficiency and property expressiveness, which limit its widespread adoption, yet there is a lack of research that addresses both issues. Building on a distributed network, we propose an adaptively parallel approach to verify full regular temporal properties of C programs in an online manner. During program execution, segments of the generated state sequence are verified concurrently by distributed machines, and each segment is further verified within each multi-core machine using an adaptive number of threads. Experimental results demonstrate that, while supporting more expressive properties, our approach achieves a speedup of 2.5X–5.0X over other runtime verification approaches.
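The paper's monitors check full regular temporal properties; the hypothetical sketch below reduces the property to a simple per-segment callback just to show the segment-parallel structure: the observed state sequence is cut into chunks and each chunk is verified by its own asynchronous task. The State type, the placeholder property, and the function names are assumptions for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

using State = int;

// Placeholder per-segment check standing in for a real temporal-property
// monitor; here the "property" is simply that no observed state is negative.
bool checkSegment(const std::vector<State>& states,
                  std::size_t begin, std::size_t end) {
    for (std::size_t i = begin; i < end; ++i) {
        if (states[i] < 0) return false;
    }
    return true;
}

// Cut the observed state sequence into segments and verify each segment in
// its own asynchronous task, then combine the verdicts.
bool verifyInSegments(const std::vector<State>& states, std::size_t numSegments) {
    if (numSegments == 0) numSegments = 1;
    std::vector<std::future<bool>> results;
    const std::size_t chunk = (states.size() + numSegments - 1) / numSegments;
    for (std::size_t s = 0; s < numSegments; ++s) {
        const std::size_t begin = s * chunk;
        const std::size_t end = std::min(states.size(), begin + chunk);
        if (begin >= end) break;
        results.push_back(std::async(std::launch::async, checkSegment,
                                     std::cref(states), begin, end));
    }
    bool ok = true;
    for (auto& r : results) ok = r.get() && ok;  // all segments must satisfy the property
    return ok;
}
```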

Citations: 0
Big data BPMN workflow resource optimization in the cloud
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-09-01 | DOI: 10.1016/j.parco.2023.103025
Srđan Daniel Simić, Nikola Tanković, Darko Etinger

Cloud computing is one of the critical technologies that meet the demand of various businesses for the high-capacity computational processing power needed to gain knowledge from their ever-growing business data. When utilizing cloud computing resources for Big Data processing, companies face the challenge of determining the optimal use of resources within their business processes. Miscalculating the necessary resources directly affects their budget and can delay the cycle time of their key processes. This study investigates the simulation of cloud resource optimization for Big Data workflows modeled with the Business Process Modeling Notation (BPMN). To this end, a BPMN performance evaluation framework was developed. The framework's capabilities were demonstrated on a real-world data science workflow and later evaluated on workflows consisting of 13, 52, and 104 tasks. The results show that the developed framework is adequate for estimating the overall run-time distribution and optimizing cloud resource deployment, and that BPMN can be utilized for Big Data processing workflows. Therefore, this study contributes a tool that lets BPMN practitioners apply BPMN to their Big Data workflows and gives decision-makers critical insights into their key business processes. The framework source code is available at https://github.com/ntankovic/python-bpmn-engine.
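As a loose illustration of the kind of simulation such a framework performs (not the framework's actual API, which is Python-based), the sketch below estimates the mean cycle time of a toy workflow with two parallel branches and a join by Monte Carlo sampling of task durations; the distributions, means, and function name are made up for the example.

```cpp
#include <algorithm>
#include <cstddef>
#include <random>

// Estimate the mean cycle time of a toy workflow by Monte Carlo sampling:
// two parallel branches (taskA, taskB) must both finish before a join task
// runs. A real framework would walk a BPMN model instead of this hard-coded
// structure.
double estimateMeanCycleTime(std::size_t samples = 10000) {
    std::mt19937 rng(42);
    std::exponential_distribution<double> taskA(1.0 / 5.0);  // mean 5 time units
    std::exponential_distribution<double> taskB(1.0 / 8.0);  // mean 8 time units
    std::exponential_distribution<double> join(1.0 / 2.0);   // mean 2 time units
    double total = 0.0;
    for (std::size_t i = 0; i < samples; ++i) {
        // parallel gateway: cycle time is governed by the slower branch
        total += std::max(taskA(rng), taskB(rng)) + join(rng);
    }
    return total / static_cast<double>(samples);
}
```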

Citations: 0
Finding inputs that trigger floating-point exceptions in heterogeneous computing via Bayesian optimization
IF 1.4 | CAS Zone 4, Computer Science | Q2 COMPUTER SCIENCE, THEORY & METHODS | Pub Date: 2023-08-01 | DOI: 10.1016/j.parco.2023.103042
I. Laguna, Anh Tran, G. Gopalakrishnan
{"title":"Finding inputs that trigger floating-point exceptions in heterogeneous computing via Bayesian optimization","authors":"I. Laguna, Anh Tran, G. Gopalakrishnan","doi":"10.1016/j.parco.2023.103042","DOIUrl":"https://doi.org/10.1016/j.parco.2023.103042","url":null,"abstract":"","PeriodicalId":54642,"journal":{"name":"Parallel Computing","volume":"62 1","pages":"103042"},"PeriodicalIF":1.4,"publicationDate":"2023-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"55107870","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0