
2020 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW): Latest Papers

Implementing Central Force Optimization on the Intel Xeon Phi
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00091
Thomas Charest, R. Green
Central Force Optimization (CFO) is a fully deterministic, population-based metaheuristic built on an analogy with classical kinematics. CFO yields more accurate and consistent results than other population-based metaheuristics such as Particle Swarm Optimization and Genetic Algorithms, but at the cost of higher computational complexity and therefore longer run times. This study presents a parallel implementation of CFO, written in C++ with OpenMP, for both a multi-core CPU and the Intel Xeon Phi coprocessor. Results show that parallelizing CFO yields speedups of 5-35 on the multi-core CPU and 1-12 on the Intel Xeon Phi.
Citations: 0
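As a rough illustration of the kind of update the CFO entry above refers to, the sketch below performs one deterministic CFO-style step in Python. It assumes one commonly cited formulation in which each probe is attracted only by probes with higher fitness ("mass"); the function name `cfo_step`, the parameters `G`, `alpha`, and `beta`, and their default values are illustrative assumptions and are not taken from the paper, which uses C++ and OpenMP.

```python
import math


def cfo_step(positions, fitness, G=1.0, alpha=1.0, beta=2.0):
    """One deterministic CFO-style update (illustrative sketch only).

    positions: list of probes, each a list of coordinates.
    fitness:   function to maximize; its value plays the role of mass.
    G, alpha, beta: assumed tuning constants, not values from the paper.
    """
    masses = [fitness(r) for r in positions]
    new_positions = []
    # Each probe's acceleration depends only on the old positions, so the
    # iterations of this outer loop are independent of one another.
    for p, rp in enumerate(positions):
        accel = [0.0] * len(rp)
        for k, rk in enumerate(positions):
            diff = masses[k] - masses[p]
            if k == p or diff <= 0:
                continue  # only better probes attract
            dist = math.dist(rk, rp)
            if dist == 0.0:
                continue  # coincident probes exert no pull
            scale = G * diff ** alpha / dist ** beta
            for d in range(len(rp)):
                accel[d] += scale * (rk[d] - rp[d])
        # Position update with zero initial velocity and unit time step.
        new_positions.append([x + 0.5 * a for x, a in zip(rp, accel)])
    return new_positions
```

Because no random numbers are involved, repeated calls with the same inputs return the same result. The independence of the outer loop's iterations is what makes a step like this a natural candidate for a parallel loop; whether the authors parallelized at exactly this level is not stated in the abstract.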
Weightless Neural Networks Applied to Nonintrusive Load Monitoring
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00143
Guilherme C. De Lello, Juliano Caldeira, M. Aredes, F. França, P. Lima
It is well known that energy efficiency plays a key role in ensuring sustainable development. Concerns regarding energy include greenhouse gas emissions, which contribute to global warming, and the possibility of supply interruptions and delivery constraints in some countries and regions of the world. Several studies have suggested that feedback on the consumption of specific electrical appliances could be one of the cheapest and most eco-friendly ways to encourage utility customers to conserve energy. Moreover, there is evidence that the best rates of savings are achieved when the appliance load information is delivered directly to the customers’ smartphones or to dedicated displays inside their homes. Nonintrusive Load Monitoring (NILM) is a technique that estimates the energy consumption of individual appliance loads without requiring the installation of sensors in each appliance. In order to provide feedback directly to the end user, NILM applications could be embedded in IoT smart devices. However, the amount of computational resources required by NILM algorithms proposed in previous research often discourages embedded applications. On the other hand, the Weightless Neural Network model WiSARD is capable of solving pattern recognition tasks by using a memory-based architecture and some of the simplest computational operations: addition and comparison. These properties suggest that this particular machine learning model is suited to solving the NILM problem efficiently. This paper describes and evaluates a new approach to NILM in which the electric loads are disaggregated by using the Weightless Neural Network model WiSARD. Experimental results using the Brazilian Appliance Dataset (BRAD) indicate that it is feasible to embed WiSARD-based NILM algorithms in low-cost IoT smart energy meters.
Citations: 1
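To make the "memory-based architecture" and "addition and comparison" in the WiSARD entry above concrete, here is a minimal Python sketch of a generic WiSARD-style classifier over binary inputs. It is a textbook-style illustration, not the authors' NILM system: the class names, the use of Python sets as RAM nodes, and the fixed seed for the input mapping are all assumptions made for this example.

```python
import random


class Discriminator:
    """One class's set of RAM nodes, each addressed by a tuple of input bits."""

    def __init__(self, mapping, tuple_size):
        self.mapping = mapping          # shared pseudo-random input permutation
        self.tuple_size = tuple_size
        self.rams = [set() for _ in range(len(mapping) // tuple_size)]

    def _addresses(self, bits):
        for i in range(len(self.rams)):
            idx = self.mapping[i * self.tuple_size:(i + 1) * self.tuple_size]
            yield i, tuple(bits[j] for j in idx)

    def train(self, bits):
        # Training only writes to memory: mark each addressed location.
        for i, addr in self._addresses(bits):
            self.rams[i].add(addr)

    def score(self, bits):
        # Addition: count the RAM nodes that recognize their address.
        return sum(1 for i, addr in self._addresses(bits) if addr in self.rams[i])


class WiSARD:
    def __init__(self, input_bits, tuple_size, seed=0):
        if input_bits % tuple_size != 0:
            raise ValueError("input_bits must be a multiple of tuple_size")
        self.mapping = list(range(input_bits))
        random.Random(seed).shuffle(self.mapping)
        self.tuple_size = tuple_size
        self.discriminators = {}

    def train(self, bits, label):
        if label not in self.discriminators:
            self.discriminators[label] = Discriminator(self.mapping, self.tuple_size)
        self.discriminators[label].train(bits)

    def classify(self, bits):
        # Comparison: pick the label whose discriminator scores highest.
        return max(self.discriminators,
                   key=lambda label: self.discriminators[label].score(bits))
```

A short usage example: after `model = WiSARD(8, 2)`, `model.train([1, 1, 1, 1, 0, 0, 0, 0], "on")`, and `model.train([0, 0, 0, 0, 1, 1, 1, 1], "off")`, a call such as `model.classify([1, 1, 1, 0, 0, 0, 0, 0])` picks the label whose stored patterns share the most addressed tuples with the input. Real NILM inputs would first need to be binarized, a step this sketch leaves out.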
Load Balancing Run-Times and Space Usage for Computing the Power Set
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00090
R. Goodwin
This paper discusses how to balance the number of sets assigned to multiple processors when computing the power set. The algorithm complexity measures (e.g. run-time and space usage) are merely functions of the load balance. This paper presents two approaches to computing the power set in a parallel environment. The two approaches use different round-robin load balancing algorithms. Because their load balance assignments differ, the two approaches also use different production algorithms for computing the power set. This paper covers both the run-time analyses and the space usage analyses of the parallel algorithms. We present mathematical formulas for the space usage analyses. We present a third, non-parallel algorithm for benchmark purposes. The purpose of this algorithm is to give the total amount of space needed to save the power set. The non-parallel algorithm also gives some idea of the maximum run-time an algorithm should take to find any of the subsets of the power set on any given processor in a parallel environment.
Citations: 1
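One simple way to picture round-robin load balancing for the power set, as discussed in the entry above, is to number the subsets with bitmasks and deal the masks out to processors in turn. The Python sketch below does exactly that; it is an assumed, generic scheme for illustration and is not either of the two algorithms in the paper. The names `subsets_for_processor` and `load` are hypothetical.

```python
def subsets_for_processor(items, rank, num_procs):
    """Yield the subsets of `items` dealt to processor `rank` (0-based).

    Subsets are numbered by bitmask 0 .. 2**n - 1, and processor `rank`
    takes every mask congruent to `rank` modulo `num_procs`.
    """
    n = len(items)
    for mask in range(rank, 1 << n, num_procs):
        yield [items[i] for i in range(n) if (mask >> i) & 1]


def load(n, rank, num_procs):
    """Number of subsets processor `rank` produces for an n-element set."""
    return len(range(rank, 1 << n, num_procs))
```

For a 3-element set on 3 processors, the 8 subsets split into loads of 3, 3, and 2, so no processor holds more than one subset above any other. Note that this balances only the number of subsets; the subsets themselves differ in size, which affects space usage per processor.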
Dynamic Provisioning of Storage Resources: A Case Study with Burst Buffers
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00173
François Tessier, Maxime Martinasso, M. Chesi, Mark Klein, M. Gila
The needs of complex applications and workflows are often expressed exclusively in terms of computational resources on HPC systems. In many cases, other resources such as storage or network are not allocatable and are shared across the entire HPC system. Looking at storage resources in particular, any workflow or application should be able to select both its preferred data manager and its required storage capability or capacity. Achieving this goal requires new mechanisms. In this work, we present a tool that dynamically provisions a data management system on top of storage devices. We propose a proof of concept that is able to deploy, on demand, a parallel file system across intermediate storage nodes on a Cray XC50 system. We show how this mechanism can be easily extended to support more data managers and any type of intermediate storage. Finally, we evaluate the performance of the provisioned storage system with a set of benchmarks.
Citations: 4
HCW 2020 Keynote Speaker: Edge Intelligence Empowering IoT Data Analytics
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00011
Albert Y. Zomaya
Alongside the many developments in computing and communication technologies and the surge of mobile devices, a brand new paradigm, Edge Computing (EC), has gained considerable momentum in recent times. In parallel, Artificial Intelligence (AI) applications have been thriving, fuelled by breakthroughs in deep learning and the emergence of new hardware architectures. Billions of bytes of data, generated at a network edge composed of a multitude of heterogeneous devices, place great demands on data processing and structural optimization. This has created a genuine demand for the integration of EC and AI, leading to what is known today as Edge Intelligence (EI).
Citations: 0
Random Forests in Chapel
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00118
Ben Albrecht
This talk will present the ongoing work of developing a Chapel implementation of Random Forest, a popular ensemble learning method used for both predictive modeling and feature selection. Language features in Chapel make it possible to easily express shared-memory and distributed-memory implementations of this algorithm. Furthermore, Chapel’s built-in Python interoperability made it easier to implement a Python front end, making the implementation accessible from a language popular among data scientists.
Citations: 0
New Approaches for Performance Optimization and Analysis of Large-Scale Dynamic Social Network Analysis using Anytime Anywhere Algorithms
Pub Date : 2020-05-01 DOI: 10.1109/ipdpsw50202.2020.00186
Eunice E. Santos, Vairavan Murugappan, John Korah
During the last decade, the availability of large amounts of social network information from various social and socio-technical networks has increased dramatically. These data sources are inherently dynamic, with constantly evolving relationships and connections between entities. Research in this area must address the challenge of analyzing these dynamic datasets under potentially strict time constraints. In addition, due to the sheer size of these networks, they tend to be stored and analyzed on distributed platforms. In our previous work, we designed anytime, anywhere methodologies for building scalable parallel/distributed algorithms that incorporate different forms of network changes. In this work, we will investigate various schemas for balancing the incorporation of dynamic network changes so as to substantially reduce idleness and load imbalance among processors. We will show theoretically that in most cases our buffer-based methodology performs better than the more common approach of handling changes as they arrive.
Citations: 1
Parallelizing Maximal Clique Enumeration on Modern Manycore Processors
Pub Date : 2020-05-01 DOI: 10.1109/IPDPSW50202.2020.00047
J. Blanuša, R. Stoica, P. Ienne, K. Atasu
Many fundamental graph mining problems, such as maximal clique enumeration and subgraph isomorphism, can be solved using combinatorial algorithms that are naturally expressed in a recursive form. However, recursive graph mining algorithms suffer from high algorithmic complexity and long execution times. Moreover, because the recursive nature of these algorithms causes unpredictable execution and memory access patterns, parallelizing them on modern computer architectures poses challenges. In this work, we describe an efficient manycore CPU implementation of maximal clique enumeration (MCE), a basic building block of several social and biological network mining algorithms. First, we improve the single-thread performance of MCE by accelerating its computation-intensive kernels through cache-conscious data structures and vector instructions. Then, we develop a multi-core solution and eliminate its scalability bottlenecks by minimizing the scheduling and memory-management overheads. On highly parallel modern CPUs, we demonstrate a performance improvement of up to 19-fold compared to a state-of-the-art multi-core implementation of MCE.
Citations: 3
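For readers unfamiliar with the recursive form that the maximal clique enumeration entry above mentions, the sketch below shows the classic Bron-Kerbosch algorithm with pivoting in Python. This is the standard sequential textbook algorithm, shown here as an assumption about the general family of methods involved; the abstract does not say which variant the authors use, and none of their cache-conscious or vectorized optimizations appear here. The function name `maximal_cliques` is hypothetical.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques of an undirected graph.

    adj: dict mapping each vertex to the set of its neighbours.
    Returns a sorted list of cliques, each a sorted list of vertices.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates to extend it, x: already explored.
        if not p and not x:
            cliques.append(sorted(r))
            return
        # Pivot on the vertex with the most neighbours among the candidates,
        # so that fewer branches need to be explored.
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(cliques)
```

For the graph with edges 1-2, 1-3, 2-3, and 3-4, the function returns `[[1, 2, 3], [3, 4]]`. Each recursive call works on its own candidate sets, which is why the branches can in principle be explored in parallel, and also why the resulting work and memory access patterns are hard to predict.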
Message from the 2020 General Co-Chairs
Pub Date : 2020-05-01 DOI: 10.1109/ipdpsw50202.2020.00005
Citations: 0
Towards Stability in the Chapel Language
Pub Date : 2020-05-01 DOI: 10.1109/ipdpsw50202.2020.00116
Michael P. Ferguson
Language stability is an important upcoming feature of the Chapel programming language. Chapel users have both requested big changes to the language and also requested that the language become stable. This talk will discuss recent efforts to complete the big changes to the Chapel language so that the language can stabilize.
Citations: 0