
Latest Publications in Parallel Processing and Applied Mathematics

Parking Search in Urban Street Networks: Taming Down the Complexity of the Search-Time Problem via a Coarse-Graining Approach
Pub Date : 2023-04-11 DOI: 10.1007/978-3-031-30445-3_39
Léo Bulckaen, Nilankur Dutta, Alexandre Nicolas
{"title":"Parking Search in Urban Street Networks: Taming Down the Complexity of the Search-Time Problem via a Coarse-Graining Approach","authors":"L'eo Bulckaen, Nilankur Dutta, Alexandre Nicolas","doi":"10.1007/978-3-031-30445-3_39","DOIUrl":"https://doi.org/10.1007/978-3-031-30445-3_39","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"167 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2023-04-11","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"125526126","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Kokkos-Based Implementation of MPCD on Heterogeneous Nodes
Pub Date : 2022-12-22 DOI: 10.1007/978-3-031-30445-3_1
R. Halver, Christoph Junghans, G. Sutmann
{"title":"Kokkos-Based Implementation of MPCD on Heterogeneous Nodes","authors":"R. Halver, Christoph Junghans, G. Sutmann","doi":"10.1007/978-3-031-30445-3_1","DOIUrl":"https://doi.org/10.1007/978-3-031-30445-3_1","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"03 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"129958367","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 1
Acceptance Rates of Invertible Neural Networks on Electron Spectra from Near-Critical Laser-Plasmas: A Comparison
Pub Date : 2022-12-12 DOI: 10.1007/978-3-031-30445-3_23
T. Miethlinger, N. Hoffmann, T. Kluge
{"title":"Acceptance Rates of Invertible Neural Networks on Electron Spectra from Near-Critical Laser-Plasmas: A Comparison","authors":"T. Miethlinger, N. Hoffmann, T. Kluge","doi":"10.1007/978-3-031-30445-3_23","DOIUrl":"https://doi.org/10.1007/978-3-031-30445-3_23","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-12-12","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128259174","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Distributed Work Stealing in a Task-Based Dataflow Runtime
Pub Date : 2022-11-02 DOI: 10.48550/arXiv.2211.00838
Joseph John, Joshua Milthorpe, P. Strazdins
The task-based dataflow programming model has emerged as an alternative to the process-centric programming model for extreme-scale applications. However, load balancing is still a challenge in task-based dataflow runtimes. In this paper, we present extensions to the PaRSEC runtime to demonstrate that distributed work stealing is an effective load-balancing method for task-based dataflow runtimes. In contrast to shared-memory work stealing, we find that each process should consider future tasks and the expected waiting time for execution when determining whether to steal. We demonstrate the effectiveness of the proposed work-stealing policies on a sparse Cholesky factorization, which achieves a speedup of up to 35% compared to a static division of work.
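A minimal sketch of the stealing criterion described in the abstract is given below. It only illustrates the idea of weighing projected local work, including tasks expected to become ready soon, against the expected waiting time for a stolen task; it is not the PaRSEC implementation, and every name and threshold in it is hypothetical.

```python
# Illustrative sketch only, not the PaRSEC implementation: a process steals remote
# tasks only when its own queue (counting tasks whose dependencies resolve soon)
# cannot keep it busy, and when the expected wait for a stolen task is short enough
# for the steal to pay off.

from dataclasses import dataclass

@dataclass
class ProcessState:
    ready_tasks: int            # tasks runnable right now on this process
    expected_future_tasks: int  # tasks expected to become runnable soon (estimate)
    avg_task_time: float        # mean execution time of a local task, in seconds
    steal_latency: float        # expected time to request, transfer, and start a stolen task

def should_steal(state: ProcessState, idle_threshold: float = 1.0) -> bool:
    """Return True if this process should issue a steal request.

    Unlike shared-memory work stealing (steal whenever the local deque is empty),
    this rule also counts tasks that are about to arrive and compares the projected
    idle time against the expected waiting time for a stolen task.
    """
    projected_work = (state.ready_tasks + state.expected_future_tasks) * state.avg_task_time
    projected_idle = max(0.0, idle_threshold - projected_work)
    return projected_idle > state.steal_latency

# A nearly idle process with expensive steals still declines to steal:
print(should_steal(ProcessState(ready_tasks=0, expected_future_tasks=2,
                                avg_task_time=0.4, steal_latency=0.5)))  # False
```

The two extra terms, expected future tasks and steal latency, are what distinguish this from the shared-memory case, where a worker typically steals as soon as its local deque runs dry.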
Citations: 0
High Performance Dataframes from Parallel Processing Patterns
Pub Date : 2022-09-13 DOI: 10.1007/978-3-031-30442-2_22
Niranda Perera, Supun Kamburugamuve, Chathura Widanage, V. Abeykoon, A. Uyar, Kaiying Shan, Hasara Maithree, Damitha Sandeepa Lenadora, Thejaka Amila Kanewala, Geoffrey Fox
{"title":"High Performance Dataframes from Parallel Processing Patterns","authors":"Niranda Perera, Supun Kamburugamuve, Chathura Widanage, V. Abeykoon, A. Uyar, Kaiying Shan, Hasara Maithree, Damitha Sandeepa Lenadora, Thejaka Amila Kanewala, Geoffrey Fox","doi":"10.1007/978-3-031-30442-2_22","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_22","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"15 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-09-13","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"126785402","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2
Breaking Down the Parallel Performance of GROMACS, a High-Performance Molecular Dynamics Software
Pub Date : 2022-08-29 DOI: 10.1007/978-3-031-30442-2_25
Måns I. Andersson, N. A. Murugan, Artur Podobas, S. Markidis
{"title":"Breaking Down the Parallel Performance of GROMACS, a High-Performance Molecular Dynamics Software","authors":"Måns I. Andersson, N. A. Murugan, Artur Podobas, S. Markidis","doi":"10.1007/978-3-031-30442-2_25","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_25","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"61 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-29","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115617024","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 3
Distributed Objective Function Evaluation for Optimization of Radiation Therapy Treatment Plans
Pub Date : 2022-08-24 DOI: 10.1007/978-3-031-30442-2_29
Felix Liu, Måns I. Andersson, A. Fredriksson, S. Markidis
{"title":"Distributed Objective Function Evaluation for Optimization of Radiation Therapy Treatment Plans","authors":"Felix Liu, Måns I. Andersson, A. Fredriksson, S. Markidis","doi":"10.1007/978-3-031-30442-2_29","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_29","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"18 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-08-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"115453300","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Neural Nets with a Newton Conjugate Gradient Method on Multiple GPUs
Pub Date : 2022-08-03 DOI: 10.48550/arXiv.2208.02017
Severin Reiz, T. Neckel, H. Bungartz
Training deep neural networks consumes an increasing share of the computational resources in many compute centers. Often, a brute-force approach is used to obtain hyperparameter values. Our goals are (1) to improve on this by enabling second-order optimization methods with fewer hyperparameters for large-scale neural networks, and (2) to survey the performance of optimizers on specific tasks in order to suggest to users the best one for their problem. We introduce a novel second-order optimization method that requires only the action of the Hessian on a vector and avoids the huge cost of explicitly setting up the Hessian for large-scale networks. We compare the proposed second-order method with two state-of-the-art optimizers on five representative neural network problems, including regression and very deep networks from computer vision or variational autoencoders. For the largest setup, we efficiently parallelized the optimizers with Horovod and applied them to an 8-GPU NVIDIA P100 (DGX-1) machine.
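The action of the Hessian on a vector is a Hessian-vector product, which is all a Newton conjugate gradient inner solver needs. The sketch below is not the paper's code; it only shows, on a toy quadratic and with invented function names, how such a product can be approximated from two gradient evaluations without ever forming the Hessian.

```python
# Minimal sketch of a matrix-free Hessian-vector product H(w) @ v, approximated by a
# central finite difference of the gradient. No n-by-n Hessian is ever stored.

import numpy as np

def hessian_vector_product(grad_fn, w, v, eps=1e-6):
    """Approximate H(w) @ v, where grad_fn returns the gradient of the loss at w."""
    return (grad_fn(w + eps * v) - grad_fn(w - eps * v)) / (2.0 * eps)

# Toy quadratic loss 0.5 * w^T A w, whose Hessian is A, as a sanity check.
A = np.array([[3.0, 1.0], [1.0, 2.0]])
grad = lambda w: A @ w
w0 = np.array([1.0, -1.0])
v = np.array([0.5, 2.0])
print(hessian_vector_product(grad, w0, v))  # close to A @ v = [3.5, 4.5]
```

Because the conjugate gradient solver only ever asks for such products, the memory footprint stays linear in the number of parameters, which is what makes the approach feasible for large-scale networks.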
Citations: 1
MD-Bench: A generic proxy-app toolbox for state-of-the-art molecular dynamics algorithms
Pub Date : 2022-07-26 DOI: 10.48550/arXiv.2207.13094
R. Machado, Jan Eitzinger, H. Köstler, G. Wellein
Proxy-apps, or mini-apps, are simple self-contained benchmark codes with performance-relevant kernels extracted from real applications. Initially used to facilitate software-hardware co-design, they are a crucial ingredient for serious performance engineering, especially when dealing with large-scale production codes. MD-Bench is a new proxy-app in the area of classical short-range molecular dynamics. In contrast to existing proxy-apps in MD (e.g. miniMD and coMD), it does not resemble a single application code, but implements state-of-the-art algorithms from multiple applications (currently LAMMPS and GROMACS). The MD-Bench source code is understandable, extensible, and suited for teaching, benchmarking, and researching MD algorithms. The primary design goals are transparency and simplicity: a developer is able to tinker with the source code down to the assembly level. This paper introduces MD-Bench, explains its design and structure, covers the implemented optimization variants, and illustrates its usage with three examples.
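As a rough illustration of what a performance-relevant kernel of classical short-range molecular dynamics looks like, the sketch below implements a cutoff Lennard-Jones pair-force loop over a neighbor list. It is purely illustrative, is not taken from the MD-Bench source, and uses invented function and parameter names.

```python
# Illustrative sketch, not MD-Bench code: a Lennard-Jones pair-force loop with a
# cutoff, evaluated over a precomputed neighbor list. Loops like this dominate the
# runtime of classical short-range MD and are what a proxy-app isolates.

import numpy as np

def lj_forces(pos, neighbors, epsilon=1.0, sigma=1.0, cutoff=2.5):
    """Accumulate Lennard-Jones forces on each particle from its neighbor list.

    pos:       (N, 3) array of particle positions.
    neighbors: list of index arrays; neighbors[i] holds candidate partners of i.
    """
    forces = np.zeros_like(pos)
    cutoff_sq = cutoff * cutoff
    for i, nbrs in enumerate(neighbors):
        d = pos[i] - pos[nbrs]                # displacement vectors to neighbors
        r2 = np.einsum("ij,ij->i", d, d)      # squared distances
        mask = r2 < cutoff_sq
        d, r2 = d[mask], r2[mask]
        sr6 = ((sigma * sigma) / r2) ** 3
        # force magnitude expressed through r^2 to avoid square roots in the loop
        fmag = 24.0 * epsilon * (2.0 * sr6 * sr6 - sr6) / r2
        forces[i] += np.sum(fmag[:, None] * d, axis=0)
    return forces

# Two particles separated by sigma along x: the interaction is repulsive, so the
# force on particle 0 points in -x (away from particle 1).
pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
print(lj_forces(pos, [np.array([1]), np.array([0])]))
```

Isolating exactly this kind of loop is what lets a proxy-app study data layouts and optimization variants without carrying a full production code.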
Citations: 1
Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications
Pub Date : 2022-05-27 DOI: 10.1007/978-3-031-30442-2_12
Ayesha Afzal, G. Hager, G. Wellein, S. Markidis
{"title":"Exploring Techniques for the Analysis of Spontaneous Asynchronicity in MPI-Parallel Applications","authors":"Ayesha Afzal, G. Hager, G. Wellein, S. Markidis","doi":"10.1007/978-3-031-30442-2_12","DOIUrl":"https://doi.org/10.1007/978-3-031-30442-2_12","url":null,"abstract":"","PeriodicalId":431607,"journal":{"name":"Parallel Processing and Applied Mathematics","volume":"38 1","pages":"0"},"PeriodicalIF":0.0,"publicationDate":"2022-05-27","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"128508072","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 2