An On-the-Fly Method to Exchange Vector Clocks in Distributed-Memory Programs

Simon Schwitanski, Felix Tomski, Joachim Protze, C. Terboven, Matthias S. Müller
{"title":"An On-the-Fly Method to Exchange Vector Clocks in Distributed-Memory Programs","authors":"Simon Schwitanski, Felix Tomski, Joachim Protze, C. Terboven, Matthias S. Müller","doi":"10.1109/IPDPSW55747.2022.00093","DOIUrl":null,"url":null,"abstract":"Vector clocks are logical timestamps used in correctness tools to analyze the happened-before relation between events in parallel program executions. In particular, race detectors use them to find concurrent conflicting memory accesses, and replay tools use them to reproduce or find alternative execution paths. To record the happened-before relation with vector clocks, tool developers have to consider the different synchronization concepts of a programming model, e.g., barriers, locks, or message exchanges. Especially in distributed-memory programs, various concepts result in explicit and implicit synchronization between processes. Previously implemented vector clock exchanges are often specific to a single programming model, and a translation to other programming models is not trivial. Consequently, analyses relying on the vector clock exchange remain model-specific. This paper proposes an abstraction layer for on-the-fly vector clock exchanges for distributed-memory programs. Based on the programming models MPI, OpenSHMEM, and GASPI, we define common synchronization primitives and explain how model-specific procedures map to our model-agnostic abstraction layer. The exchange model is general enough also to support synchronization concepts of other parallel programming models. We present our implementation of the vector clock abstraction layer based on the Generic Tool Infrastructure with translators for MPI and OpenSHMEM. In an overhead study using the SPEC MPI 2007 benchmarks, the slowdown of the implemented vector clock exchange ranges from 1.1x to 12.6x for runs with up to 768 processes.","PeriodicalId":286968,"journal":{"name":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","volume":"58 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2022-05-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"2022 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW)","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/IPDPSW55747.2022.00093","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 1

Abstract

Vector clocks are logical timestamps used in correctness tools to analyze the happened-before relation between events in parallel program executions. In particular, race detectors use them to find concurrent conflicting memory accesses, and replay tools use them to reproduce or find alternative execution paths. To record the happened-before relation with vector clocks, tool developers have to consider the different synchronization concepts of a programming model, e.g., barriers, locks, or message exchanges. Especially in distributed-memory programs, various concepts result in explicit and implicit synchronization between processes. Previously implemented vector clock exchanges are often specific to a single programming model, and a translation to other programming models is not trivial. Consequently, analyses relying on the vector clock exchange remain model-specific. This paper proposes an abstraction layer for on-the-fly vector clock exchanges for distributed-memory programs. Based on the programming models MPI, OpenSHMEM, and GASPI, we define common synchronization primitives and explain how model-specific procedures map to our model-agnostic abstraction layer. The exchange model is general enough also to support synchronization concepts of other parallel programming models. We present our implementation of the vector clock abstraction layer based on the Generic Tool Infrastructure with translators for MPI and OpenSHMEM. In an overhead study using the SPEC MPI 2007 benchmarks, the slowdown of the implemented vector clock exchange ranges from 1.1x to 12.6x for runs with up to 768 processes.
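For context, the abstract relies on the standard vector clock update rules for tracking the happened-before relation: each process increments its own clock component on a local or send event, and on a receive it merges the incoming clock component-wise before incrementing. The following is a minimal illustrative C++ sketch of those generic rules only; it is not the paper's abstraction layer or the Generic Tool Infrastructure API, and the type VectorClock and its methods tick(), merge(), and happenedBefore() are hypothetical names.

    // Illustrative vector clock sketch (hypothetical names, not the paper's API).
    #include <algorithm>
    #include <cstddef>
    #include <vector>

    struct VectorClock {
        std::vector<unsigned long> c;   // one component per process
        std::size_t self;               // rank of the owning process

        VectorClock(std::size_t nprocs, std::size_t rank)
            : c(nprocs, 0), self(rank) {}

        // Local event or message send: advance this process's own component.
        void tick() { ++c[self]; }

        // Message receive: take the component-wise maximum of both clocks,
        // then advance the receiver's own component.
        void merge(const VectorClock& other) {
            for (std::size_t i = 0; i < c.size(); ++i)
                c[i] = std::max(c[i], other.c[i]);
            tick();
        }

        // This clock's event happened before b's event iff every component is
        // less than or equal to b's and at least one is strictly smaller.
        bool happenedBefore(const VectorClock& b) const {
            bool strictlyLess = false;
            for (std::size_t i = 0; i < c.size(); ++i) {
                if (c[i] > b.c[i]) return false;
                if (c[i] < b.c[i]) strictlyLess = true;
            }
            return strictlyLess;
        }
    };

Under these rules, a send ticks the sender's clock and ships it with the message, the matching receive merges it, and two events are concurrent exactly when neither happenedBefore the other, which is what race detectors test for.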