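Improving runtime performance and energy consumption through balanced data locality with NUMA-BTLP and NUMA-BTDM static algorithms for thread classification and thread type-aware mapping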

Iulia Ştirb
Int. J. Comput. Sci. Eng., vol. 23, no. 1. Journal Article. Published 2020-05-08. DOI: 10.1504/ijcse.2020.10029352 (https://doi.org/10.1504/ijcse.2020.10029352)
Citations: 1

Abstract

Extending compilers such as LLVM with NUMA-aware optimisations can significantly improve runtime performance and reduce energy consumption on NUMA systems. This paper presents the NUMA-BTDM algorithm, a compile-time, thread-type-dependent mapping algorithm that maps threads uniformly according to the type assigned to each thread by the NUMA-BTLP algorithm, which classifies threads through static analysis of the code. First, the compiler inserts architecture-dependent code into the program; this code detects, at runtime, the characteristics of the underlying architecture on Intel processors. The mapping is then performed at runtime (through function calls from the PThreads library) according to these characteristics, following a compile-time mapping analysis that determines the CPU affinity of each thread. NUMA-BTDM allows the application to customise, control and optimise thread mapping, and it achieves balanced data locality on NUMA systems for parallel C code that combines PThreads-based task parallelism with OpenMP-based loop parallelism.
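The abstract says that, once the CPU affinity of each thread is known, the mapping is applied at runtime through PThreads library calls. The sketch below illustrates only that final step, assuming Linux with glibc, where `pthread_setaffinity_np` and `pthread_getaffinity_np` are available as GNU extensions. The helper names (`online_cpu_count`, `first_allowed_cpu`, `pin_thread_to_cpu`, `is_pinned_to`) are hypothetical and do not come from the paper, and the classification and mapping rules of NUMA-BTLP and NUMA-BTDM are not reproduced here.

```c
#define _GNU_SOURCE   /* glibc: exposes the *_np affinity calls and CPU_* macros */
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <unistd.h>

/* Number of logical CPUs currently online, queried at runtime. */
long online_cpu_count(void) {
    return sysconf(_SC_NPROCESSORS_ONLN);
}

/* Lowest-numbered CPU the calling thread is allowed to run on, or -1 on failure. */
int first_allowed_cpu(void) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(pthread_self(), sizeof(set), &set) != 0)
        return -1;
    for (int cpu = 0; cpu < CPU_SETSIZE; ++cpu) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}

/* Restrict `thread` to a single CPU. Returns 0 on success, an errno value otherwise. */
int pin_thread_to_cpu(pthread_t thread, int cpu) {
    if (cpu < 0 || cpu >= CPU_SETSIZE)
        return EINVAL;
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return pthread_setaffinity_np(thread, sizeof(set), &set);
}

/* Returns 1 if `thread` is bound to exactly `cpu` and no other CPU, 0 otherwise. */
int is_pinned_to(pthread_t thread, int cpu) {
    cpu_set_t set;
    CPU_ZERO(&set);
    if (pthread_getaffinity_np(thread, sizeof(set), &set) != 0)
        return 0;
    return CPU_COUNT(&set) == 1 && CPU_ISSET(cpu, &set);
}
```

In a scheme like the one the abstract describes, the CPU index passed to `pin_thread_to_cpu` would come from the compile-time mapping analysis combined with the architecture characteristics detected at runtime; here it is simply whatever the caller supplies. Build with `-pthread`.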