Assessing Memory Access Performance of Chapel through Synthetic Benchmarks

2015 15th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing Pub Date : 2015-05-04 DOI:10.1109/CCGrid.2015.157

Engin Kayraklioglu, T. El-Ghazawi

Pages: 1147-1150

Citations: 3

Abstract

The Partitioned Global Address Space (PGAS) programming model strikes a balance between high performance and locality awareness. As a PGAS language, Chapel relieves programmers of handling the details of data movement in a distributed-memory environment by presenting a flat memory space that is logically partitioned among the executing entities. Traversing such a space requires mapping addresses to the system virtual address space, so this abstraction inevitably incurs major overhead during memory accesses. In this paper, we analyze the extent of this overhead by implementing a microbenchmark that tests the different types of memory access that can be observed in Chapel. We show that, as locality is exploited, speedups of up to 35x can be achieved. This was demonstrated through hand tuning, however; more productive means should be provided to deliver such performance improvements without excessively burdening programmers. We therefore also discuss possibilities for increasing Chapel's performance through standard-library, compiler, runtime, and/or hardware support that handles different types of memory access more efficiently.
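To make the address-mapping cost described in the abstract concrete, the following is a minimal sketch of a toy block-distributed array. It is not Chapel code and does not reflect Chapel's runtime or the paper's benchmark; the names `BlockDist`, `to_local`, `local_range`, `sum_generic`, and `sum_local` are all hypothetical and were chosen only for illustration. The sketch contrasts a traversal that translates every global index with one that walks each partition directly.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class BlockDist:
    """Hypothetical block distribution of n elements over num_locales partitions."""
    n: int
    num_locales: int

    @property
    def block(self) -> int:
        # Ceiling division: each partition owns at most `block` consecutive elements.
        return -(-self.n // self.num_locales)

    def to_local(self, i: int) -> tuple[int, int]:
        """Map a global index to (owning partition, offset inside that partition)."""
        if not 0 <= i < self.n:
            raise IndexError(i)
        return divmod(i, self.block)

    def local_range(self, locale: int) -> range:
        """Global indices owned by the given partition (may be empty)."""
        lo = locale * self.block
        return range(min(lo, self.n), min(lo + self.block, self.n))


def sum_generic(dist: BlockDist, chunks: list[list[int]]) -> int:
    """Generic traversal: every access pays for the global-to-local mapping."""
    total = 0
    for i in range(dist.n):
        locale, offset = dist.to_local(i)
        total += chunks[locale][offset]
    return total


def sum_local(dist: BlockDist, chunks: list[list[int]]) -> int:
    """Locality-aware traversal: each partition is walked directly, with no mapping."""
    return sum(sum(chunk) for chunk in chunks)


# Assumed example data: 10 elements (the squares of their global indices)
# spread over 4 partitions.
dist = BlockDist(n=10, num_locales=4)
chunks = [[g * g for g in dist.local_range(p)] for p in range(dist.num_locales)]
```

Both functions return the same sum; the difference is that `sum_generic` performs a `divmod` and two indexing steps per element, whereas `sum_local` never translates an index. This only illustrates the kind of per-access work that a locality-aware rewrite can avoid, and says nothing about the magnitude of the speedups reported in the paper.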
