Andrey Prokopenko, Daniel Arndt, Damien Lebrun-Grandié, Bruno Turcksin, Nicholas Frontiere, J. D. Emberson, Michael Buehlmann
arXiv - PHYS - Cosmology and Nongalactic Astrophysics, published 2024-09-16, identifier arxiv-2409.10743
Advances in ArborX to support exascale applications
ArborX is a performance-portable geometric search library developed as part
of the Exascale Computing Project (ECP). In this paper, we explore a
collaboration between ArborX and the cosmological simulation code HACC. Large
cosmological simulations on exascale platforms encounter a bottleneck due to
the in-situ analysis requirements of halo finding, a problem of identifying
dense clusters of dark matter (halos). This problem is solved using DBSCAN, a
density-based clustering algorithm. With each MPI rank handling hundreds
of millions of particles, it is imperative for the DBSCAN implementation to be
efficient. In addition, the requirement to support exascale supercomputers from
different vendors necessitates performance portability of the algorithm. We
describe how this challenge problem guided ArborX development and enhanced the
performance and scope of the library. We explore improvements to the basic
algorithms of the underlying search index that increase performance, and
describe several implementations of DBSCAN in ArborX. Further, we report
the history of the changes in ArborX and their effect on the time to solve a
representative benchmark problem, as well as demonstrate the real-world impact
on production end-to-end cosmology simulations.
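
To make the density-based clustering step concrete, here is a minimal serial sketch of DBSCAN in Python. It is not ArborX's implementation and makes no claim about the paper's algorithms: the function name `dbscan`, the parameters `eps` and `min_pts`, and the brute-force O(n^2) neighbor search are all assumptions chosen for illustration. In a real library, the neighbor search is the step a spatial search index would replace.

```python
import math


def dbscan(points, eps, min_pts):
    """Return one label per point: a cluster id (0, 1, ...) or -1 for noise.

    Illustrative only: brute-force neighbor search, serial union-find.
    """
    n = len(points)

    # eps-neighborhood of every point (each point is its own neighbor).
    neighbors = [
        [j for j in range(n) if math.dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    # A core point has at least min_pts points in its neighborhood.
    is_core = [len(nb) >= min_pts for nb in neighbors]

    # Union-find over point indices.
    parent = list(range(n))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    def union(i, j):
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[max(ri, rj)] = min(ri, rj)

    # Core points within eps of each other belong to the same cluster.
    for i in range(n):
        if is_core[i]:
            for j in neighbors[i]:
                if is_core[j]:
                    union(i, j)

    # Number the clusters in order of first appearance.
    labels = [-1] * n
    cluster_ids = {}
    for i in range(n):
        if is_core[i]:
            labels[i] = cluster_ids.setdefault(find(i), len(cluster_ids))

    # A border point joins the cluster of its first core neighbor;
    # points with no core neighbor stay labeled as noise (-1).
    for i in range(n):
        if not is_core[i]:
            for j in neighbors[i]:
                if is_core[j]:
                    labels[i] = labels[j]
                    break
    return labels


if __name__ == "__main__":
    pts = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(pts, eps=1.5, min_pts=3))
```

With the sample points, the two tight groups of three become clusters 0 and 1, and the isolated point at (50, 50) is labeled noise. Expressing cluster merging as union-find operations keeps the sketch short; at hundreds of millions of particles per rank, the quadratic neighbor search here would be the first thing to replace.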