Quantifying Energy Use in Dense Shared Memory HPC Node
Milos Puzovic, Srilatha Manne, Shay GalOn, M. Ono
2016 4th International Workshop on Energy Efficient Supercomputing (E2SC), 2016-11-13
DOI: 10.1109/E2SC.2016.7 (https://doi.org/10.1109/E2SC.2016.7)
In this paper we introduce a novel, dense, system-on-chip many-core Lenovo NeXtScale System® server based on the Cavium THUNDERX® ARMv8 processor that was designed for performance, energy efficiency and programmability. The THUNDERX processor was designed to scale up to 96 cores in a cache-coherent, shared memory architecture. Furthermore, this hardware system has a power interface board (PIB) that measures, with high accuracy, the power draw across the server board in the NeXtScale™ chassis. We use data obtained from the PIB to measure the energy use of the PARSEC and Splash-2 benchmarks, and we demonstrate how to use the hardware counters available on the THUNDERX processor to quantify the amount of energy used by different aspects of shared memory programming, such as cache-coherent communication. We show that the energy required to keep caches coherent is negligible and demonstrate that the shared memory programming paradigm is a viable candidate for future energy-aware HPC designs.
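The step from sampled power draw to a benchmark's energy use can be illustrated with a minimal sketch. It assumes the power readings are available as timestamped samples in watts; the abstract does not describe the PIB's actual output format, so the function name `energy_joules` and the sample values are hypothetical, and trapezoidal integration is one common choice rather than the authors' stated method.

```python
from typing import Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate sampled power (watts) over time (seconds) with the trapezoidal rule.

    Hypothetical helper: the real PIB data format is not given in the abstract.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        if dt < 0:
            raise ValueError("timestamps must be non-decreasing")
        # Area of the trapezoid between two consecutive samples.
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


# Made-up samples for illustration only; not measurements from the paper.
samples_t = [0.0, 1.0, 2.0, 3.0]
samples_p = [100.0, 120.0, 120.0, 100.0]
print(energy_joules(samples_t, samples_p))  # prints 340.0
```

Subtracting the energy of an idle interval of the same length would give a rough estimate of the energy attributable to the benchmark itself, again under the assumption that idle power is stable.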