Chia-Jung Chang, Yin-Chi Peng, Chien-Chih Chen, Tien-Fu Chen, P. Yew
VLSI Design, Automation and Test (VLSI-DAT), published 2015-04-27
DOI: 10.1109/VLSI-DAT.2015.7114578 (https://doi.org/10.1109/VLSI-DAT.2015.7114578)
Adaptive granularity and coordinated management for timely prefetching in multi-core systems
Over the last decade, a variety of hardware prefetching techniques have been proposed to improve system performance. However, untimely prefetching may pollute caches and result in significant performance degradation. In this work, we introduce Adaptive Granularity and coordinated Prefetching (AGP), which combines a coarse-grained and a fine-grained prefetching mechanism to provide a better caching environment for parallel applications. AGP adjusts the prefetch degree and chooses the prefetch location, and tries to minimize the interference caused by the prefetcher of each core. AGP can produce more timely prefetch requests, reducing cache pollution and contention. Across a variety of PARSEC benchmarks, AGP improves performance by 6.5% (up to 36%) on a 4-core system compared to a system without prefetching.
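The abstract does not describe how AGP adjusts the prefetch degree or chooses the prefetch location, so the following is only a minimal sketch of the general idea of feedback-driven degree adjustment and fill-level selection, not the authors' mechanism. The counter names, the thresholds, the degree bounds, and the `adjust` function are all hypothetical and chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class IntervalStats:
    """Per-core prefetch counters for one sampling interval (hypothetical)."""
    issued: int  # prefetches issued in the interval
    timely: int  # prefetched lines that arrived before the demand access
    late: int    # prefetched lines that the demand access had to wait for
    # issued - timely - late = lines never used (potential cache pollution)


# Assumed bounds and thresholds; a real design would tune or derive these.
MIN_DEGREE, MAX_DEGREE = 1, 8
HIGH_ACCURACY, LOW_ACCURACY = 0.75, 0.40
LATE_FRACTION = 0.25


def adjust(degree: int, stats: IntervalStats) -> tuple[int, str]:
    """Return (new_degree, fill_level) to use in the next interval."""
    if stats.issued == 0:
        return degree, "L2"  # no feedback yet: keep degree, fill conservatively

    used = stats.timely + stats.late
    accuracy = used / stats.issued
    late_ratio = stats.late / used if used else 0.0

    if accuracy < LOW_ACCURACY:
        # Mostly useless prefetches: throttle and keep them out of the L1.
        return max(MIN_DEGREE, degree // 2), "L2"
    if accuracy >= HIGH_ACCURACY and late_ratio > LATE_FRACTION:
        # Accurate but late: prefetch further ahead, fill close to the core.
        return min(MAX_DEGREE, degree * 2), "L1"

    # Otherwise keep the degree; only accurate streams fill the L1.
    return degree, "L1" if accuracy >= HIGH_ACCURACY else "L2"
```

In this sketch, each core would call `adjust` once per sampling interval with its own counters, so an inaccurate stream on one core is throttled without affecting the degree used by the others.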