PARS: A Pattern-Aware Spatial Data Prefetcher Supporting Multiple Region Sizes
Authors: Yiquan Lin; Wenhai Lin; Jiexiong Xu; Yiquan Chen; Zhen Jin; Jingchang Qin; Jiahao He; Shishun Cai; Yuzhong Zhang; Zonghui Wang; Wenzhi Chen
Journal: IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 43, no. 11, pp. 3638-3649 (published 2024-11-06)
DOI: 10.1109/TCAD.2024.3442981
URL: https://ieeexplore.ieee.org/document/10745807/
Hardware data prefetching is a well-studied technique for bridging the processor-memory performance gap. Bit-pattern-based prefetchers are among the most promising spatial data prefetchers and achieve substantial performance gains. In these prefetchers, the region size is a crucial parameter: it denotes the amount of memory that a single pattern can record or a single prediction can prefetch. However, existing bit-pattern-based prefetchers support only one fixed region size. Our experiment shows that a fixed region size cannot meet the requirements of numerous applications and leads to suboptimal performance and high hardware overhead. In this article, we propose PARS, a pattern-aware spatial data prefetcher supporting multiple region sizes. Supporting multiple region sizes lets PARS improve application performance while reducing hardware overhead. Moreover, PARS dynamically switches to an appropriate region size for each pattern through an adaptive region-size (RS) switching mechanism. We evaluated PARS on numerous workloads. The results show that PARS provides an average performance improvement of 40.6% over a baseline with no data prefetchers and, in the single-core system, outperforms two state-of-the-art prefetchers: Bingo by 2.1% (up to 24.4%) and Pythia by 3.9% (up to 111.2%). In the four-core system, PARS outperforms Bingo by 5.0% (up to 66.0%) and Pythia by 5.4% (up to 177.9%).
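To make the role of the region size concrete, here is a minimal sketch of how a generic bit-pattern footprint could be recorded for a given region size. It is not the PARS design, which the abstract does not detail. The 64-byte cache line, the function names, and the access stream are all assumptions made for illustration.

```python
CACHE_LINE_BYTES = 64  # assumed cache-line size; not stated in the abstract


def region_footprints(addresses, region_bytes):
    """Return {region base address: bit pattern of touched cache lines}.

    Bit i of a pattern is set when the i-th cache line of that region
    was accessed. Hypothetical helper, for illustration only.
    """
    lines_per_region = region_bytes // CACHE_LINE_BYTES
    footprints = {}
    for addr in addresses:
        line = addr // CACHE_LINE_BYTES
        region_base = (line // lines_per_region) * region_bytes
        offset = line % lines_per_region
        footprints[region_base] = footprints.get(region_base, 0) | (1 << offset)
    return footprints


def format_pattern(pattern, region_bytes):
    """Render a pattern as a bit string, cache line 0 first."""
    width = region_bytes // CACHE_LINE_BYTES
    return "".join("1" if (pattern >> i) & 1 else "0" for i in range(width))


# Hypothetical access stream: every other cache line across 1 KiB.
accesses = [i * 128 for i in range(8)]

for region_bytes in (512, 1024):
    footprints = region_footprints(accesses, region_bytes)
    for base, pattern in sorted(footprints.items()):
        print(region_bytes, hex(base), format_pattern(pattern, region_bytes))
```

With a 512-byte region, this stream produces two 8-bit patterns, one per region. With a 1024-byte region, it produces a single 16-bit pattern. The same accesses therefore need a different number of pattern entries, each of a different width, depending on the region size, which is the trade-off the abstract refers to.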
About the journal:
The purpose of this Transactions is to publish papers of interest to individuals in the area of computer-aided design of integrated circuits and systems composed of analog, digital, mixed-signal, optical, or microwave components. The aids include methods, models, algorithms, and man-machine interfaces for system-level, physical and logical design including: planning, synthesis, partitioning, modeling, simulation, layout, verification, testing, hardware-software co-design and documentation of integrated circuit and system designs of all complexities. Design tools and techniques for evaluating and designing integrated circuits and systems for metrics such as performance, power, reliability, testability, and security are a focus.