Susana Rebolledo Ruiz, Borja Perez, Jose Luis Bosque, Peter Hsu
arXiv - CS - Performance, published 2024-07-22. DOI: arxiv-2407.15440 (https://doi.org/arxiv-2407.15440)
The Bicameral Cache: a split cache for vector architectures
The Bicameral Cache is a cache organization proposal for a vector
architecture that segregates data according to their access type,
distinguishing scalar from vector references. Its aim is to prevent the two types of
references from interfering with each other's data locality, with a special focus
on the performance of vector references. The proposed system
incorporates an additional, non-polluting prefetching mechanism to help
populate the long vector cache lines in advance to increase the hit rate by
further exploiting the spatial locality of vector data. The evaluation was
conducted on the Cavatools simulator, comparing its performance with that of a
conventional cache over several typical vector benchmarks and several vector
lengths. The results show that the proposed cache speeds up
stride-1 vector benchmarks while hardly affecting non-stride-1 ones. In addition,
the prefetching feature consistently provided an additional benefit.
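
The interference that a split organization is meant to avoid can be illustrated with a toy model. The sketch below is not the authors' design or their Cavatools setup: the class names, the fully associative LRU policy, the cache sizes, and the synthetic trace are all assumptions chosen for illustration, and the prefetching mechanism is not modelled. It compares a unified cache with a split cache of the same total capacity, where references are routed by access type and the vector side uses longer lines.

```python
from collections import OrderedDict


class LRUCache:
    """Fully associative LRU cache that tracks line tags only (assumed policy)."""

    def __init__(self, num_lines, line_bytes):
        self.num_lines = num_lines
        self.line_bytes = line_bytes
        self.lines = OrderedDict()  # tag -> True, ordered from LRU to MRU
        self.hits = 0
        self.misses = 0

    def access(self, addr):
        tag = addr // self.line_bytes
        if tag in self.lines:
            self.lines.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        self.lines[tag] = True
        if len(self.lines) > self.num_lines:
            self.lines.popitem(last=False)  # evict the least recently used line
        return False


class SplitCache:
    """Hypothetical split cache: routes each reference by its access type."""

    def __init__(self, scalar, vector):
        self.scalar = scalar
        self.vector = vector

    def access(self, addr, is_vector):
        target = self.vector if is_vector else self.scalar
        return target.access(addr)

    @property
    def hits(self):
        return self.scalar.hits + self.vector.hits


def make_trace(iters=100, chunk_bytes=256, elem_bytes=8):
    """Synthetic trace: two scalar reads, then a stride-1 vector chunk, per iteration."""
    refs = []
    vector_base = 1 << 20  # keeps vector data far away from the scalar addresses
    for i in range(iters):
        refs.append((0, False))
        refs.append((64, False))
        start = vector_base + i * chunk_bytes
        for off in range(0, chunk_bytes, elem_bytes):
            refs.append((start + off, True))
    return refs


def run():
    trace = make_trace()
    # Unified cache: 4 lines x 64 B = 256 B, shared by both access types.
    unified = LRUCache(num_lines=4, line_bytes=64)
    for addr, _ in trace:
        unified.access(addr)
    # Split cache, same 256 B total: 2 x 64 B scalar lines + 1 x 128 B vector line.
    split = SplitCache(LRUCache(2, 64), LRUCache(1, 128))
    for addr, is_vector in trace:
        split.access(addr, is_vector)
    return unified, split


if __name__ == "__main__":
    unified, split = run()
    print("unified hits:", unified.hits)
    print("split hits:", split.hits)
```

In this trace the streaming vector data evicts both scalar lines from the unified cache on every iteration, whereas the split cache keeps them resident after the first iteration, and the longer vector lines halve the number of vector misses. The numbers reflect only this synthetic trace and say nothing about the results reported in the paper.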