嵌入式应用中基于跟踪的分割阵列缓存设计

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools Pub Date : 2010-09-01 DOI:10.1109/DSD.2010.33

A. Tokarnia, Marina Tachibana

{"title":"嵌入式应用中基于跟踪的分割阵列缓存设计","authors":"A. Tokarnia, Marina Tachibana","doi":"10.1109/DSD.2010.33","DOIUrl":null,"url":null,"abstract":"Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to optimize performance and power consumption. In this paper, we propose a method based on the analysis of accesses to vector, arrays, and other complex data structures to design a size-constrained two-partition array cache. This method reorganizes the ways of set-associative arrays caches into partitions with different line sizes and defines array-partition mappings so as to minimize the average memory access energy-delay product. Experimental results have shown that these split array caches have lower average energy-delay product for memory accesses as compared with unified set-associative array caches of the same size. For an MPEG-2 decoder, even with no parallel accesses to cache partitions, the average memory access energy-delay product of an 8K-byte trace-based split array cache is reduced by 50% as compared to that of the unified set-associative array cache with the lowest energy-delay product. If 25% of the accesses occur in pairs, there is an additional reduction of 9%.","PeriodicalId":356885,"journal":{"name":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","volume":"7 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2010-09-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"1","resultStr":"{\"title\":\"Design of Trace-Based Split Array Caches for Embedded Applications\",\"authors\":\"A. Tokarnia, Marina Tachibana\",\"doi\":\"10.1109/DSD.2010.33\",\"DOIUrl\":null,\"url\":null,\"abstract\":\"Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to optimize performance and power consumption. In this paper, we propose a method based on the analysis of accesses to vector, arrays, and other complex data structures to design a size-constrained two-partition array cache. This method reorganizes the ways of set-associative arrays caches into partitions with different line sizes and defines array-partition mappings so as to minimize the average memory access energy-delay product. Experimental results have shown that these split array caches have lower average energy-delay product for memory accesses as compared with unified set-associative array caches of the same size. For an MPEG-2 decoder, even with no parallel accesses to cache partitions, the average memory access energy-delay product of an 8K-byte trace-based split array cache is reduced by 50% as compared to that of the unified set-associative array cache with the lowest energy-delay product. If 25% of the accesses occur in pairs, there is an additional reduction of 9%.\",\"PeriodicalId\":356885,\"journal\":{\"name\":\"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools\",\"volume\":\"7 1\",\"pages\":\"0\"},\"PeriodicalIF\":0.0000,\"publicationDate\":\"2010-09-01\",\"publicationTypes\":\"Journal Article\",\"fieldsOfStudy\":null,\"isOpenAccess\":false,\"openAccessPdf\":\"\",\"citationCount\":\"1\",\"resultStr\":null,\"platform\":\"Semanticscholar\",\"paperid\":null,\"PeriodicalName\":\"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools\",\"FirstCategoryId\":\"1085\",\"ListUrlMain\":\"https://doi.org/10.1109/DSD.2010.33\",\"RegionNum\":0,\"RegionCategory\":null,\"ArticlePicture\":[],\"TitleCN\":null,\"AbstractTextCN\":null,\"PMCID\":null,\"EPubDate\":\"\",\"PubModel\":\"\",\"JCR\":\"\",\"JCRName\":\"\",\"Score\":null,\"Total\":0}","platform":"Semanticscholar","paperid":null,"PeriodicalName":"2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1109/DSD.2010.33","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}

引用次数: 1

摘要

由于许多嵌入式系统执行一组预定义的程序，因此将系统组件调优到应用程序和数据是许多设计技术选择的方法，以优化性能和功耗。在本文中，我们提出了一种基于对向量、数组和其他复杂数据结构的访问分析的方法来设计一个大小受限的双分区数组缓存。该方法将集合关联数组缓存的方式重新组织成不同行大小的分区，并定义数组-分区映射，使平均存储器访问能量延迟积最小。实验结果表明，与相同大小的统一集合关联数组缓存相比，这些分割数组缓存具有更低的内存访问平均能量延迟积。对于MPEG-2解码器，即使没有并行访问缓存分区，与具有最低能量延迟积的统一集关联数组缓存相比，8k字节基于跟踪的分割数组缓存的平均内存访问能量延迟积减少了50%。如果25%的访问是成对进行的，则会额外减少9%。

本文章由计算机程序翻译，如有差异，请以英文原文为准。

查看原文

微信好友朋友圈 QQ好友复制链接

本刊更多论文

Design of Trace-Based Split Array Caches for Embedded Applications

Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to optimize performance and power consumption. In this paper, we propose a method based on the analysis of accesses to vector, arrays, and other complex data structures to design a size-constrained two-partition array cache. This method reorganizes the ways of set-associative arrays caches into partitions with different line sizes and defines array-partition mappings so as to minimize the average memory access energy-delay product. Experimental results have shown that these split array caches have lower average energy-delay product for memory accesses as compared with unified set-associative array caches of the same size. For an MPEG-2 decoder, even with no parallel accesses to cache partitions, the average memory access energy-delay product of an 8K-byte trace-based split array cache is reduced by 50% as compared to that of the unified set-associative array cache with the lowest energy-delay product. If 25% of the accesses occur in pairs, there is an additional reduction of 9%.

求助全文

通过发布文献求助，成功后即可免费获取论文全文。去求助

来源期刊

2010 13th Euromicro Conference on Digital System Design: Architectures, Methods and Tools

自引率

0.00%

发文量

期刊最新文献

A Multicore SDR Architecture for Reconfigurable WiMAX Downlink Design of Testable Universal Logic Gate Targeting Minimum Wire-Crossings in QCA Logic Circuit Low Latency Recovery from Transient Faults for Pipelined Processor Architectures System Level Hardening by Computing with Matrices Reconfigurable Grid Alu Processor: Optimization and Design Space Exploration