Power aware design of second level cache for multicore embedded systems
Authors: M. Rani, A. Asaduzzaman
Published in: Proceedings of the IEEE SoutheastCon 2010 (SoutheastCon)
Publication date: 2010-03-18
DOI: 10.1109/SECON.2010.5453931 (https://doi.org/10.1109/SECON.2010.5453931)
Citations: 3
Abstract
Modern embedded systems that support a wide variety of applications need efficient cache, memory, and storage subsystems. To meet the demand for higher processing speed, embedded systems are being deployed with multicore processors that support parallel and distributed computing. Multiple cores offer many options for organizing multi-level caches, and various cache memory hierarchies are proposed to satisfy the requirements of high-performance, low-power multicore embedded systems. In this paper, we investigate the impact of second-level cache (CL2) organization on the performance and power consumption of multicore embedded systems. We simulate two 4-core architectures, one with a shared CL2 and the other with private CL2s. We use MPEG4, FFT, MI, and DFT applications/algorithms in our experiments. Simulation results show that the mean delay and total power consumption vary significantly with the CL2 organization and the application. With an optimized CL2, total power consumption and mean delay per task can be reduced by up to 43% and 36%, respectively; the optimal choice is a 256 KB CL2 with a 64 B line size and 8-way associativity.
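To make the reported CL2 parameters concrete, the following minimal sketch works out the geometry they imply for a conventional set-associative cache. It is an illustration only, not the authors' simulator: the `CacheGeometry` class and its method names are hypothetical, 256 KB is assumed to mean 256 × 1024 bytes, all sizes are assumed to be powers of two, and the abstract does not say whether the 256 KB figure refers to the shared CL2 or to each private CL2.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class CacheGeometry:
    """Hypothetical helper: geometry of a set-associative cache (power-of-two sizes assumed)."""
    size_bytes: int   # total capacity
    line_bytes: int   # cache line (block) size
    ways: int         # associativity

    @property
    def num_lines(self) -> int:
        # Total number of cache lines = capacity / line size.
        return self.size_bytes // self.line_bytes

    @property
    def num_sets(self) -> int:
        # Each set holds `ways` lines.
        return self.num_lines // self.ways

    @property
    def offset_bits(self) -> int:
        # Bits needed to address a byte within a line.
        return (self.line_bytes - 1).bit_length()

    @property
    def index_bits(self) -> int:
        # Bits needed to select a set.
        return (self.num_sets - 1).bit_length()

    def split(self, address: int) -> Tuple[int, int, int]:
        # Decompose a byte address into (tag, set index, line offset).
        offset = address & (self.line_bytes - 1)
        index = (address >> self.offset_bits) & (self.num_sets - 1)
        tag = address >> (self.offset_bits + self.index_bits)
        return tag, index, offset


# Parameters taken from the abstract; the 256 * 1024 interpretation is an assumption.
cl2 = CacheGeometry(size_bytes=256 * 1024, line_bytes=64, ways=8)

print(cl2.num_lines)    # 4096 lines
print(cl2.num_sets)     # 512 sets
print(cl2.offset_bits)  # 6 offset bits
print(cl2.index_bits)   # 9 index bits
```

Under these assumptions the configuration has 256 × 1024 / 64 = 4096 lines grouped into 4096 / 8 = 512 sets, so an address uses 6 offset bits and 9 index bits, with the remaining high-order bits forming the tag.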