FPGA memory optimization for real-time imaging
D. Houzet, V. Fresse, H. Konik
2016 Conference on Design and Architectures for Signal and Image Processing (DASIP), pp. 176-182, published 2016-10-01
DOI: 10.1109/DASIP.2016.7853816 (https://doi.org/10.1109/DASIP.2016.7853816)
Most advanced driver assistance systems are developed for safety and better driving. Safety systems that rely on image processing, such as the Hough transform, require a lot of memory, and underutilizing that memory can degrade real-time performance. Internal memories on reconfigurable devices such as FPGAs are limited in size, number, and bandwidth. Memory optimization cannot be done solely at the application level. Holistic design-space exploration is necessary to leverage the inherent locality of applications and reduce memory accesses. In this paper, we target the optimization of FPGA internal memories by adding a small register-based multi-ported cache in front of each internal FPGA memory block to increase its bandwidth. The dimensions of this cache are explored according to the locality of the implemented function. The exploration uses a cumulative-write cache that exhibits a 1.5x to 2x speedup compared with the best FPGA implementations. The optimized solution uses the same number of memories and only a few additional registers and LUTs.
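
To make the idea of a cumulative-write cache concrete, the following is a minimal software sketch in Python, not the authors' hardware design. It assumes the cache merges successive increments to the same accumulator address and writes to the backing memory only on eviction. The names `CumulativeWriteCache`, `hough_addresses`, and `compare_write_counts`, the oldest-first replacement policy, the cache size, and the test image are all hypothetical choices made for illustration.

```python
import math
from collections import OrderedDict


class CumulativeWriteCache:
    """Toy model of a small cache that merges increments per address.

    Assumption: pending increments are summed in the cache and reach the
    backing memory only when an entry is evicted or the cache is flushed.
    """

    def __init__(self, memory, num_entries):
        self.memory = memory            # backing store (list of counters)
        self.num_entries = num_entries  # hypothetical cache size
        self.entries = OrderedDict()    # address -> pending increment
        self.memory_writes = 0          # writes that reached the memory

    def increment(self, address, amount=1):
        if address in self.entries:
            # Hit: accumulate in the cache, no memory access needed.
            self.entries[address] += amount
            self.entries.move_to_end(address)
            return
        if len(self.entries) >= self.num_entries:
            self._evict_oldest()
        self.entries[address] = amount

    def _evict_oldest(self):
        address, pending = self.entries.popitem(last=False)
        self.memory[address] += pending
        self.memory_writes += 1

    def flush(self):
        while self.entries:
            self._evict_oldest()


def hough_addresses(points, num_theta, rho_max):
    """Yield the accumulator address of each Hough vote, angle by angle."""
    row_width = 2 * rho_max + 1
    for t in range(num_theta):
        theta = math.pi * t / num_theta
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        for x, y in points:
            rho_bin = round(x * cos_t + y * sin_t) + rho_max
            yield t * row_width + rho_bin


def compare_write_counts(num_theta=8, rho_max=32, num_entries=4):
    """Accumulate the same votes directly and through the cache."""
    points = [(x, 5) for x in range(20)]  # hypothetical edge pixels on a line
    size = num_theta * (2 * rho_max + 1)

    direct = [0] * size
    votes = 0
    for address in hough_addresses(points, num_theta, rho_max):
        direct[address] += 1              # one memory update per vote
        votes += 1

    cached = [0] * size
    cache = CumulativeWriteCache(cached, num_entries)
    for address in hough_addresses(points, num_theta, rho_max):
        cache.increment(address)
    cache.flush()

    return direct, cached, votes, cache.memory_writes
```

Calling `compare_write_counts()` returns both accumulators together with the number of votes and the number of writes that reached the backing memory. In this model, neighboring pixels that vote for the same bin are merged in the cache, so the write count drops whenever the access pattern has locality. How much it drops depends on the loop order and the cache size, which is the kind of trade-off the abstract describes exploring.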