L. Bauer, Artjom Grudnitsky, Marvin Damschen, Srinivas Rao Kerekare, J. Henkel
2015 13th IEEE Symposium on Embedded Systems For Real-time Multimedia (ESTIMedia), published 2015-12-17
DOI: 10.1109/ESTIMedia.2015.7351762
Floating point acceleration for stream processing applications in dynamically reconfigurable processors
Runtime reconfigurable processors provide a large degree of flexibility that allows them to adapt dynamically to different applications and requirements. They couple a standard processor with a runtime reconfigurable fabric (such as an embedded FPGA) to offload computationally intensive kernels. In this paper, we present the design and architecture of a flexible accelerator for floating point operations in stream processing applications. Integrating it into an existing reconfigurable processor requires managing the frequency mismatch between the sequential processor (high frequency) and the parallel accelerators (low frequencies). The results show that our accelerator, combined with the reconfigurable processor, achieves 63.70× and 3.85× better performance-per-area efficiency than the baseline processor with a soft-float implementation and with a high-performance floating point unit, respectively.
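The abstract reports its results as a ratio of performance-per-area efficiencies but does not define the metric. The sketch below assumes the common reading, performance (for example, throughput) divided by hardware area, and shows how a gain factor of that kind would be computed. The function names and the input values are hypothetical placeholders for illustration and are not taken from the paper.

```python
def perf_per_area(throughput: float, area: float) -> float:
    """Assumed metric: work per unit time divided by hardware area."""
    if area <= 0:
        raise ValueError("area must be positive")
    return throughput / area


def efficiency_gain(throughput_new: float, area_new: float,
                    throughput_base: float, area_base: float) -> float:
    """Ratio of the new design's perf-per-area to the baseline's."""
    return (perf_per_area(throughput_new, area_new)
            / perf_per_area(throughput_base, area_base))


# Hypothetical placeholder values, NOT measurements from the paper:
# a design that is 8x faster but needs 2x the area of the baseline.
gain = efficiency_gain(throughput_new=8.0, area_new=2.0,
                       throughput_base=1.0, area_base=1.0)
print(gain)
```

Under this assumed definition, a speedup only counts as an efficiency gain to the extent that it exceeds the extra area spent on it, which is why the metric suits comparisons between a soft-float baseline, a dedicated floating point unit, and a reconfigurable accelerator.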