S. Rethinagiri, Oscar Palomar, J. Moreno, O. Unsal, A. Cristal
2015 IEEE Symposium in Low-Power and High-Speed Chips (COOL CHIPS XVIII), published 2015-04-13
DOI: 10.1109/CoolChips.2015.7158532 (https://doi.org/10.1109/CoolChips.2015.7158532)
An energy efficient hybrid FPGA-GPU based embedded platform to accelerate face recognition application
Face recognition is now widely used in industries such as traffic, safety, and medical engineering. In this paper, we propose a power- and energy-efficient heterogeneous platform to accelerate face recognition applications. The novel hybrid platform consists of a Xilinx Zynq (ARM+FPGA) and an NVIDIA Jetson TK1 (ARM+GPU) coupled with a PCIe card. For this application, we optimized local binary pattern and eigenvalue based face detection and recognition, achieving a speedup of 69x over sequential execution on the ARM core, 4.8x over the Zynq platform (ARM+FPGA), and 3.2x over the NVIDIA platform (ARM+GPU), along with 40% better energy efficiency than sequential execution.
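The abstract names the local binary pattern (LBP) operator without describing it. For readers unfamiliar with the technique, the following is a minimal sketch of the textbook 3x3 LBP operator in plain Python. It is not the authors' optimized FPGA or GPU implementation, which the abstract does not detail. The function names `lbp_code` and `lbp_histogram`, the clockwise bit ordering, and the `>=` comparison are assumptions chosen for illustration.

```python
def lbp_code(image, row, col):
    """Return the 8-bit basic LBP code for the interior pixel at (row, col).

    `image` is a list of rows of grey-level intensities (hypothetical layout).
    """
    center = image[row][col]
    # Neighbours visited clockwise, starting at the top-left pixel.
    # This ordering is an assumption; any fixed ordering works if used consistently.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for bit, (dr, dc) in enumerate(offsets):
        # Set the bit when the neighbour is at least as bright as the centre.
        if image[row + dr][col + dc] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Count LBP codes over all interior pixels (border pixels are skipped)."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            hist[lbp_code(image, r, c)] += 1
    return hist
```

Each pixel's code depends only on its own 3x3 neighbourhood, so the per-pixel work is independent across the image. This is the general property that makes LBP a natural candidate for parallel hardware such as an FPGA or a GPU.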