Soenke Michalik, S. Michalik, J. Naghmouchi, Mladen Berekovic
Title: Real-time smart stereo camera based on FPGA-SoC
Published in: 2017 IEEE-RAS 17th International Conference on Humanoid Robotics (Humanoids), 2017-11-01
DOI: 10.1109/HUMANOIDS.2017.8246891 (https://doi.org/10.1109/HUMANOIDS.2017.8246891)
Stereo image processing is one of the most demanding tasks in 3D computer vision and robot vision, requiring high-performance computing capabilities within embedded systems. Real-time constraints for autonomous vehicles such as humanoid robots lead to hardware acceleration approaches for high-resolution stereo imaging in human-like vision systems, where FPGA devices are commonly employed to handle very high sensor data rates. This work presents a real-time smart stereo camera system that implements the full stereo processing pipeline in a single FPGA device. We introduce the novel memory-optimized stereo processing algorithm "Sparse Retina Census Correlation" (SRCC), which combines two well-established window-based stereo matching approaches: a Sum of Absolute Differences (SAD) of Sobel-filtered images and a Sum of Hamming Distances (SHD) using a modified retina-based Census Transform, for increased robustness to lighting variations and for high accuracy. A color rectification module has been implemented to keep pace with the high frame rate of the stereo pipeline, calculating image transformations and rectified pixel coordinates in real time from parameters for camera intrinsics, image rotation, image distortion, and image projection. In addition, multiple post-processing algorithms, such as texture filtering, uniqueness filtering, speckle removal, and disparity-to-depth conversion, have been implemented to further enhance the output. The presented smart camera solution has demonstrated real-time stereo processing of 1280×720 pixel depth images with 256 disparities on a Zynq XC7Z030 FPGA device at 60 fps. Thanks to its universal USB 3.0 UVC interface and onboard depth calculation, it can replace RGB-D 3D sensors while offering improved image quality and outdoor performance. The camera can easily be used with ROS-enabled robots and in automotive or industrial applications.
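
To make the matching idea in the abstract more concrete, the following is a minimal software sketch of a combined window cost: a SAD over horizontal Sobel responses plus a weighted SHD over sparse census codes, followed by winner-take-all disparity selection and the standard pinhole disparity-to-depth relation Z = f·B/d. It is not the authors' SRCC implementation. The sampling pattern `SPARSE_OFFSETS`, the window radius, the `weight` factor, and all function names are assumptions chosen for illustration, and the synthetic image pair exists only to exercise the code.

```python
import random

# Hypothetical sparse sampling pattern (dy, dx); the paper's retina layout is not given here.
SPARSE_OFFSETS = [(-2, 0), (-1, -1), (-1, 1), (0, -2), (0, 2), (1, -1), (1, 1), (2, 0)]

def sobel_x(img, y, x):
    """Horizontal 3x3 Sobel response at (y, x)."""
    return ((img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1])
            - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1]))

def census(img, y, x):
    """Bit vector with a 1 wherever a sampled neighbour is darker than the centre."""
    bits = 0
    for dy, dx in SPARSE_OFFSETS:
        bits = (bits << 1) | int(img[y + dy][x + dx] < img[y][x])
    return bits

def cost(left, right, y, x, d, radius=1, weight=4):
    """SAD of Sobel responses plus weighted SHD of census codes over a window.

    The caller must keep (y, x) at least radius + 2 pixels away from the image
    border, and x - d as well, because no bounds checking is done here.
    """
    total = 0
    for wy in range(-radius, radius + 1):
        for wx in range(-radius, radius + 1):
            ly, lx, rx = y + wy, x + wx, x + wx - d
            total += abs(sobel_x(left, ly, lx) - sobel_x(right, ly, rx))
            total += weight * bin(census(left, ly, lx) ^ census(right, ly, rx)).count("1")
    return total

def best_disparity(left, right, y, x, max_d):
    """Winner-take-all: return the disparity with the lowest combined cost."""
    return min(range(max_d), key=lambda d: cost(left, right, y, x, d))

def disparity_to_depth(d, focal_px, baseline_m):
    """Pinhole stereo relation Z = f * B / d, with d in pixels and d > 0."""
    return focal_px * baseline_m / d

# Synthetic test pair: the right image is the left image shifted by a known disparity.
HEIGHT, WIDTH, TRUE_D = 9, 32, 3
rng = random.Random(0)
left = [[rng.randrange(256) for _ in range(WIDTH)] for _ in range(HEIGHT)]
right = [[left[y][x + TRUE_D] if x + TRUE_D < WIDTH else 0 for x in range(WIDTH)]
         for y in range(HEIGHT)]

estimated = best_disparity(left, right, 4, 16, max_d=8)
print("estimated disparity:", estimated)
print("depth in metres:", disparity_to_depth(estimated, focal_px=800.0, baseline_m=0.1))
```

The focal length and baseline in the last line are placeholder values, not parameters of the described camera. A hardware pipeline would stream these computations over line buffers rather than recompute each window, which this sketch does not attempt to model.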