A study of the use of SIMD instructions for two image processing algorithms
Authors: E. Welch, D. Patru, E. Saber, K. Bengtson
Venue: 2012 Western New York Image Processing Workshop
Publication date: 2012-11-01
DOI: 10.1109/WNYIPW.2012.6466650 (https://doi.org/10.1109/WNYIPW.2012.6466650)
Citations: 12
Abstract
Most image processing algorithms are parallelizable, i.e., the calculation of one pixel does not affect that of another. SIMD architectures, including Intel's WMMX and SSE and ARM's NEON, can exploit this fact by processing multiple pixels at a time, which can result in significant speedups. This study investigates the use of NEON SIMD instructions for two image processing algorithms. Both algorithms are altered to process four pixels at a time, which gives a theoretical speedup factor of four. In addition, parts of the original implementation have been replaced with inline functions or modified at the assembly code level. Experimental benchmark data show the actual execution speed to be between two and three times higher than that of the original reference. These results demonstrate that SIMD instructions can significantly speed up image processing algorithms through proper code manipulations.
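The abstract does not name the two algorithms or show their code, so the following is only a minimal sketch of the general idea of processing four pixels per instruction with NEON. It assumes pixels stored as 32-bit floats and uses a simple per-pixel gain as a stand-in operation; the function name `scale_pixels` and its parameters are hypothetical and not taken from the paper. A scalar loop handles leftover pixels and also serves as the fallback when NEON is not available.

```c
#include <stddef.h>

#if defined(__ARM_NEON)
#include <arm_neon.h>
#endif

/* Hypothetical stand-in kernel: multiply every pixel by a constant gain.
 * Each output pixel depends only on its own input pixel, so four pixels
 * can be computed by a single SIMD instruction. */
void scale_pixels(const float *src, float *dst, size_t n, float gain)
{
    size_t i = 0;

#if defined(__ARM_NEON)
    /* Vector path: one 128-bit register holds four 32-bit float pixels. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t px = vld1q_f32(src + i); /* load four pixels      */
        px = vmulq_n_f32(px, gain);          /* scale all four lanes  */
        vst1q_f32(dst + i, px);              /* store four pixels     */
    }
#endif

    /* Scalar path: the remaining n % 4 pixels on NEON targets,
     * or every pixel when NEON is not available. */
    for (; i < n; ++i) {
        dst[i] = src[i] * gain;
    }
}
```

The vector loop issues one load, one multiply, and one store per four pixels, which is where the theoretical factor of four comes from. Loads, stores, and loop overhead do not shrink by the same factor, which is one plausible reason a measured speedup can fall below the theoretical one.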