On The Efficiency of Heterogeneous System Architecture for Image Processing
S. Chetty, S. Winberg
2020 Conference on Information Communications Technology and Society (ICTAS), published 2020-03-01
DOI: https://doi.org/10.1109/ICTAS47918.2020.233989
Abstract
Graphics Processing Unit (GPU)-based image processing algorithms have previously been developed to exploit the highly parallel nature of GPUs. However, these algorithms still suffer from high programming complexity, relatively low device utilisation, and difficulty integrating into larger systems. In this paper, a set of image processing modules has been developed to take advantage of the computational characteristics of a System on a Chip (SoC) that contains both a GPU and a Central Processing Unit (CPU) with fine-grained shared virtual memory capabilities. Using shared memory simplifies design and removes the latency and bandwidth constraints associated with discrete GPUs on the Peripheral Component Interconnect Express (PCIe) bus. These modules feature a simple, composite design that improves upon previously developed algorithms by running each discrete stage of an algorithm on the portion of the SoC best suited to it. This allows greater efficiency and lower code complexity than more expensive discrete-GPU-based alternatives.
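The transfer argument in the abstract can be illustrated with a rough cost model. The sketch below is not the authors' implementation: the stage names, their CPU/GPU placement, and the bandwidth and latency figures are all hypothetical assumptions chosen for illustration. It charges one whole-image copy for every hand-off between CPU and GPU in a discrete-GPU setup and, as a simplification, charges nothing when both devices share virtual memory.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    name: str    # hypothetical stage label
    device: str  # "cpu" or "gpu"


def count_device_switches(stages):
    """Number of adjacent stage pairs that run on different devices."""
    return sum(1 for a, b in zip(stages, stages[1:]) if a.device != b.device)


def copy_overhead_seconds(stages, image_bytes, bandwidth_bytes_per_s,
                          latency_s, shared_memory):
    """Estimated time spent moving the image between devices.

    With shared virtual memory both devices address the same buffer, so this
    simple model charges nothing (it ignores cache-coherence costs). With a
    discrete GPU, every CPU<->GPU hand-off is charged one full-image transfer.
    """
    if shared_memory:
        return 0.0
    per_copy = latency_s + image_bytes / bandwidth_bytes_per_s
    return count_device_switches(stages) * per_copy


# Hypothetical three-stage pipeline; names and placement are illustrative only.
PIPELINE = [
    Stage("decode", "cpu"),
    Stage("filter", "gpu"),
    Stage("encode", "cpu"),
]

IMAGE_BYTES = 1920 * 1080 * 4  # one 1080p frame, 4 bytes per pixel
ASSUMED_BANDWIDTH = 8e9        # assumed effective bytes/s, not a measured value
ASSUMED_LATENCY = 10e-6        # assumed per-transfer latency in seconds

discrete = copy_overhead_seconds(
    PIPELINE, IMAGE_BYTES, ASSUMED_BANDWIDTH, ASSUMED_LATENCY, shared_memory=False)
shared = copy_overhead_seconds(
    PIPELINE, IMAGE_BYTES, ASSUMED_BANDWIDTH, ASSUMED_LATENCY, shared_memory=True)

print(f"discrete GPU copy overhead:  {discrete * 1e3:.3f} ms")
print(f"shared memory copy overhead: {shared * 1e3:.3f} ms")
```

Under this model the overhead grows with the number of CPU/GPU hand-offs, which is why a composite design that alternates between devices is expected to benefit most from shared memory. Real figures depend on the hardware and should be measured rather than taken from these assumed constants.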