Analysis of Content-Aware Image Compression with VGG16
Alen Selimović, Blaž Meden, P. Peer, A. Hladnik
2018 IEEE International Work Conference on Bioinspired Intelligence (IWOBI), published 2018-07-01
DOI: 10.1109/IWOBI.2018.8464188 (https://doi.org/10.1109/IWOBI.2018.8464188)
Citations: 12
Abstract
Content-aware compression based on saliency maps aims to improve the interpretability of an image by encoding the more relevant image regions at a higher quality than the rest of the image. This paper revisits two convolutional neural network (CNN) models based on VGG16, multi-structure region of interest (MS-ROI) and class activation map (CAM), both of which enable the localization of salient image regions. While the MS-ROI model can localize multiple salient image regions, the CAM model tends to localize only the most relevant class. We use the contextual information provided by the resulting saliency maps to guide the compression. Encoding more important image regions at a higher bitrate and less important ones at a lower bitrate yields different compression qualities for the regions of interest and the background, while also achieving smooth transitions from salient to non-salient regions. The performance of both models is evaluated on images from the MIT Saliency Benchmark dataset and the General-100 dataset, and the compression results are compared with standard JPEG compression at different quality factors. Experimental results show that, for files of approximately the same size, the compression methods based on the two CNN models outperform standard JPEG compression. Compared with compression based on the CAM model, compression based on the MS-ROI model achieves a higher PSNR and better visual quality of the resulting images.
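
The two ideas the abstract relies on, saliency-guided quality selection and PSNR-based evaluation, can be illustrated with a minimal sketch. This is not the authors' pipeline: the helper names `quality_map` and `psnr`, the parameters `q_min`, `q_max`, and `levels`, and the choice of five discrete quality levels are all assumptions made for illustration, and the saliency values are assumed to be already normalized to [0, 1] per image block.

```python
import math


def quality_map(saliency, q_min=30, q_max=90, levels=5):
    """Map per-block saliency in [0, 1] to a JPEG-style quality factor.

    Hypothetical helper: using several discrete levels instead of two is
    one way to avoid an abrupt quality jump at the edge of a salient region.
    """
    if levels < 2:
        raise ValueError("levels must be at least 2")
    step = (q_max - q_min) / (levels - 1)
    result = []
    for row in saliency:
        out_row = []
        for s in row:
            s = min(max(s, 0.0), 1.0)        # clamp to [0, 1]
            level = round(s * (levels - 1))  # nearest discrete level
            out_row.append(round(q_min + level * step))
        result.append(out_row)
    return result


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized 2-D images."""
    squared_error = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for a, b in zip(ref_row, test_row):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)
```

For instance, `quality_map([[0.0, 0.5, 1.0]])` assigns quality 30 to the background block, 60 to the intermediate one, and 90 to the most salient one. A real encoder would then compress each block (or region) at its assigned quality and compare the decoded image against the original with `psnr` at a matched file size.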