Semantic-Preserving Image Compression

2020 IEEE International Conference on Image Processing (ICIP). Pub Date: 2020-10-01. DOI: 10.1109/ICIP40778.2020.9191247

Neel Patwa, Nilesh A. Ahuja, Srinivasa Somayazulu, Omesh Tickoo, S. Varadarajan, S. Koolagudi


Citations: 24

Abstract




Video traffic comprises a large majority of the total traffic on the internet today. Uncompressed visual data requires a very large data rate, so lossy compression techniques are employed to keep the data rate manageable. Increasingly, a significant amount of the visual data being generated is consumed by analytics (such as classification and detection) residing in the cloud. Image and video compression can produce visual artifacts, especially at lower data rates, which can result in a significant drop in performance on such analytic tasks. Moreover, standard image and video compression techniques aim to optimize perceptual quality for human consumption by allocating more bits to perceptually significant features of the scene. However, these features may not necessarily be the most suitable ones for semantic tasks. We present here an approach to compress visual data in order to maximize performance on a given analytic task. We train a deep auto-encoder using a multi-task loss to learn the relevant embeddings. An approximate differentiable model of the quantizer is used during training, which helps boost the accuracy during inference. We apply our approach to an image classification problem and show that, for a given level of compression, it achieves higher classification accuracy than that obtained by performing classification on images compressed using JPEG. Our approach also outperforms the relevant state-of-the-art approach by a significant margin.
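The abstract names two ingredients, a multi-task loss and a differentiable stand-in for the quantizer, without giving their exact form. The following is a minimal sketch under stated assumptions, not the authors' implementation: additive uniform noise is swapped in as one common training-time surrogate for rounding, and the loss is assumed to be a weighted sum of reconstruction error and classification cross-entropy. All function names, the `step` parameter, and the weight `alpha` are hypothetical.

```python
import math
import random


def quantize_hard(z, step=1.0):
    """Inference-time quantizer: snap each value to the nearest multiple of step."""
    return [step * math.floor(v / step + 0.5) for v in z]


def quantize_soft(z, step=1.0, rng=random):
    """Training-time surrogate (an assumption, not taken from the paper):
    add uniform noise in [-step/2, step/2] instead of rounding, so the
    mapping from z to the output keeps a useful gradient."""
    return [v + rng.uniform(-step / 2, step / 2) for v in z]


def mse(x, x_hat):
    """Mean squared reconstruction error between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(x, x_hat)) / len(x)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one example, using a stable log-sum-exp."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_sum - logits[label]


def multi_task_loss(x, x_hat, logits, label, alpha=0.5):
    """Assumed form: weighted sum of reconstruction and classification terms."""
    return alpha * mse(x, x_hat) + (1 - alpha) * cross_entropy(logits, label)


if __name__ == "__main__":
    z = [0.2, 1.7, -0.6]              # hypothetical latent embedding
    print(quantize_hard(z))           # values snapped to the grid
    print(quantize_soft(z))           # noisy values used only during training
    print(multi_task_loss([1.0, 2.0], [1.0, 2.0], [0.0, 0.0], 0))
```

In a real training loop these pieces would sit inside an automatic-differentiation framework, with `quantize_soft` applied to the encoder output during training and `quantize_hard` used at inference. The balance between the two loss terms would need to be tuned for the target task.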

