Human–Machine Collaborative Image Compression Method Based on Implicit Neural Representations

IF 3.7 2区 工程技术 Q2 ENGINEERING, ELECTRICAL & ELECTRONIC IEEE Journal on Emerging and Selected Topics in Circuits and Systems Pub Date : 2024-04-09 DOI:10.1109/JETCAS.2024.3386639
Huanyang Li;Xinfeng Zhang
{"title":"Human–Machine Collaborative Image Compression Method Based on Implicit Neural Representations","authors":"Huanyang Li;Xinfeng Zhang","doi":"10.1109/JETCAS.2024.3386639","DOIUrl":null,"url":null,"abstract":"With the explosive increase in the volume of images intended for analysis by AI, image coding for machine have been proposed to transmit information in a machine-interpretable format, thereby enhancing image compression efficiency. However, such efficient coding schemes often lead to issues like loss of image details and features, and unclear semantic information due to high data compression ratio, making them less suitable for human vision domains. Thus, it is a critical problem to balance image visual quality and machine vision accuracy at a given compression ratio. To address these issues, we introduce a human-machine collaborative image coding framework based on Implicit Neural Representations (INR), which effectively reduces the transmitted information for machine vision tasks at the decoding side while maintaining high-efficiency image compression for human vision against INR compression framework. To enhance the model’s perception of images for machine vision, we design a semantic embedding enhancement module to assist in understanding image semantics. Specifically, we employ the Swin Transformer model to initialize image features, ensuring that the embedding of the compression model are effectively applicable to downstream visual tasks. Extensive experimental results demonstrate that our method significantly outperforms other image compression methods in classification tasks while ensuring image compression efficiency.","PeriodicalId":48827,"journal":{"name":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems","volume":"14 2","pages":"198-208"},"PeriodicalIF":3.7000,"publicationDate":"2024-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"IEEE Journal on Emerging and Selected Topics in Circuits and Systems","FirstCategoryId":"5","ListUrlMain":"https://ieeexplore.ieee.org/document/10495030/","RegionNum":2,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q2","JCRName":"ENGINEERING, ELECTRICAL & ELECTRONIC","Score":null,"Total":0}
引用次数: 0

Abstract

With the explosive increase in the volume of images intended for analysis by AI, image coding for machine have been proposed to transmit information in a machine-interpretable format, thereby enhancing image compression efficiency. However, such efficient coding schemes often lead to issues like loss of image details and features, and unclear semantic information due to high data compression ratio, making them less suitable for human vision domains. Thus, it is a critical problem to balance image visual quality and machine vision accuracy at a given compression ratio. To address these issues, we introduce a human-machine collaborative image coding framework based on Implicit Neural Representations (INR), which effectively reduces the transmitted information for machine vision tasks at the decoding side while maintaining high-efficiency image compression for human vision against INR compression framework. To enhance the model’s perception of images for machine vision, we design a semantic embedding enhancement module to assist in understanding image semantics. Specifically, we employ the Swin Transformer model to initialize image features, ensuring that the embedding of the compression model are effectively applicable to downstream visual tasks. Extensive experimental results demonstrate that our method significantly outperforms other image compression methods in classification tasks while ensuring image compression efficiency.
查看原文
分享 分享
微信好友 朋友圈 QQ好友 复制链接
本刊更多论文
基于隐式神经表征的人机协作图像压缩方法
随着用于人工智能分析的图像数量呈爆炸式增长,人们提出了机器图像编码方案,以机器可解读的格式传输信息,从而提高图像压缩效率。然而,这种高效的编码方案往往会导致图像细节和特征的丢失,以及由于高数据压缩比导致语义信息不清晰等问题,使其不太适合人类视觉领域。因此,如何在给定的压缩比下平衡图像视觉质量和机器视觉精度是一个关键问题。为了解决这些问题,我们引入了基于隐式神经表征(INR)的人机协作图像编码框架,该框架在解码端有效减少了机器视觉任务的传输信息,同时在 INR 压缩框架下保持了人类视觉的高效图像压缩。为了增强机器视觉模型对图像的感知,我们设计了一个语义嵌入增强模块,以帮助理解图像语义。具体来说,我们采用 Swin Transformer 模型来初始化图像特征,确保压缩模型的嵌入有效地适用于下游视觉任务。广泛的实验结果表明,我们的方法在分类任务中明显优于其他图像压缩方法,同时确保了图像压缩效率。
本文章由计算机程序翻译,如有差异,请以英文原文为准。
求助全文
约1分钟内获得全文 去求助
来源期刊
CiteScore
8.50
自引率
2.20%
发文量
86
期刊介绍: The IEEE Journal on Emerging and Selected Topics in Circuits and Systems is published quarterly and solicits, with particular emphasis on emerging areas, special issues on topics that cover the entire scope of the IEEE Circuits and Systems (CAS) Society, namely the theory, analysis, design, tools, and implementation of circuits and systems, spanning their theoretical foundations, applications, and architectures for signal and information processing.
期刊最新文献
Introducing IEEE Collabratec Table of Contents IEEE Journal on Emerging and Selected Topics in Circuits and Systems Information for Authors IEEE Circuits and Systems Society Information IEEE Journal on Emerging and Selected Topics in Circuits and Systems Publication Information
×
引用
GB/T 7714-2015
复制
MLA
复制
APA
复制
导出至
BibTeX EndNote RefMan NoteFirst NoteExpress
×
×
提示
您的信息不完整,为了账户安全,请先补充。
现在去补充
×
提示
您因"违规操作"
具体请查看互助需知
我知道了
×
提示
现在去查看 取消
×
提示
确定
0
微信
客服QQ
Book学术公众号 扫码关注我们
反馈
×
意见反馈
请填写您的意见或建议
请填写您的手机或邮箱
已复制链接
已复制链接
快去分享给好友吧!
我知道了
×
扫码分享
扫码分享
Book学术官方微信
Book学术文献互助
Book学术文献互助群
群 号:481959085
Book学术
文献互助 智能选刊 最新文献 互助须知 联系我们:info@booksci.cn
Book学术提供免费学术资源搜索服务,方便国内外学者检索中英文文献。致力于提供最便捷和优质的服务体验。
Copyright © 2023 Book学术 All rights reserved.
ghs 京公网安备 11010802042870号 京ICP备2023020795号-1