Rate-Distortion-Perception Controllable Joint Source-Channel Coding for High-Fidelity Generative Semantic Communications

IF 7 | Zone 1 (Computer Science) | Q1 (Telecommunications) | IEEE Transactions on Cognitive Communications and Networking | Pub Date: 2024-12-05 | DOI: 10.1109/TCCN.2024.3511960
Kailin Tan;Jincheng Dai;Zhenyu Liu;Sixian Wang;Xiaoqi Qin;Wenjun Xu;Kai Niu;Ping Zhang
Vol. 11, No. 2, pp. 672-686 | https://ieeexplore.ieee.org/document/10778256/

Abstract

End-to-end image transmission has recently become a crucial trend in intelligent wireless communications, driven by the increasing demand for high bandwidth efficiency. However, existing methods primarily optimize the trade-off between bandwidth cost and objective distortion, and often fail to deliver visually pleasing results aligned with human perception. In this paper, we propose a novel rate-distortion-perception (RDP) jointly optimized joint source-channel coding (JSCC) framework to enhance perceptual quality in human communications. Our RDP-JSCC framework integrates a flexible plug-in conditional generative adversarial network (GAN) to provide detailed and realistic image reconstructions at the receiver, overcoming the limitations of traditional rate-distortion optimized solutions, which typically produce blurry or poorly textured images. Based on this framework, we introduce a distortion-perception controllable transmission (DPCT) model, which addresses the variation in the perception-distortion trade-off. DPCT uses a lightweight spatial realism embedding module (SREM) to condition the generator on a realism map, so that the appearance realism of each image region can be customized at the receiver from a single transmission. Furthermore, for scenarios with scarce bandwidth, we propose an interest-oriented content-controllable transmission (CCT) model. CCT prioritizes the transmission of regions that attract user attention and generates the other regions from an instance label map, ensuring both content consistency and appearance realism for all regions while proportionally reducing channel bandwidth costs. Comprehensive experiments demonstrate the superiority of our RDP-optimized image transmission framework over state-of-the-art engineered image transmission systems and advanced perceptual methods.
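The abstract does not give the loss function or the internals of SREM and CCT, so the following is only a minimal sketch of the three ideas it names, under stated assumptions. It assumes the RDP objective can be written as a weighted sum, it uses a per-pixel blend of two reconstructions as a stand-in for region-wise realism control (the paper's SREM instead conditions the generator on the realism map), and it approximates the CCT bandwidth saving by the fraction of pixels inside the region of interest. All function names, parameters, and default weights are hypothetical.

```python
import numpy as np


def rdp_loss(rate, distortion, perception, lambda_rate=0.01, beta_percep=0.1):
    """Weighted-sum rate-distortion-perception objective.

    Assumption: a simple linear combination; the weights are illustrative
    and are not taken from the paper.
    """
    return lambda_rate * rate + distortion + beta_percep * perception


def blend_by_realism_map(x_distortion, x_perception, realism_map):
    """Interpolate per pixel between two reconstructions of shape (H, W, C).

    realism_map has shape (H, W) with values in [0, 1]: 0 keeps the
    distortion-oriented output, 1 keeps the perception-oriented output.
    This is a stand-in for illustration, not the paper's SREM.
    """
    r = np.clip(realism_map, 0.0, 1.0)[..., None]  # broadcast over channels
    return (1.0 - r) * x_distortion + r * x_perception


def roi_bandwidth_fraction(roi_mask):
    """Fraction of pixels transmitted if only the region of interest is sent.

    Assumption: channel bandwidth scales with the transmitted area.
    """
    return float(np.mean(np.asarray(roi_mask, dtype=bool)))


if __name__ == "__main__":
    x_d = np.zeros((2, 2, 3))          # toy distortion-oriented reconstruction
    x_p = np.ones((2, 2, 3))           # toy perception-oriented reconstruction
    realism = np.array([[0.0, 1.0], [0.5, 0.25]])
    print(blend_by_realism_map(x_d, x_p, realism)[..., 0])
    print(rdp_loss(rate=100.0, distortion=0.5, perception=2.0))
    print(roi_bandwidth_fraction([[1, 0], [0, 0]]))
```

In this toy form, raising a region's value in the realism map moves that region toward the perception-oriented output without any retransmission, which mirrors the "single transmission" property described in the abstract.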
Source journal
IEEE Transactions on Cognitive Communications and Networking (Computer Science: Artificial Intelligence)
CiteScore: 15.50
Self-citation rate: 7.00%
Articles published: 108
Journal description: The IEEE Transactions on Cognitive Communications and Networking (TCCN) aims to publish high-quality manuscripts that push the boundaries of cognitive communications and networking research. Cognitive, in this context, refers to the application of perception, learning, reasoning, memory, and adaptive approaches in communication system design. The transactions welcome submissions that explore various aspects of cognitive communications and networks, focusing on innovative and holistic approaches to complex system design. Key topics include architecture, protocols, cross-layer design, and cognition cycle design for cognitive networks. Research on machine learning, artificial intelligence, end-to-end and distributed intelligence, software-defined networking, cognitive radios, spectrum sharing, and security and privacy issues in cognitive networks is also of interest. The publication also encourages papers addressing novel services and applications enabled by these cognitive concepts.