Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559165
Heasung Kim;Gustavo de Veciana;Hyeji Kim
In modern wireless systems, the feedback of DownLink (DL) Channel State Information (CSI) from User Equipment (UE) to Base Stations (BS) may require substantial computational and feedback bandwidth overheads. A promising approach to improving feedback efficiency is to leverage side information that is correlated with the DL CSI. Despite the potential of doing so, critical aspects remain underexplored in current research, particularly the quantification of the benefits and the inherent limitations of utilizing side information. This paper addresses these gaps by introducing a novel algorithm to compute the rate-distortion function for general compression scenarios incorporating side information. We apply this algorithm to the DL CSI feedback problem, with UpLink (UL) CSI as the side information, and generate rate-distortion functions. Using the estimated rate-distortion functions, we measure the gain of side information over diverse feedback rates and UE mobility profiles. The results reveal that the benefits of leveraging side information are particularly significant for UEs characterized by high mobility and constrained to operate at low feedback overheads.
Title: Fundamental Limits to Exploiting Side Information for CSI Feedback in Wireless Systems. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2417-2430. Publication date: 2025-04-09.
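To make the notion of a rate-distortion function with side information concrete, the sketch below computes single points of a classical rate-distortion curve with the standard Blahut-Arimoto iteration. It is not the algorithm introduced in the paper, which the abstract does not describe. The helper `conditional_point` covers only the simplest case, where the side information is assumed to be available at both encoder and decoder, and all function names, the binary source, and the Hamming distortion are illustrative assumptions.

```python
import math


def blahut_arimoto_point(p_x, dist, beta, iters=500):
    """Return (rate_bits, distortion) at Lagrange slope beta.

    p_x  : source probabilities
    dist : dist[x][x_hat], distortion between source and reproduction symbols
    beta : slope parameter; larger beta gives lower distortion, higher rate
    """
    n_x, n_hat = len(p_x), len(dist[0])
    q = [1.0 / n_hat] * n_hat  # reproduction marginal, initialised uniform
    cond = []
    for _ in range(iters):
        # Optimal test channel p(x_hat | x) for the current marginal q.
        cond = []
        for x in range(n_x):
            w = [q[j] * math.exp(-beta * dist[x][j]) for j in range(n_hat)]
            z = sum(w)
            cond.append([v / z for v in w])
        # Marginal induced by that test channel.
        q = [sum(p_x[x] * cond[x][j] for x in range(n_x)) for j in range(n_hat)]
    rate = 0.0
    distortion = 0.0
    for x in range(n_x):
        for j in range(n_hat):
            joint = p_x[x] * cond[x][j]
            if joint > 0.0:
                rate += joint * math.log2(cond[x][j] / q[j])
                distortion += joint * dist[x][j]
    return rate, distortion


def conditional_point(p_y, p_x_given_y, dist, beta):
    """Average the per-y solutions at a common slope (assumed setting:
    side information Y known to both encoder and decoder)."""
    rate = 0.0
    distortion = 0.0
    for p, p_x in zip(p_y, p_x_given_y):
        r, d = blahut_arimoto_point(p_x, dist, beta)
        rate += p * r
        distortion += p * d
    return rate, distortion


if __name__ == "__main__":
    hamming = [[0.0, 1.0], [1.0, 0.0]]
    # Uniform binary source, no side information.
    print(blahut_arimoto_point([0.5, 0.5], hamming, beta=3.0))
    # Same source observed together with a noisy copy Y (10% flip probability).
    print(conditional_point([0.5, 0.5], [[0.9, 0.1], [0.1, 0.9]], hamming, beta=3.0))
```

In this toy setting both calls reach the same distortion, and the rate with side information is lower. The difference between the two rates is the kind of gain the abstract refers to, here for a binary source rather than for CSI.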
Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559152
Pietro Talli;Edoardo David Santi;Federico Chiariotti;Touraj Soleymani;Federico Mason;Andrea Zanella;Deniz Gündüz
Pragmatic or goal-oriented communication can optimize communication decisions beyond the reliable transmission of data, instead aiming at directly affecting application performance with the minimum channel utilization. In this paper, we develop a general theoretical framework for the remote control of finite-state Markov processes, using pragmatic communication over a costly zero-delay communication channel. To that end, we model a cyber-physical system composed of an encoder, which observes and transmits the states of a process in real time, and a decoder, which receives that information and controls the behavior of the process. The encoder and the decoder should cooperatively optimize the trade-off between the control performance (i.e., reward) and the communication cost (i.e., channel use). This scenario underscores a pragmatic (i.e., goal-oriented) communication problem, where the purpose is to convey only the data that is most valuable for the underlying task, taking into account the state of the decoder (hence, the pragmatic aspect). We investigate two different decision-making architectures: in pull-based remote control, the decoder is the only decision-maker, while in push-based remote control, the encoder and the decoder constitute two independent decision-makers, leading to a multi-agent scenario. We propose three algorithms to optimize our system (i.e., design the encoder and the decoder policies), discuss the optimality guarantees of the algorithms, and shed light on their computational complexity and fundamental limits.
Title: Pragmatic Communication for Remote Control of Finite-State Markov Processes. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2589-2603. Publication date: 2025-04-09.
Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559121
Yufeng Diao;Yichi Zhang;Changyang She;Philip Guodong Zhao;Emma Liying Li
Existing communication systems aim to reconstruct the information at the receiver side, and are known as reconstruction-oriented communications. This approach often falls short in meeting the real-time, task-specific demands of modern AI-driven applications such as autonomous driving and semantic segmentation. As a new design principle, task-oriented communications have been developed. However, they typically require joint optimization of the encoder, decoder, and modified inference neural networks, resulting in extensive cross-system redesigns and compatibility issues. This paper proposes a novel communication framework that aligns reconstruction-oriented and task-oriented communications for edge intelligence. The idea is to extend the Information Bottleneck (IB) theory to optimize data transmission by minimizing a task-relevant loss function, while maintaining the structure of the original data by means of an information reshaper. Such an approach integrates task-oriented communications with reconstruction-oriented communications, where a variational approach is designed to handle the intractability of mutual information in high-dimensional neural network features. We also introduce a joint source-channel coding (JSCC) modulation scheme compatible with classical modulation techniques, enabling the deployment of AI technologies within existing digital infrastructures. The proposed framework is particularly effective in edge-based autonomous driving scenarios. Our evaluation in the Car Learning to Act (CARLA) simulator demonstrates that the proposed framework reduces bits per service by 99.19% compared to existing methods, such as JPEG, JPEG2000, and BPG, without compromising the effectiveness of task execution.
Title: Aligning Task- and Reconstruction-Oriented Communications for Edge Intelligence. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2575-2588. Publication date: 2025-04-09.
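For readers unfamiliar with the Information Bottleneck (IB) and its variational treatment, the block below recalls the standard textbook formulation. It is not the extended objective of the paper, whose exact form, including the information reshaper term, is not given in the abstract. The symbols are assumptions chosen for illustration: $X$ is the source data, $Y$ the task target, $Z$ the transmitted representation, $q(y \mid z)$ a variational task decoder, $r(z)$ a variational prior, and $\beta > 0$ a trade-off weight.

```latex
% Standard IB objective: keep Z informative about Y while compressing X.
\begin{align}
  \max_{p(z \mid x)} \; & I(Z;Y) - \beta\, I(X;Z) \\
% Both mutual-information terms are intractable for high-dimensional
% features, so each is replaced by a variational bound.
  I(Z;Y) \;\ge\; & H(Y) + \mathbb{E}_{p(x,y)\,p(z \mid x)}\bigl[\log q(y \mid z)\bigr] \\
  I(X;Z) \;\le\; & \mathbb{E}_{p(x)}\bigl[\mathrm{KL}\bigl(p(z \mid x)\,\|\,r(z)\bigr)\bigr] \\
% Dropping the constant H(Y) gives a trainable loss that upper-bounds
% the negated IB objective.
  \mathcal{L}_{\mathrm{VIB}} \;=\; & \mathbb{E}_{p(x,y)\,p(z \mid x)}\bigl[-\log q(y \mid z)\bigr]
      + \beta\, \mathbb{E}_{p(x)}\bigl[\mathrm{KL}\bigl(p(z \mid x)\,\|\,r(z)\bigr)\bigr]
\end{align}
```

The first term of the loss is a task loss such as cross-entropy, and the second penalizes the amount of information carried by the representation. A framework that also preserves the structure of the original data would add a further term, which is left out here because the abstract does not specify it.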
Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559127
Dongwook Won;Quang Tuan Do;Thwe Thwe Win;Donghyun Lee;Junsuk Oh;Sungrae Cho
Domain adaptation issues in semantic communications become critical when the transmitter and receiver operate across multiple different domains, or when input data during inference have different distributional characteristics than the data used to train the semantic encoders and decoders. In this paper, we introduce the Multidomain Adaptive Deep Semantic Communication (MA-DeepSC) framework, designed to enhance semantic communications across multiple domains. Our framework consists of two core components: the Multidomain Adaptive Semantic Coding Network (MASCN), inherently designed to adapt semantic encoding and decoding across multiple domains, and the Multidomain Data Adaptation Network (MDAN), which transforms actual observable data into the data on which the system was initially trained, thus obviating the need to retrain the existing pre-trained semantic coding network. We validate our approach through experiments on digit datasets and CelebA, observing significant outperformance over existing techniques. In addition, we analyze the strategic benefits and drawbacks of both MASCN and MDAN, assessing their applicability under various scenarios. The source code for MA-DeepSC is available at https://github.com/wongdongwook/JSAC_MA-DeepSC
Title: Multidomain Adaptive Semantic Communications. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2506-2517. Publication date: 2025-04-09. Open-access PDF: https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10960416
Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559113
Peiwen Jiang;Chao-Kai Wen;Xiao Li;Shi Jin;Geoffrey Ye Li
Satellite communications can provide massive connections and seamless coverage, but they also face several challenges, such as rain attenuation, long propagation delays, and co-channel interference. To improve transmission efficiency and address severe scenarios, semantic communication has become a popular choice, particularly when equipped with foundation models (FMs). In this study, we introduce an FM-based semantic satellite communication framework, termed FMSAT. This framework leverages FM-based segmentation and reconstruction to significantly reduce bandwidth requirements and accurately recover semantic features under high noise and interference. Considering the high speed of satellites, an adaptive encoder-decoder is proposed to protect important features and avoid frequent retransmissions. Meanwhile, a well-received image can provide a reference for repairing damaged images under sudden attenuation. Since acknowledgment feedback is subject to long propagation delays when retransmission is unavoidable, a novel error detection method is proposed to roughly detect semantic errors at the regenerative satellite. With the proposed detectors at both the satellite and the gateway, the quality of the received images can be ensured. The simulation results demonstrate that the proposed method can significantly reduce bandwidth requirements, adapt to complex satellite scenarios, and protect semantic information with an acceptable transmission delay.
Title: Semantic Satellite Communications Based on Generative Foundation Model. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2431-2445. Publication date: 2025-04-09.
Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559160
Jiangyuan Guo;Wei Chen;Yuxuan Sun;Jialong Xu;Bo Ai
Although semantic communication (SC) has shown its potential in efficiently transmitting multimodal data such as text, speech, and images, SC for videos has focused primarily on pixel-level reconstruction. However, these SC systems may be suboptimal for downstream intelligent tasks. Moreover, SC systems without pixel-level video reconstruction offer advantages, achieving higher bandwidth efficiency and better real-time performance on various intelligent tasks. The difficulty in such system design lies in the extraction of task-related compact semantic representations and their accurate delivery over noisy channels. In this paper, we propose an end-to-end SC system, named VideoQA-SC, for video question answering (VideoQA) tasks. Our goal is to accomplish VideoQA tasks directly based on video semantics over noisy or fading wireless channels, bypassing the need for video reconstruction at the receiver. To this end, we develop a spatiotemporal semantic encoder for effective video semantic extraction, and a learning-based bandwidth-adaptive deep joint source-channel coding (DJSCC) scheme for efficient and robust video semantic transmission. Experiments demonstrate that VideoQA-SC outperforms traditional and advanced DJSCC-based SC systems that rely on video reconstruction at the receiver under a wide range of channel conditions and bandwidth constraints. In particular, when the signal-to-noise ratio is low, VideoQA-SC can improve the answer accuracy by 5.17% while saving almost 99.5% of the bandwidth, compared with the advanced DJSCC-based SC system. Our results show the great potential of SC system design for video applications.
Title: VideoQA-SC: Adaptive Semantic Communication for Video Question Answering. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2462-2477. Publication date: 2025-04-09.
Pub Date: 2025-04-09; DOI: 10.1109/JSAC.2025.3559134
Chenghong Bian;Yulin Shao;Deniz Gündüz
Efficient data transmission across mobile multi-hop networks that connect edge devices to core servers presents significant challenges, particularly due to the variability in link qualities between wireless and wired segments. This variability necessitates a robust transmission scheme that transcends the limitations of existing deep joint source-channel coding (DeepJSCC) strategies, which often struggle at the intersection of analog and digital methods. Addressing this need, this paper introduces a novel hybrid DeepJSCC framework, h-DJSCC, tailored for effective image transmission from edge devices through a network architecture that includes initial wireless transmission followed by multiple wired hops. Our approach harnesses the strengths of DeepJSCC for the initial, variable-quality wireless link to avoid the cliff effect inherent in purely digital schemes. For the subsequent wired hops, which feature more stable and high-capacity connections, we implement digital compression and forwarding techniques to prevent noise accumulation. This dual-mode strategy is adaptable even in scenarios with limited knowledge of the image distribution, enhancing the framework’s robustness and utility. Extensive numerical simulations demonstrate that our hybrid solution outperforms traditional fully digital approaches by effectively managing transitions between different network segments and optimizing for variable signal-to-noise ratios (SNRs). We also introduce a fully adaptive h-DJSCC architecture with both SNR-adaptive (SA) and rate-adaptive (RA) modules capable of adjusting to different network conditions and achieving diverse rate-distortion objectives, thereby reducing the memory requirements on network nodes.
Title: A Deep Joint Source-Channel Coding Scheme for Hybrid Mobile Multi-Hop Networks. Journal: IEEE Journal on Selected Areas in Communications, vol. 43, no. 7, pp. 2543-2559. Publication date: 2025-04-09.
The existing task-agnostic and resource-constrained satellite communication fails to meet diverse task demands in the upcoming sixth-generation (6G) network. In this paper, to enable the ubiquitous intelligent services with massive traffic for global users through satellite-Integrated Internet, we first propose a novel semantic metric named utility loss of information (UoI), which can capture the task-oriented aspects by quantifying both value loss of semantic mismatch, and energy loss of unnecessary transmissions. Then, we design a UoI minimization data generation and transmission (UMGT) scheme for task-adaptive communications in satellite-Integrated Internet with energy constraint and reliability requirement. For the time-varying satellite-terrestrial link with high bit error rate (BER) and delayed feedback, we derive the closed-form expressions of BER, and apply the long erasure coding (LEC) to combat the deep fading. Subsequently, we transform the optimization problem to minimize the upper bound of an unconstrained Lyapunov drift-plus-penalty (DPP). Further, we propose two deep reinforcement learning (DRL) algorithms to intelligently choose when to generate data, how to adjust the number of LEC packets and whether to retransmit, thereby minimizing the average UoI. Simulation results validate that our UMGT scheme can achieve the lowest UoI than several state-of-the-art schemes, and demonstrate its adaptability to various task demands.
{"title":"Utility Loss of Information Minimization With Long Erasure Coding for Task-Adaptive Communications in Satellite-Integrated Internet","authors":"Jianhao Huang;Jian Jiao;Ye Wang;Yonghui Li;Qinyu Zhang","doi":"10.1109/JSAC.2025.3559119","DOIUrl":"10.1109/JSAC.2025.3559119","url":null,"abstract":"The existing task-agnostic and resource-constrained satellite communication fails to meet diverse task demands in the upcoming sixth-generation (6G) network. In this paper, to enable the ubiquitous intelligent services with massive traffic for global users through satellite-Integrated Internet, we first propose a novel semantic metric named utility loss of information (UoI), which can capture the task-oriented aspects by quantifying both value loss of semantic mismatch, and energy loss of unnecessary transmissions. Then, we design a UoI minimization data generation and transmission (UMGT) scheme for task-adaptive communications in satellite-Integrated Internet with energy constraint and reliability requirement. For the time-varying satellite-terrestrial link with high bit error rate (BER) and delayed feedback, we derive the closed-form expressions of BER, and apply the long erasure coding (LEC) to combat the deep fading. Subsequently, we transform the optimization problem to minimize the upper bound of an unconstrained Lyapunov drift-plus-penalty (DPP). Further, we propose two deep reinforcement learning (DRL) algorithms to intelligently choose when to generate data, how to adjust the number of LEC packets and whether to retransmit, thereby minimizing the average UoI. 
Simulation results validate that our UMGT scheme can achieve the lowest UoI than several state-of-the-art schemes, and demonstrate its adaptability to various task demands.","PeriodicalId":73294,"journal":{"name":"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society","volume":"43 7","pages":"2604-2619"},"PeriodicalIF":0.0,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"https://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=10960409","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813545","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"OA","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
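The drift-plus-penalty step mentioned in the abstract above can be illustrated with a generic textbook sketch. This is not the UMGT scheme or its DRL algorithms: the two actions, their penalty and energy values, the weight `v`, and the energy budget are all hypothetical stand-ins, and the rule shown is the standard greedy DPP decision (minimize `v * penalty + Q * energy` per slot) with a virtual queue `Q` tracking violation of an average energy constraint.

```python
from dataclasses import dataclass


@dataclass
class Action:
    name: str
    penalty: float  # hypothetical per-slot loss (a stand-in for UoI)
    energy: float   # hypothetical per-slot energy cost


def dpp_choose(actions, queue, v):
    """Pick the action that minimizes v * penalty + queue * energy."""
    return min(actions, key=lambda a: v * a.penalty + queue * a.energy)


def update_queue(queue, energy, budget):
    """Virtual queue update: Q <- max(Q + energy - budget, 0)."""
    return max(queue + energy - budget, 0.0)


def run(actions, budget, v, slots):
    """Run the greedy DPP rule and return average penalty, average energy, final queue."""
    queue, total_penalty, total_energy = 0.0, 0.0, 0.0
    for _ in range(slots):
        a = dpp_choose(actions, queue, v)
        total_penalty += a.penalty
        total_energy += a.energy
        queue = update_queue(queue, a.energy, budget)
    return total_penalty / slots, total_energy / slots, queue


# Hypothetical action set: staying idle costs no energy but incurs a larger loss.
ACTIONS = [Action("idle", 1.0, 0.0), Action("transmit", 0.2, 1.0)]
avg_penalty, avg_energy, final_queue = run(ACTIONS, budget=0.5, v=2.0, slots=1000)
```

Because the queue update implies `Q(T) >= sum(energy) - T * budget`, the average energy is at most `budget + Q(T) / T`, so a bounded virtual queue means the constraint is met in the long run. A larger `v` favors a lower penalty at the price of a larger queue backlog.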
Pub Date : 2025-04-09DOI: 10.1109/JSAC.2025.3559120
Chunmei Xu;Mahdi Boloursaz Mashhadi;Yi Ma;Rahim Tafazolli;Jiangzhou Wang
Generative foundation models can revolutionize the design of semantic communication (SemCom) systems by enabling high-fidelity exchange of semantic information at ultra-low rates. In this work, a generative SemCom framework utilizing pre-trained foundation models is proposed, where both uncoded forward-with-error and coded discard-with-error schemes are developed for the semantic decoder. Using rate-distortion-perception theory, the relationship between regenerated signal quality and transmission reliability is characterized and proven to be non-decreasing. Based on this, semantic values are defined to quantify the semantic similarity between multimodal semantic features and the original source. We also investigate semantic-aware power allocation problems that minimize power consumption for ultra-low-rate, high-fidelity SemCom. Two semantic-aware power allocation methods are proposed by leveraging the non-decreasing property of the perception-error relationship. Based on the Kodak dataset, perception-error functions and semantic values are obtained for image tasks. Simulation results show that the proposed semantic-aware method significantly outperforms conventional approaches, particularly in the channel-coded case (up to 90% power saving).
{"title":"Generative Semantic Communications With Foundation Models: Perception-Error Analysis and Semantic-Aware Power Allocation","authors":"Chunmei Xu;Mahdi Boloursaz Mashhadi;Yi Ma;Rahim Tafazolli;Jiangzhou Wang","doi":"10.1109/JSAC.2025.3559120","DOIUrl":"10.1109/JSAC.2025.3559120","url":null,"abstract":"Generative foundation models can revolutionize the design of semantic communication (SemCom) systems by enabling high fidelity exchange of semantic information at ultra-low rates. In this work, a generative SemCom framework utilizing pre-trained foundation models is proposed, where both uncoded forward-with-error and coded discard-with-error schemes are developed for the semantic decoder. Using the rate-distortion-perception theory, the relationship between regenerated signal quality and transmission reliability is characterized, which is proven to be non-decreasing. Based on this, semantic values are defined to quantify the semantic similarity between multimodal semantic features and the original source. We also investigate semantic-aware power allocation problems that minimize power consumption for ultra-low rate and high fidelity SemComs. Two semantic-aware power allocation methods are proposed by leveraging the non-decreasing property of the perception-error relationship. Based on the Kodak dataset, perception-error functions and semantic values are obtained for image tasks. 
Simulation results show that the proposed semantic-aware method significantly outperforms conventional approaches, particularly in the channel-coded case (up to 90% power saving).","PeriodicalId":73294,"journal":{"name":"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society","volume":"43 7","pages":"2493-2505"},"PeriodicalIF":0.0,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813483","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2025-04-09DOI: 10.1109/JSAC.2025.3559157
Jiakun Liu;H. Vincent Poor;Iickho Song;Wenyi Zhang
A composite source, consisting of multiple subsources and a memoryless switch, outputs one symbol at a time from the subsource selected by the switch. When some data from an information source must be encoded more accurately than other data, the composite source model is suitable because it allows different distortion constraints to be imposed on the subsources. In this context, we propose subsource-dependent fidelity criteria for composite sources and use them to formulate a rate-distortion problem. We solve the problem and obtain a single-letter expression for the rate-distortion function. Further rate-distortion analysis characterizes the performance of classify-then-compress (CTC) coding, which is frequently used in practice when subsource-dependent fidelity criteria are considered. Our analysis shows that CTC coding generally incurs a performance loss relative to optimal coding, even if the classification is perfect. We also identify the cause of the performance loss: class labels have to be reproduced in CTC coding. Finally, we show that the performance loss is negligible for asymptotically small distortion if CTC coding is appropriately designed and some mild conditions are satisfied.
{"title":"A Rate-Distortion Analysis for Composite Sources Under Subsource-Dependent Fidelity Criteria","authors":"Jiakun Liu;H. Vincent Poor;Iickho Song;Wenyi Zhang","doi":"10.1109/JSAC.2025.3559157","DOIUrl":"10.1109/JSAC.2025.3559157","url":null,"abstract":"A composite source, consisting of multiple subsources and a memoryless switch, outputs one symbol at a time from the subsource selected by the switch. If some data should be encoded more accurately than other data from an information source, the composite source model is suitable because in this model different distortion constraints can be put on the subsources. In this context, we propose subsource-dependent fidelity criteria for composite sources and use them to formulate a rate-distortion problem. We solve the problem and obtain a single-letter expression for the rate-distortion function. Further rate-distortion analysis characterizes the performance of classify-then-compress (CTC) coding, which is frequently used in practice when subsource-dependent fidelity criteria are considered. Our analysis shows that CTC coding generally has performance loss relative to optimal coding, even if the classification is perfect. We also identify the cause of the performance loss, that is, class labels have to be reproduced in CTC coding. 
Last but not least, we show that the performance loss is negligible for asymptotically small distortion if CTC coding is appropriately designed and some mild conditions are satisfied.","PeriodicalId":73294,"journal":{"name":"IEEE journal on selected areas in communications : a publication of the IEEE Communications Society","volume":"43 7","pages":"2379-2392"},"PeriodicalIF":0.0,"publicationDate":"2025-04-09","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143813480","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":0,"RegionCategory":"","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
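The composite source model described in the abstract above can be sketched as a small simulation. Everything concrete here is an assumption made for illustration: two Gaussian subsources, switch probabilities of 0.7 and 0.3, squared-error distortion, and a label-aware uniform quantizer with a finer step for the second subsource. It only shows how a memoryless switch generates symbols and how distortion can be measured per subsource. It does not implement the paper's coding scheme or its rate-distortion function.

```python
import random


def sample_composite(switch_probs, subsources, n, rng):
    """Draw n (label, symbol) pairs: the memoryless switch picks a subsource
    independently each time, and that subsource emits one symbol."""
    labels = rng.choices(range(len(subsources)), weights=switch_probs, k=n)
    return [(k, subsources[k](rng)) for k in labels]


def per_subsource_distortion(pairs, reconstructions, num_subsources):
    """Average squared error computed separately for each subsource."""
    sums = [0.0] * num_subsources
    counts = [0] * num_subsources
    for (label, x), x_hat in zip(pairs, reconstructions):
        sums[label] += (x - x_hat) ** 2
        counts[label] += 1
    return [s / c if c else 0.0 for s, c in zip(sums, counts)]


def quantize(x, step):
    """Uniform scalar quantizer with the given step size."""
    return step * round(x / step)


rng = random.Random(0)
# Hypothetical subsources: both standard Gaussian, distinguished only by label.
subsources = [lambda r: r.gauss(0.0, 1.0), lambda r: r.gauss(0.0, 1.0)]
pairs = sample_composite([0.7, 0.3], subsources, 5000, rng)

# Assumed label-aware encoder: subsource 1 gets a finer step (tighter fidelity).
steps = [1.0, 0.1]
recon = [quantize(x, steps[label]) for label, x in pairs]
distortions = per_subsource_distortion(pairs, recon, 2)
```

Since a uniform quantizer's error is at most half a step, each entry of `distortions` is bounded by `step ** 2 / 4` for its subsource, which is one simple way to check a subsource-dependent fidelity constraint empirically.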