Pub Date: 2025-10-22 | DOI: 10.1109/TMC.2025.3624064
Dongping Liao;Xitong Gao;Chengzhong Xu
Federated learning (FL) enables collaborative training on decentralized data while preserving the data owners’ privacy, under the orchestration of a central server. FL has seen tremendous growth and advancements in recent years. Despite this progress, FL faces a significant challenge posed by data heterogeneity, which leads to slower convergence and a larger performance gap compared to centralized training. In this work, we empirically show that directly applying empirical risk minimization (ERM) to skewed client training data causes client models to produce predictions biased towards majority classes. To address this problem, we propose a model-agnostic instance reweighing method (MAIR). At a coarse-grained level, MAIR adjusts the logit predictions for each class to counteract data heterogeneity. At a fine-grained level, it dynamically reweighs the importance of individual training samples with a predictive meta network. As a result, MAIR prevents client models from overfitting on heterogeneous data and therefore substantially reduces client drift. Theoretically, we establish its non-convex convergence properties. Extensive experiments demonstrate that MAIR reliably speeds up convergence and improves the quality of global models, outperforming its best competitor by a clear margin. It notably delivers an 8.3% improvement on an ImageNet subset and achieves a 67.6% energy footprint reduction on CIFAR-100 over the FedAvg baseline. Our findings also suggest that improving the performance of FL-trained models necessitates rethinking clients’ local optimization objectives, and that ERM should no longer be viewed as the de facto standard in FL under data heterogeneity.
{"title":"MAIR: Model Agnostic Instance Reweighing for Heterogeneous Federated Learning","authors":"Dongping Liao;Xitong Gao;Chengzhong Xu","doi":"10.1109/TMC.2025.3624064","DOIUrl":"https://doi.org/10.1109/TMC.2025.3624064","url":null,"abstract":"Federated learning (FL) enables collaborative training on decentralized data while preserving the data owners’ privacy, under the orchestration of a central server. FL has seen tremendous growth and advancements in recent years. Despite its progress, FL faces a significant challenge raised by data heterogeneity, leading to a slower convergence rate and a larger performance gap compared to centralized training. In this work, we empirically reveal that direct applying empirical risk minimizing (ERM) on skewed client training data causes the client model suffers from biased predictions towards majority classes. To address this problem, we propose a model agnostic instance reweighing method (MAIR). At a coarse-grained level, MAIR adjusts the logits predictions for each class to counteract the data heterogeneity. At a fine-grained level, it dynamically reweighs the importance of individual training samples with a predictive meta network. As a results, MAIR prevents client models from over-fitting on heterogeneous data and therefore substantially reduces client drift. Theoretically, we justify its non-convex convergence property. Extensive experiments demonstrate that MAIR reliably speeds up convergence and improves the quality of global models, outperforming its best competitor by a clear margin. It notably delivers <inline-formula><tex-math>$8.3%$</tex-math></inline-formula> improvements on ImageNet subset and achieves <inline-formula><tex-math>$67.6%$</tex-math></inline-formula> energy footprint reduction on CIFAR-100 over the FedAvg baseline. Our findings also suggest that improving the performance of FL-trained models necessitates rethinking clients’ local optimization objectives, and ERM should thus no longer be viewed as a de facto standard in FL under data heterogeneity.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4241-4252"},"PeriodicalIF":9.2,"publicationDate":"2025-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146116913","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-22 | DOI: 10.1109/TMC.2025.3624475
Zhan Zhang;Denghui Song;Anfu Zhou;Huadong Ma
Tactile sensing is an important capability for intelligent machines to interact with the external physical world. Current tactile sensing methods are mainly contact-based and mostly perceive an object’s surface features, limiting their access to deeper features such as internal structure. In this paper, we propose terahertz textile tactile sensing, i.e., TeraTex, which exploits the effect of textile material and structure on terahertz reflected signals to perceive a textile’s multidimensional tactile properties. The key challenges in realizing TeraTex are that the tactile properties influenced by material and structure are subtle and intricately entangled within the reflected signals, and that different reflective surfaces can interfere with feature extraction. To address these challenges, we custom-design a biologically inspired network that extracts different dimensions of tactile properties from the terahertz reflected signals and eliminates the influence of different reflective surfaces. We prototype and validate TeraTex using a terahertz time-domain spectrometer on 30 types of textiles and 7 different reflective surfaces. TeraTex achieves an average accuracy of 92.42% across 11 tactile property dimensions. Furthermore, when extended to unknown reflective surfaces, TeraTex still achieves an average accuracy of 88.3%.
{"title":"TeraTex: Contactless Textile Tactile Sensing Using Terahertz Signal","authors":"Zhan Zhang;Denghui Song;Anfu Zhou;Huadong Ma","doi":"10.1109/TMC.2025.3624475","DOIUrl":"https://doi.org/10.1109/TMC.2025.3624475","url":null,"abstract":"Tactile sensing is an important capability for intelligent machines to interact with the external physical world. Current tactile sensing methods are mainly based on contact, which mostly perceive the object’s surface features, limiting for deeper features such as internal structure. In this paper, we propose terahertz textile tactile sensing, i.e., <italic>TeraTex</i>, which utilizes the effect of textile material and structure on terahertz reflected signals to perceive its multidimensional tactile properties. To realize <italic>TeraTex</i>, the key challenges lie in that the textile tactile properties influenced by material and structure are subtle and intricately entangled within the reflected signals, and different reflective surfaces can interfere with feature extraction. To address these challenges, we custom-design a biologically inspired network to extract different dimensions of tactile properties from the terahertz reflected signals, and eliminate the influence of different reflective surfaces. We prototype and validate <italic>TeraTex</i> using a terahertz time-domain spectrometer on 30 types of textiles and 7 different reflective surfaces. <italic>TeraTex</i> achieves an average accuracy of 92.42% across 11 tactile property dimensions. Furthermore, when the reflective surfaces are extended to unknown surfaces, <italic>TeraTex</i> still achieves an average accuracy of 88.3% .","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4269-4285"},"PeriodicalIF":9.2,"publicationDate":"2025-10-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146116809","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-22 | DOI: 10.1109/TMC.2025.3623984
Hengrui Cui;Zhihao Qu;Baoliu Ye;Bin Tang;Tao Zhuang;Xinyu Wang;Yue Zeng
Quantization is a common method to improve communication efficiency in federated learning (FL) by compressing the gradients that clients upload. Currently, most application scenarios involve cloud-edge collaboration, where edge clients exhibit significant heterogeneity, making previous methods with uniform quantization levels unsuitable. To address these issues, we introduce a novel algorithm named Lightweight Adaptive Quantization for Heterogeneous Clients (LAQ-HC), which enables each client to adaptively choose its quantization level based on its data quality and communication capabilities, without increasing computation costs. The core idea is that clients with lower communication capabilities should use higher quantization levels, whereas those with higher capabilities should use lower levels, ensuring that all clients complete their uploads in a similar time. Furthermore, LAQ-HC models the relationship between quantization levels and their impact on client quality, which remains consistent across clients and adjacent training rounds. This allows for a lightweight estimation of the impact of quantization levels on training convergence, as demonstrated in our theoretical analysis. Under the constraints of limited wireless mobile communication bandwidth, LAQ-HC achieves faster convergence and higher accuracy compared to the latest adaptive quantization algorithms, while using only 56.74% of the computation time, 80.57% of the overall runtime, and 94.04% of the communication overhead.
"Lightweight Adaptive Quantization Algorithms for Federated Learning With Heterogeneous Clients," IEEE Transactions on Mobile Computing, vol. 25, no. 3, pp. 4409–4424.
Pub Date: 2025-10-20 | DOI: 10.1109/TMC.2025.3623385
Keran Li;Haipeng Dai;Wei Wang;Junlong Chen;Jiliang Wang;Shuai Tong;Meng Li;Lei Wang;Haoran Wang;Guihai Chen
Rotation is a fundamental form of motion, and rotation speed measurement is of paramount importance for assessing the health and performance of machinery with rotating components. However, existing measurement systems often face challenges such as limited measurement distance, low accuracy, and complex installation or maintenance processes. In this paper, we propose RoLEX, a LoRa-based rotation speed measurement system for long-distance and contactless monitoring of rotating machinery in ubiquitous scenarios. RoLEX employs a novel Signal Selection method to eliminate chirp interference and adapt to varying rotation speeds, along with a Boost Sensing method to enhance sampling rates and an advanced feature processing algorithm for precise rotation speed estimation and tracking. Comprehensive experiments validate that RoLEX achieves a measurement distance of 50 m, approximately 17 times farther than the latest wireless rotation speed measurement systems. Moreover, RoLEX is robust to interference and obstructions (including through-wall scenarios) and achieves an average measurement error of less than 0.69% across different rotation speeds (100–5100 revolutions per minute). For tracking performance, RoLEX achieves a relative error of less than 2.8% in 90% of cases. We also present a case study to highlight RoLEX’s practical applicability in real-world scenarios.
"RoLEX: A LoRa-Based Rotation Speed Measurement System for Ubiquitous Long-Distance Monitoring Applications," IEEE Transactions on Mobile Computing, vol. 25, no. 3, pp. 4170–4189.
Pub Date: 2025-10-17 | DOI: 10.1109/TMC.2025.3622895
Yinlin Ren;Longyu Zhou;Shaoyong Guo;Xuesong Qiu;Tony Q. S. Quek
Open Radio Access Network (O-RAN) supports heterogeneous service coexistence through functional splitting and open interfaces, enabling traffic steering via functional orchestration and resource scheduling. However, existing studies focus on known traffic patterns and lack the ability to anticipate dynamic service demands in advance, and isolated optimization of orchestration and scheduling fails to ensure End-to-End (E2E) latency. The differing time scales and vast solution space further complicate joint optimization. To address this, we propose a traffic twin-enabled multi-timescale joint optimization scheme for orchestration and scheduling. Specifically, we design a spatiotemporal attention-assisted Time Series Generative Adversarial Network (TimeGAN) traffic twin model (STAG-TD) to capture unknown traffic patterns. Based on the twin results, we formulate a joint optimization problem and design a dual-timescale algorithmic framework: a Task-Decomposed Dueling Double Deep Q-Network (TD3QN) algorithm handles large-timescale orchestration, and a Penalty-based Particle Swarm Optimization (PPSO) algorithm manages small-timescale scheduling. Our scheme achieves predictive joint optimization that reduces the transmission latency of services. Extensive results show our scheme outperforms state-of-the-art methods, reducing E2E latency by over 39% and increasing throughput by over 14.9%. The highly consistent results between real and twin data also demonstrate the effectiveness of the traffic twin model.
{"title":"Traffic Digital Twin-Enabled Orchestration and Scheduling in O-RAN: A Multi-Timescale Joint Optimization Approach","authors":"Yinlin Ren;Longyu Zhou;Shaoyong Guo;Xuesong Qiu;Tony Q. S. Quek","doi":"10.1109/TMC.2025.3622895","DOIUrl":"https://doi.org/10.1109/TMC.2025.3622895","url":null,"abstract":"Open Radio Access Network (O-RAN) supports heterogeneous service coexistence through functional splitting and open interfaces, enabling traffic steering via functional orchestration and resource scheduling. However, existing studies focus on known traffic patterns and lack the ability to anticipate dynamic service demands in advance. Isolated optimization of orchestration and scheduling fails to ensure End-to-End (E2E) latency. The varying time scales and vast solution space further complicate the joint optimization. To address this, we propose a traffic twin-enabled orchestration and scheduling multi-timescale joint optimization scheme. Explicitly, we design a spatiotemporal attention-assisted Time Series Generative Adversarial Network (TimeGAN) traffic twin model (STAG-TD) to capture unknown traffic patterns. Based on twin results, we formulate a joint optimization problem and design a dual-timescale algorithm framework, including propose a Task Decomposed Dueling Double Deep Q-Network (TD3QN) algorithm to handle large-timescale orchestration, and use a Penalty-based Particle Swarm Optimization (PPSO) algorithm to manage small-timescale scheduling. Our scheme achieves a predictive joint optimization to reduce the transmission latency of services. Extensive results show our scheme outperforms state-of-the-art methods, reducing E2E latency by over 39% and increasing throughput by over 14.9%. The highly consistent results between real and twin data also demonstrate the effectiveness of the traffic twin model.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4223-4240"},"PeriodicalIF":9.2,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146116873","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-17 | DOI: 10.1109/TMC.2025.3622596
Jin Meng;Chao Deng;Penghui Song;Minghao Jin;Yingzhuang Liu
This study introduces an analytical model for a nonsaturated IEEE 802.11ax network, designed to capture the coexistence characteristics of uplink orthogonal frequency division multiple access (OFDMA)-based random access (UORA) and non-random access (UONRA) mechanisms under imperfect channels. Existing models fail to assess such a realistic network due to three issues: 1) when accounting for frame aggregation, imperfect channels, and queue buffers, the network exhibits an overwhelming number of states, which renders performance evaluation infeasible; 2) existing bulk-service queue models cannot capture the queue characteristics influenced by imperfect channels; and 3) the coexistence characteristics of the two mechanisms remain unexplored due to their complex interactions. To address Issue 1, we propose two designs, device updates and hardware implementation, which reduce the massive state space and facilitate network evaluation. To address Issue 2, we develop a feedback-driven bulk-service queue model that captures the joint effects of imperfect channels and queue characteristics. To address Issue 3, we comprehensively analyze the joint impact of the two mechanisms from a probabilistic perspective. Extensive simulations demonstrate that the proposed model accurately captures network performance (throughput, delay, and collision probability) and reduces the mean estimation error by a factor of 30 compared to existing studies.
{"title":"Modeling Nonsaturated IEEE 802.11ax Networks With the Coexistence of UORA and UONRA in Imperfect Channels","authors":"Jin Meng;Chao Deng;Penghui Song;Minghao Jin;Yingzhuang Liu","doi":"10.1109/TMC.2025.3622596","DOIUrl":"https://doi.org/10.1109/TMC.2025.3622596","url":null,"abstract":"This study introduces an analytical model for a nonsaturated IEEE 802.11ax network, designed to capture the coexistence characteristics of uplink orthogonal frequency division multiple access (OFDMA)-based random access (UORA) and non-random access (UONRA) mechanisms under imperfect channels. Existing models fail to assess this realistic network due to three issues: 1. When accounting for frame aggregation, imperfect channels, and queue buffer, the network exhibits an overwhelming number of states, which renders the evaluation of its performance infeasible. 2. Existing bulk-service queue models fail to evaluate these queue characteristics influenced by imperfect channels. 3. The coexistence characteristics of the two mechanisms remain unexplored due to their complex interactions. To address Issue 1, we propose two designs: device updates and hardware implementation, which reduce the massive states and facilitate the network evaluation. To address Issue 2, we have developed a feedback-driven bulk-service queue model that captures the joint effects of imperfect channels and queue characteristics. To address Issue 3, we comprehensively analyzed the joint impact of the two mechanisms from a probabilistic perspective. Extensive simulations demonstrate that the proposed model accurately captures network performance (throughput, delay, and collision probability) and reduces the mean estimation error by a factor of 30 compared to existing studies.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4190-4205"},"PeriodicalIF":9.2,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146175588","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-17 | DOI: 10.1109/TMC.2025.3622698
Changyao Lin;Zhenming Chen;Ziyang Zhang;Jie Liu
Multi-task inference, now a prevalent inference paradigm, requires deploying multiple deep learning models on a hardware platform to process inference tasks concurrently. Modern platforms are typically equipped with heterogeneous processors, such as CPU-GPU platforms. To reduce resource contention and improve quality of service in the multi-task scenario, existing work has studied cross-processor inference at the fine-grained operator level. However, it lacks specific optimizations for asynchronous multi-task inference systems. In such systems, tasks arrive dynamically, so each model’s inference progress differs, which renders offline optimization strategies based solely on the original computation graph suboptimal or even ineffective. Therefore, we propose a novel framework, Cola, to address cross-processor operator scheduling for asynchronous tasks. Cola introduces an intermediate representation to abstract and simplify this dynamic scheduling problem, accounts for the impact of task arrival patterns on inference progress, and employs an efficient two-phase search algorithm. We implemented and validated Cola on a real-world case of intelligent steel structure manufacturing. Cola outperforms the state-of-the-art cross-processor operator scheduling framework in both throughput and resource utilization with highly acceptable runtime overhead.
{"title":"Cola: Cross-Processor Operator Parallelism for Asynchronous Deep Learning Inference","authors":"Changyao Lin;Zhenming Chen;Ziyang Zhang;Jie Liu","doi":"10.1109/TMC.2025.3622698","DOIUrl":"https://doi.org/10.1109/TMC.2025.3622698","url":null,"abstract":"Multi-task inference, as a prevalent inference paradigm nowadays, requires deploying multiple deep learning models on the hardware platform to concurrently process inference tasks. Modern platforms are typically equipped with various heterogeneous processors, such as CPU-GPU platform. To reduce resource contention and improve quality of service in the multi-task scenario, existing work has studied cross-processor inference at the fine-grained operator-level. However, it lacks specific optimizations for asynchronous multi-task inference systems. In such systems, tasks arrive dynamically, leading to diverse inference progress for each model. This renders offline optimization strategies based solely on the original computation graph suboptimal or even ineffective. Therefore, we propose a novel framework, Cola, to address the cross-processor operator scheduling for asynchronous tasks. Cola introduces intermediate representation to abstract and simplify such dynamic scheduling problem, considering the impact of task arrival patterns on the inference progress, and employs an efficient two-phase search algorithm. We implemented and validated Cola on a real-world case of intelligent steel structure manufacturing. Cola outperforms the state-of-the-art cross-processor operator scheduling framework in both throughput and resource utilization with highly acceptable runtime overhead.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4346-4362"},"PeriodicalIF":9.2,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146116848","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-17 | DOI: 10.1109/TMC.2025.3622937
Yushan Han;Hui Zhang;Honglei Zhang;Chuntao Ding;Yuanzhouhan Cao;Yidong Li
Collaborative perception has been proven to improve individual perception in autonomous driving through multi-agent interaction. Nevertheless, most methods assume identical encoders for all agents, which does not hold when these models are deployed in real-world applications. To realize collaborative perception in actual heterogeneous scenarios, existing methods usually align neighbor features to those of the ego vehicle, which is vulnerable to noise from domain gaps and thus fails to address feature discrepancies effectively. Moreover, they adopt transformer-based modules for domain adaptation, which makes model inference inefficient on mobile devices. To tackle these issues, we propose CoDS, a Collaborative perception method that leverages Domain Separation to address feature discrepancies in heterogeneous scenarios. CoDS employs two feature alignment modules: a Lightweight Spatial-Channel Resizer (LSCR) and Distribution Alignment via Domain Separation (DADS). It also utilizes a Domain Alignment Mutual Information (DAMI) loss to ensure effective feature alignment. Specifically, the LSCR aligns neighbor features across spatial and channel dimensions using a lightweight convolutional layer. Subsequently, the DADS mitigates feature distribution discrepancies with encoder-specific and encoder-agnostic domain separation modules: the former removes domain-dependent information, while the latter captures task-related information. During training, the DAMI loss maximizes the mutual information between aligned heterogeneous features to enhance the domain separation process. CoDS employs a fully convolutional architecture, which ensures high inference efficiency. Extensive experiments demonstrate that CoDS effectively mitigates feature discrepancies in heterogeneous scenarios and achieves a favorable trade-off between detection accuracy and inference efficiency.
{"title":"CoDS: Enhancing Collaborative Perception in Heterogeneous Scenarios via Domain Separation","authors":"Yushan Han;Hui Zhang;Honglei Zhang;Chuntao Ding;Yuanzhouhan Cao;Yidong Li","doi":"10.1109/TMC.2025.3622937","DOIUrl":"https://doi.org/10.1109/TMC.2025.3622937","url":null,"abstract":"Collaborative perception has been proven to improve individual perception in autonomous driving through multi-agent interaction. Nevertheless, most methods often assume identical encoders for all agents, which does not hold true when these models are deployed in real-world applications. To realize collaborative perception in actual heterogeneous scenarios, existing methods usually align neighbor features to those of the ego vehicle, which is vulnerable to noise from domain gaps and thus fails to address feature discrepancies effectively. Moreover, they adopt transformer-based modules for domain adaptation, which causes the model inference inefficiency on mobile devices. To tackle these issues, we propose CoDS, a <underline>Co</u>llaborative perception method that leverages <underline>D</u>omain <underline>S</u>eparation to address feature discrepancies in heterogeneous scenarios. The CoDS employs two feature alignment modules, i.e., Lightweight Spatial-Channel Resizer (LSCR) and Distribution Alignment via Domain Separation (DADS). Besides, it utilizes the Domain Alignment Mutual Information (DAMI) loss to ensure effective feature alignment. Specifically, the LSCR aligns the neighbor feature across spatial and channel dimensions using a lightweight convolutional layer. Subsequently, the DADS mitigates feature distribution discrepancy with encoder-specific and encoder-agnostic domain separation modules. The former removes domain-dependent information and the latter captures task-related information. During training, the DAMI loss maximizes the mutual information between aligned heterogeneous features to enhance the domain separation process. The CoDS employs a fully convolutional architecture, which ensures high inference efficiency. Extensive experiments demonstrate that the CoDS effectively mitigates feature discrepancies in heterogeneous scenarios and achieves a trade-off between detection accuracy and inference efficiency.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4286-4299"},"PeriodicalIF":9.2,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146116872","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-10-17 | DOI: 10.1109/TMC.2025.3623072
Lixiang Yuan;Jiapeng Zhang;Mingxing Duan;Guoqing Xiao;Zhuo Tang;Kenli Li
Federated learning (FL) enables collaborative training of a global model while preserving participants’ local data privacy, making it ideal for data-sensitive fields like the Industrial Internet of Things (IIoT), finance, and healthcare. However, non-independently and identically distributed (Non-IID) data among participants and the presence of malicious participants pose significant challenges to the model’s performance and convergence, and the global model struggles to achieve consistent performance across all participants. Therefore, this paper proposes personalized and robust federated learning (PRFL) to handle Non-IID data in the presence of malicious participants. First, to enhance robustness and convergence, a model similarity-based division mechanism is employed: it groups participants with similar data and removes both independent and colluding malicious participants. Second, we propose a three-stage knowledge-sharing personalized federated learning framework in which each participant undergoes inner-loop knowledge sharing, outer-loop knowledge sharing, and personalized knowledge distillation, incorporating a performance-driven dynamic weighted sharing mechanism. Extensive experiments demonstrate that PRFL outperforms other advanced personalized federated learning methods across various benchmark datasets, particularly in scenarios with Non-IID data and malicious participants.
{"title":"PRFL: Personalized and Robust Federated Learning for Non-IID Data With Malicious Participants","authors":"Lixiang Yuan;Jiapeng Zhang;Mingxing Duan;Guoqing Xiao;Zhuo Tang;Kenli Li","doi":"10.1109/TMC.2025.3623072","DOIUrl":"https://doi.org/10.1109/TMC.2025.3623072","url":null,"abstract":"Federated learning (FL) enables collaborative training of a global model while preserving participants’ local data privacy, making it ideal for data-sensitive fields like Industrial Internet of Things (IIoT), finance, and healthcare. However, Non-IID data among participants and the presence of malicious participants pose significant challenges to the model’s performance and convergence. The global model is difficult to achieve consistent performance across all participants. Therefore, this paper proposes personalized and robust federated learning (PRFL) to handle non-independently and identically distributed (Non-IID) data with malicious participants. First, to enhance the robustness and convergence, a model similarity-based division mechanism is employed. It groups participants with similar data and removes both independent and colluding malicious participants. Second, we propose a three-stage knowledge sharing personalized federated learning framework. Each participant undergoes inner-loop knowledge sharing, outer-loop knowledge sharing, and personalized knowledge distillation, incorporating performance-driven dynamic weighted sharing mechanism. Moreover, extensive experiments demonstrate that PRFL outperforms other advanced personalized federated learning methods across various benchmark datasets, particularly in scenarios with Non-IID data and malicious participants.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 3","pages":"4331-4345"},"PeriodicalIF":9.2,"publicationDate":"2025-10-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146116900","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-09-23 | DOI: 10.1109/TMC.2025.3613450
Xinyu Lu;Zhanbo Feng;Jiong Lou;Chentao Wu;Guangtao Xue;Wei Zhao;Jie Li
In recent years, gig platforms have emerged as a new paradigm, seamlessly connecting workers and tasks while leveraging workers’ collective intelligence, participation, and shared resources. Traditionally, platforms have operated under the assumption of worker homogeneity, where service capabilities and associated service costs are similar. However, in mobile computing scenarios such as mobile crowdsensing, the diversity in worker capabilities and costs turns supply-demand matching into a complex problem characterized by multiple layers of workers with distinct attributes. The dynamic nature of incoming task requests requires continual reallocation of these workers, thereby introducing a time-dependent overhead. In this paper, we introduce a framework called the Generative Diffusion Model with Duality Guidance (Guid) to address this intricate multi-layer scheduling problem. We formalize a time-slotted long-term optimization problem that captures the spatiotemporal dynamics of task requests and worker services, as well as the intricate time-coupled overhead. Our framework employs a generative diffusion model to explore the complex solution space of the problem and generate superior solutions. To effectively manage time coupling, we utilize dual optimization theory to generate time-slot-aware information, guiding the generative diffusion model towards solutions that ensure long-term performance. We provide a rigorous theoretical analysis demonstrating that our guided solution achieves a parameterized competitive-ratio guarantee relative to the theoretically optimal solution. Our comprehensive experiments further illustrate that the proposed method outperforms benchmark techniques, achieving reduced overhead compared to seven baseline methods.
{"title":"Multi-Layer Scheduling in Gig Platforms Using a Generative Diffusion Model With Duality Guidance","authors":"Xinyu Lu;Zhanbo Feng;Jiong Lou;Chentao Wu;Guangtao Xue;Wei Zhao;Jie Li","doi":"10.1109/TMC.2025.3613450","DOIUrl":"https://doi.org/10.1109/TMC.2025.3613450","url":null,"abstract":"In recent years, gig platforms have emerged as a new paradigm, seamlessly connecting workers and tasks while leveraging workers’ collective intelligence, participation, and shared resources. Traditionally, platforms have operated under the assumption of worker homogeneity, where service capabilities and associated service costs are similar. However, in mobile computing scenarios, such as mobile crowdsensing, the diversity in worker capabilities and costs renders the supply and demand matching into a complex problem characterized by multiple layers of workers possessing distinct attributes. The dynamic nature of incoming task requests requires the continual reallocation of these workers, thereby introducing a time-dependent overhead. In this paper, we introduce a framework, called the <italic><u>G</u>enerative Diffusion Model with Duality G<u>uid</u>ance</i>, termed <italic>Guid</i>, to address the intricate multi-layer scheduling problem. We formalize a time-slotted long-term optimization problem that captures the spatiotemporal dynamics of task requests and worker services, as well as the intricate time-coupled overhead. Our framework employs a generative diffusion model to explore the complex solution space of the problem and generate superior solutions. To effectively manage time coupling, we utilize dual optimization theory to generate time slot-aware information, guiding the generative diffusion model towards solutions that assure long-term performance. We provide a rigorous theoretical analysis demonstrating that our guidance solution ensures a parameterized competitive ratio guarantee relative to the theoretically optimal solution. Our comprehensive experiments further illustrate that the proposed method outperforms benchmark techniques, achieving reduced overhead compared to seven baseline methods.","PeriodicalId":50389,"journal":{"name":"IEEE Transactions on Mobile Computing","volume":"25 2","pages":"2927-2940"},"PeriodicalIF":9.2,"publicationDate":"2025-09-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"145929416","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}