Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2025.3526234
Zhangping Tu;Wujie Zhou;Xiaohong Qian;Weiqing Yan
RGB-D co-salient object detection (RGB-D Co-SOD) aims to locate the most prominent objects within a given collection of correlated RGB and depth images. The development of the Transformer has brought significant advances in RGB-D Co-SOD. However, existing methods overlook the considerable computational and parametric costs of using the Transformer. Although compact models are computationally efficient, they suffer from performance degradation, which limits their practical applicability, because reducing model parameters weakens their feature representation capability. To bridge the performance gap between compact and complex models, we propose a hybrid knowledge distillation (KD) network, HKDNet-S*, for the RGB-D Co-SOD task. The method incorporates positive-negative logits approximation KD to guide the student network (HKDNet-S) in effectively learning the interrelationships among samples with multiple attributes by considering both positive and negative logits. HKDNet-S* primarily consists of the group co-saliency semantic exploration module and the positive-negative logits approximation KD method. Specifically, we employ a trained RGB-D Co-SOD model as the teacher (HKDNet-T) to train HKDNet-S, which has a limited number of parameters, via KD. Through extensive experiments on three challenging benchmark datasets (RGBD CoSal1k, RGBD CoSal150, and RGBD CoSeg183), we demonstrate that HKDNet-S* achieves superior accuracy while using fewer parameters than existing state-of-the-art methods.
{"title":"Hybrid Knowledge Distillation Network for RGB-D Co-Salient Object Detection","authors":"Zhangping Tu;Wujie Zhou;Xiaohong Qian;Weiqing Yan","doi":"10.1109/TSMC.2025.3526234","DOIUrl":"https://doi.org/10.1109/TSMC.2025.3526234","url":null,"abstract":"The aim of RGB-D Co-salient object detection (RGB-D Co-SOD) is to locate the most prominent objects within a provided collection of correlated RGB and depth images. The development of the Transformer has resulted in significant advancements in RGB-D Co-SOD. However, existing methods overlook the considerable computational and parametric costs associated with using the Transformer. Although compact models are computationally efficient, they suffer from performance degradation, which limits their practical applicability. This is because the reduction of model parameters weakens their feature representation capability. To bridge the performance gap between compact and complex models, we propose a hybrid knowledge distillation (KD) network, HKDNet-S*, to perform the RGB-D Co-SOD task. This method incorporates positive-negative logits approximation KD to guide the student network (HKDNet-S) in effectively learning the interrelationships among samples with multiple attributes by considering both positive and negative logits. HKDNet-S* primarily consists of the group cosaliency semantic exploration module and the positive and negative logits approximation KD method. Specifically, we employ a trained RGB-D Co-SOD model as a teacher model (HKDNet-T) to train the HKDNet-S with a limited number of participants using KD. Through extensive experiments on three challenging benchmark datasets (RGBD CoSal1k, RGBD CoSal150, and RGBD CoSeg183), we demonstrate that HKDNet-S* achieves superior accuracy while utilizing fewer parameters in comparison to the existing state-of-the-art methods.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2695-2706"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655038","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2025.3525556
Xiao-Hui Liu;Guang-Hong Yang;Georgi Marko Dimirovski
This article is concerned with the optimal event-based sensor transmission strategy for wireless networked control systems (WNCSs), which aims to minimize the linear quadratic Gaussian (LQG) cost under communication rate constraints. A necessary and sufficient condition for the convergence of the LQG cost is derived. Under this condition, the optimal event-based transmission strategy (OETS) is obtained via a proposed search algorithm. In contrast to existing works, real-time channel state information is not required. Finally, the validity of the results is illustrated through a numerical example.
{"title":"Event-Based Sensor Transmission Strategy for Wireless Networked Control Systems Over Time-Varying Channels","authors":"Xiao-Hui Liu;Guang-Hong Yang;Georgi Marko Dimirovski","doi":"10.1109/TSMC.2025.3525556","DOIUrl":"https://doi.org/10.1109/TSMC.2025.3525556","url":null,"abstract":"This article is concerned with the optimal event-based sensor transmission strategy for wireless networked control systems (WNCSs), which aims to minimize the linear quadratic Gaussian (LQG) cost under the communication rate constraints. A sufficient and necessary condition for the convergence of LQG cost is derived. Then, under this condition, the optimal event-based transmission strategy (OETS) is obtained by proposing a search algorithm. Compared with the existing works, the real-time state information of the channel is not required. Finally, the validity of the results is illustrated through a numerical example.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2528-2536"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655076","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2024.3525296
Jie Su;Yongduan Song
Achieving full-state regulation within a prescribed time for uncertain nonlinear systems under any initial condition is highly desirable yet challenging, and most existing prescribed-time control results are contingent upon infinite feedback gain at the equilibrium. This article presents a new prescribed-time control design method that ensures prescribed-time stability with bounded feedback gain and bounded control action during the entire process of system operation, elegantly circumventing the infinite-feedback-gain problem. Because a nonscaling-based method with structural adaptation is utilized, the proposed control scheme regulates all the states to zero well before the prescribed time, even in the presence of time-varying and mismatched structural uncertainties, substantially reducing the numerical computational complexity induced by scaling-based methods. The theoretical results are supported by two numerical simulations.
{"title":"Prescribed-Time Control With Bounded Feedback Gain: A Nonscaling and Structural Adaptation-Based Approach","authors":"Jie Su;Yongduan Song","doi":"10.1109/TSMC.2024.3525296","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3525296","url":null,"abstract":"Achieving full state regulation within a prescribed-time for uncertain nonlinear systems under any initial condition is rather challenging although highly desirable, whereas most existing prescribed-time control results are literally contingent upon infinite feedback gain at the equilibrium. This article presents a new prescribed-time control design method that is able to ensure prescribed-time stability with bounded feedback gain and bounded control action during the entire process of system operation, elegantly circumventing the infinite feedback gain problem. As a nonscaling-based method with structural adaptation is utilized, the proposed control scheme is able to regulate all the states to zero well before the prescribed-time, yet in the presence of time-varying and mismatched structural uncertainties, substantially reducing the numerical computational complexity induced by scaling-based methods. The theoretical results are supported by two numerical simulations.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2580-2589"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655057","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2025.3526734
Chenquan Gan;Wei Yang;Qingyi Zhu;Meng Li;Deepak Kumar Jain;Vitomir Štruc;Da-Wen Huang
Online social networks (OSNs) facilitate the rapid and extensive spread of rumors. While most existing methods for debunking rumors consider a solitary debunker, they overlook the fact that rumor-mongering and debunking are interdependent, confrontational behaviors. In reality, a debunker must consider the impact of rumor-mongering behavior when making decisions. Moreover, a single rumor-debunking strategy is ineffective in addressing the complexity of the rumor environment in networks. Therefore, this article proposes a hybrid rumor-debunking approach that combines truth dissemination and regulatory measures, based on differential game theory, under the adversarial behaviors of rumor-mongering and debunking. To this end, we first establish a rumor propagation model using node-based modeling techniques that can be applied to any network structure. Next, we mathematically describe and analyze the processes of rumor-mongering and debunking. Finally, we validate the theoretical results of the proposed method through comparative experiments against a random strategy, a uniform strategy, and single-strategy models on real-world datasets collected from Facebook, Twitter, and YouTube. Furthermore, we use two actual rumor events to estimate parameters and predict rumor propagation, confirming the validity and effectiveness of our rumor propagation model.
Title: "Hybrid Rumor Debunking in Online Social Networks: A Differential Game Approach." Published in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 4, pp. 2513-2527.
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2025.3525473
Botao Dong;Longyang Huang;Xiwen Ma;Hongtian Chen;Weidong Zhang
In this article, a novel visionary policy iteration (VPI) framework is proposed to address continuous-action reinforcement learning (RL) tasks. In VPI, a visionary Q-function is constructed by incorporating the successor state into the standard Q-function. Owing to the introduction of the successor state, the proposed visionary Q-function captures information about state transitions within the Markov decision process (MDP), thereby providing a forward-looking perspective that enables a more accurate and foresighted evaluation of potential action outcomes. The relationship between the visionary Q-function and the standard Q-function is analyzed. Subsequently, both the policy evaluation and policy improvement rules in VPI are designed based on the proposed visionary Q-function. A convergence proof for VPI is provided, ensuring that the iterative policy sequence converges to the optimal policy. By combining the VPI framework with the twin delayed deep deterministic policy gradient (TD3) algorithm, a visionary TD3 (VTD3) algorithm is developed. VTD3 is evaluated on multiple continuous-action control tasks from the MuJoCo and OpenAI Gym platforms. Comparative experiments demonstrate that VTD3 achieves more competitive performance than other state-of-the-art (SOTA) RL approaches. Additionally, the experimental results indicate that VPI enhances decision-making capability, reduces Q-function estimation bias, and improves sample efficiency, thereby boosting the performance of existing RL algorithms.
Title: "Visionary Policy Iteration for Continuous Control." Published in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 4, pp. 2707-2720.
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2024.3518625
Yuhang Wang;Yuyan Han;Yuting Wang;Xianpeng Wang;Yiping Liu;Kaizhou Gao
Amid the global push for sustainable development, rising market demands have necessitated a multiregional, multiobjective, and flexible production model. Against this backdrop, this article investigates the multiobjective distributed flow shop group scheduling problem by formulating a mathematical model and introducing an advanced memetic algorithm integrated with reinforcement learning (RLMA). RLMA involves a novel cooperative crossover operation, designed around the nature of the coupled problems, to extensively explore the solution space. Additionally, a Sarsa algorithm enhanced with eligibility traces guides the selection of optimal schemes during the local enhancement phase. To balance convergence and diversity, a solution selection strategy based on penalty-based boundary intersection decomposition is utilized. Furthermore, increasing-efficiency and reducing-consumption strategies integrating a rapid evaluation mechanism are designed by dynamically changing machine speeds to balance economic and sustainability metrics. Comprehensive numerical experiments and comparative analyses demonstrate that the proposed RLMA surpasses existing state-of-the-art algorithms on this complex problem.
Title: "Reinforcement Learning-Assisted Memetic Algorithm for Sustainability-Oriented Multiobjective Distributed Flow Shop Group Scheduling." Published in IEEE Transactions on Systems, Man, and Cybernetics: Systems, vol. 55, no. 4, pp. 2399-2413.
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2024.3524795
Peng Wang;Yingxin Fu;Peide Liu
The graph model for conflict resolution (GMCR) is an effective tool for resolving conflicts: it determines the feasible states by modeling the conflict and then analyzes the behavior of decision-makers (DMs) through stability analysis to find a resolution. This article studies composite DMs (CDMs) and the heterogeneous behaviors of opponents in GMCR. Based on the social relationships between DMs, social network analysis is applied to analyze the individuals in CDMs and to identify the types of heterogeneous behaviors of DMs. Combining the social network with an aggregation operator, this article unifies the preferences of the individuals in a CDM. Subsequently, an identification mechanism is designed to determine the kind of heterogeneous behavior exhibited by opponents. The mixed stabilities are then extended to hesitant fuzzy mixed general meta-rationality (HFMGMR) and hesitant fuzzy mixed symmetric meta-rationality (HFMSMR), and matrix representations of the two stabilities are developed to analyze the equilibria of conflicts. Finally, a conflict over pollution rectification among industrial enterprises is analyzed to demonstrate how social networks can be applied to GMCR with CDMs and heterogeneous opponents. The hesitant fuzzy mixed stability analysis reveals the influence of heterogeneous behaviors in GMCR and shows that different types of DM behavior lead to different equilibrium results.
{"title":"Graph Model for Conflict Resolution Considering Heterogeneous Behavior Based on Hesitant Fuzzy Preference and Social Network Analysis","authors":"Peng Wang;Yingxin Fu;Peide Liu","doi":"10.1109/TSMC.2024.3524795","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3524795","url":null,"abstract":"Graph model for conflict resolution (GMCR) is an effective tool to solve conflicts, which determines the feasible states by modeling the conflict, and then analyzes the behavior of decision-makers (DMs) through stability analysis to find a solution to the conflict. This article studies the composite DMs (CDMs) and the heterogeneous behaviors of opponents in GMCR. Based on the social relationship between DMs, the social network is applied to analyze the individuals in CDMs and to identify the types of heterogeneous behaviors of DMs. Combining social network and aggregating operator, this article unifies the preferences of individuals in a CDM. Subsequently, an identification mechanism is designed to determine the kind of opponents’ heterogeneous behaviors. Then, the mixed stabilities are extended to hesitant fuzzy mixed general meta-rationality (HFMGMR) and hesitant fuzzy mixed symmetric meta-rationality (HFMSMR). The matrix representations of two stabilities are developed to analyze the equilibrium of conflicts. Finally, a conflict in pollution rectification of industry enterprises is analyzed to demonstrate how social networks can be applied to GMCR with CDM and heterogeneous opponents. Hesitant fuzzy mixed stability analysis reveals the influence of heterogeneous behaviors in GMCR. Different types of DM behavior lead to different equilibrium results, which is concluded in this article.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2644-2658"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655073","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-22 | DOI: 10.1109/TSMC.2024.3525038
Tao Dong;Rui He;Huaqing Li;Wenjie Hu;Tingwen Huang
Phase-change memory (PCM) is a novel type of nonvolatile memory that offers low power consumption, high integration, and significant plasticity, making it suitable for neural synapses. In this article, we investigate the global exponential stabilization (GES) of phase-change inertial neural networks (PCINNs) with discrete and distributed time-varying delays. Initially, a piecewise equation is established to model the electrical conductivity of PCM. Based on this, we use PCM to simulate neural synapses, and a class of PCINNs with discrete and distributed time-varying delays is formulated. A continuous state feedback controller is designed to obtain the $\rho$th ($\rho \ge 1$) moment GES conditions of PCINNs in the Filippov sense by using differential inclusion theory, comparison strategies, and inequality techniques. Additionally, the global exponential stability conditions of phase-change Hopfield neural networks are obtained, expressed in the form of an M-matrix. Finally, three simulation examples are provided to verify the effectiveness of the theoretical results.
{"title":"Exponential Stabilization of Phase-Change Inertial Neural Networks With Time-Varying Delays","authors":"Tao Dong;Rui He;Huaqing Li;Wenjie Hu;Tingwen Huang","doi":"10.1109/TSMC.2024.3525038","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3525038","url":null,"abstract":"Phase-change memory (PCM) is a novel type of nonvolatile memory and offers low power consumption, high integration, and significant plasticity, making it suitable for neural synapses. In this article, we investigate the global exponential stabilization (GES) of phase-change inertial neural networks (PCINNs) with discrete and distributed time-varying delays. Initially, a piecewise equation is established to model the electrical conductivity of PCM. Based on this, we use PCM to simulate neural synapses, and a class of PCINNs with discrete and distributed time-varying delays is formulated. A continuous state feedback controller is designed to obtain the <inline-formula> <tex-math>${boldsymbol {rho} {textrm {th}}({rho ge 1})}$ </tex-math></inline-formula> moment GES conditions of PCINNs in the Filippov sense by using differential inclusion theory, comparison strategies, and inequality techniques. Additionally, the global exponential stability conditions of phase-change Hopfield neural networks are obtained, expressed in the form of an M-matrix. Finally, three simulation examples are provided to verify the effectiveness of the theoretical results.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2659-2669"},"PeriodicalIF":8.6,"publicationDate":"2025-01-22","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143654929","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-20 | DOI: 10.1109/TSMC.2024.3524158
I. D. Wijegunawardana;S. M. Bhagya P. Samarakoon;M. A. Viraj J. Muthugala;Mohan Rajesh Elara
Complete coverage path planning (CCPP) is a trending research area in floor-cleaning robotics. CCPP is often approached as an optimization problem, typically solved with factors such as power consumption and time as key objectives. In recent years, the safety of cleaning robots has become a major concern, as hazards can critically limit the performance and lifetime of the robots. However, optimizing safety has rarely been addressed in CCPP. Most path-planning algorithms in the literature identify and avoid the hazards detected by the robot's perception, but such systems can limit the robot's area coverage or risk failure when the robot is near a hazard. Therefore, this article proposes a novel CCPP method that is aware of risk levels, allowing a robot to minimize possible hazards during a coverage task. The proposed CCPP strategy uses reinforcement learning (RL) to obtain a safety-ensured path plan that evaluates, and when necessary avoids, hazardous components of the environment in real time. Furthermore, the failure mode and effects analysis (FMEA) method is adopted to classify the hazards identified in the robot's environment and is suitably modified to evaluate the risk levels, which are used in the reward architecture of the RL agent. Thus, the robot can cross low-risk hazardous regions when necessary to obtain complete coverage. Experimental results showed a noticeable reduction in the overall risk faced by a robot compared with existing methods, while complete coverage was still effectively achieved.
{"title":"Risk-Aware Complete Coverage Path Planning Using Reinforcement Learning","authors":"I. D. Wijegunawardana;S. M. Bhagya P. Samarakoon;M. A. Viraj J. Muthugala;Mohan Rajesh Elara","doi":"10.1109/TSMC.2024.3524158","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3524158","url":null,"abstract":"Complete coverage path planning (CCPP) is a trending research area in floor cleaning robotics. CCPP is often approached as an optimization problem, typically solved by considering factors, such as power consumption and time as key objectives. In recent years, the safety of cleaning robots has become a major concern, which can critically limit the performance and lifetime of the robots. However, so far, optimizing safety has rarely been addressed in CCPP. Most of the path-planning algorithms in literature tend to identify and avoid the hazards detected by the robot’s perception. However, these systems can limit the area coverage of the robot or pose a risk of failing when the robot is near a hazard. Therefore, this article proposes a novel CCPP method with the awareness of risk levels for a robot to minimize possible hazards to the robot during a coverage task. The proposed CCPP strategy uses reinforcement learning (RL) to obtain a safety-ensured path plan that evaluates and when necessary, avoid the hazardous components in their environment in real time. Furthermore, the failure mode and effect analysis (FMEA) method has been adopted to classify the hazards identified in the environment of the robot and suitably modified to evaluate the risk levels. These risk levels are used in the reward architecture of the RL. Thus, the robot can cross the low-risk hazardous environments if it is necessary to obtain complete coverage. Experimental results showed a noticeable reduction in overall risk faced by a robot compared to the existing methods, while also effectively achieving complete coverage.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2476-2488"},"PeriodicalIF":8.6,"publicationDate":"2025-01-20","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143655094","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date: 2025-01-17 | DOI: 10.1109/TSMC.2024.3524390
Yi Zhang;Sheng Huang;Luwen Huangfu;Daniel Dajun Zeng
Interest in few-shot learning (FSL) has grown recently, but the value of feature learning, which bridges the gap between base and novel classes, remains largely understudied. The limited availability of labeled samples for each class poses a major challenge. To tackle this, we propose a simple yet effective approach called deep discriminative handcrafted feature regression (DDHFR) that explores intrinsic information and selects more discriminative features in few-shot data by mining knowledge from classical handcrafted features. To explore intrinsic information, we design several deep handcrafted feature regression (DHFR) modules and plug them separately into different layers of the backbone, using feature-engineering knowledge to optimize feature learning at different granularities. To achieve discriminative feature selection, we incorporate an auxiliary classifier (AC) into each DHFR module to enhance the acquisition of discriminative information. Furthermore, we employ self-distillation to boost the classification ability of the ACs. Experimental results with three backbones on three datasets show that DDHFR generally improves the performance of existing FSL methods, raising recognition accuracy by 1.16% on average in two common few-shot settings.
{"title":"Learning Feature Exploration and Selection With Handcrafted Features for Few-Shot Learning","authors":"Yi Zhang;Sheng Huang;Luwen Huangfu;Daniel Dajun Zeng","doi":"10.1109/TSMC.2024.3524390","DOIUrl":"https://doi.org/10.1109/TSMC.2024.3524390","url":null,"abstract":"Interest in few-shot learning (FSL) has grown recently, but the value of feature learning, which bridges the gap between base and novel classes, remains largely understudied. The limited availability of labeled samples for each class poses a major challenge. To tackle this, we propose a simple yet effective approach called deep discriminative handcrafted feature regression (DDHFR) to explore intrinsic information and select improved discriminative features in few-shot data by mining knowledge from classical handcrafted features. To explore intrinsic information, we design several deep handcrafted feature regression (DHFR) modules and plugged them separately into different layers of the backbone to use feature engineering knowledge for feature learning optimization at different granularities. To achieve discriminative feature selection, we incorporate an auxiliary classifier (AC) into each DHFR module to enhance the acquisition of discriminative information. Furthermore, we employed self-distillation to boost ability of ACs ot be classified. Experimental results in three backbones on three datasets show that DDHFR can generally improve the performance of existing FSL methods. On average, it improves the recognition accuracy by 1.16% in two common few-shot settings.","PeriodicalId":48915,"journal":{"name":"IEEE Transactions on Systems Man Cybernetics-Systems","volume":"55 4","pages":"2599-2610"},"PeriodicalIF":8.6,"publicationDate":"2025-01-17","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143654928","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}