Pub Date : 2024-02-19 DOI: 10.1109/TETCI.2024.3360308
Zhidong Wang;He Huang
This paper proposes a second-order complex-valued incremental learning (CVIL) algorithm for the structure optimization of fully complex-valued neural networks (FCVNNs). The main purpose of this study is to integrate the structure optimization and parameter learning of FCVNNs into a unified framework such that good generalization is guaranteed. A hybrid training strategy is first developed for FCVNNs with a fixed structure. By introducing complex-valued sparse matrices and a generalized augmented hidden output matrix, the nonlinear parameters between the hidden and input neurons are trained by the complex-valued Levenberg-Marquardt (CLM) algorithm, and the linear parameters between the output and hidden neurons are obtained by the complex-valued least squares (CLS) algorithm. Starting with an initial FCVNN, hidden neurons are added one by one whenever training reaches a plateau. It is theoretically shown that the objective function decreases monotonically after a hidden neuron is added and that subsequent learning continues directly from the latest training results. Repeated training is thus avoided, which improves training efficiency. Experimental results on channel modulation identification and real-valued pattern classification tasks demonstrate that the developed algorithm is superior to some existing ones for the training of FCVNNs.
{"title":"Second-Order Structure Optimization of Fully Complex-Valued Neural Networks","authors":"Zhidong Wang;He Huang","doi":"10.1109/TETCI.2024.3360308","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3360308","journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence"},"PeriodicalIF":5.3,"publicationDate":"2024-02-19","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2024-02-16 DOI: 10.1109/TETCI.2024.3360290
Yihan Wang;Yongfang Wang;Tengyao Cui;Zhijun Fang
Video-based Point Cloud Compression (V-PCC) was proposed by the Moving Picture Experts Group (MPEG) to standardize Point Cloud Compression (PCC). The main idea of V-PCC is to project the Dynamic Point Cloud (DPC) into auxiliary information, occupancy, geometry, and attribute videos, which are then encoded with High Efficiency Video Coding (HEVC), Versatile Video Coding (VVC), etc. Compared with previous PCC algorithms, V-PCC achieves a significant improvement in compression efficiency, but at the cost of substantial computational complexity. To solve this problem, this paper proposes a fast V-PCC method to decrease the coding complexity. Taking into account the coding characteristics of V-PCC, the geometry and attribute maps are first classified into occupied and unoccupied blocks. Moreover, we analyze Coding Unit (CU) splitting for the geometry and attribute maps. Finally, we propose fast V-PCC algorithms based on an early termination algorithm and a transformer model: the early termination method is applied to low-complexity blocks in the geometry and attribute maps, and the transformer model-based fast method is designed to predict the optimal CU splitting modes for the occupied blocks of the attribute map. The proposed algorithms are implemented with typical DPC sequences on the Test Model Category 2 (TMC2). The experimental results show that the proposed method reduces the average coding time by 56.39% and 55.10% for the geometry and attribute maps, respectively, with a negligible change in Bjontegaard-Delta bitrate (BD-rate) compared with the anchor method.
{"title":"Fast Video-Based Point Cloud Compression Based on Early Termination and Transformer Model","authors":"Yihan Wang;Yongfang Wang;Tengyao Cui;Zhijun Fang","doi":"10.1109/TETCI.2024.3360290","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3360290","journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence"},"PeriodicalIF":5.3,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","RegionNum":3,"RegionCategory":"Computer Science"}
Pub Date : 2024-02-16 DOI: 10.1109/TETCI.2024.3360305
Pengcheng Dai;He Wang;Huazhou Hou;Xusheng Qian;Wenwu Yu
This paper investigates the application of a multi-agent reinforcement learning (MARL) algorithm to the joint spectrum and power allocation problem (JSPAP) in wireless networks. The objective of JSPAP is to optimize the subband selection and transmit power levels of the links so as to maximize the sum-rate utility function. To address the JSPAP with discrete subband selection and continuous power allocation, most existing algorithms rely on a centralized optimizer and instantaneous global channel state information, which can be challenging to implement in large wireless networks with time-varying subbands. To overcome this limitation, a two-stage MARL algorithm is proposed, which comprises a top layer network for selecting subbands across all links and a bottom layer network for determining the transmit power levels of all transmitters. By utilizing the value decomposition technique in the top layer network, the links can cooperatively select transmission subbands, effectively resolving non-stationarity issues in the wireless network. Additionally, in the bottom layer network of the proposed two-stage MARL algorithm, each transmitter selects its transmit power level based solely on local information, thereby reducing the computational burden. Empirical experiments demonstrate the effectiveness of the proposed two-stage MARL algorithm in comparison with state-of-the-art RL algorithms and fractional programming algorithms.
{"title":"Joint Spectrum and Power Allocation in Wireless Network: A Two-Stage Multi-Agent Reinforcement Learning Method","authors":"Pengcheng Dai;He Wang;Huazhou Hou;Xusheng Qian;Wenwu Yu","doi":"10.1109/TETCI.2024.3360305","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3360305","journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence"},"PeriodicalIF":5.3,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","RegionNum":3,"RegionCategory":"Computer Science"}
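The abstract on joint spectrum and power allocation does not spell out its utility function. As a rough illustration only, the sketch below evaluates the common textbook sum-rate, the sum of log2(1 + SINR) over links, for one choice of subbands and powers. The function name, the `gain` matrix layout, the single gain value shared by all subbands, and the `noise` default are assumptions made for this example and are not taken from the paper.

```python
import math


def sum_rate(subband, power, gain, noise=1e-3):
    """Textbook sum-rate for one allocation (illustrative, not the paper's code).

    subband[i]: index of the subband chosen by link i (discrete decision)
    power[i]:   transmit power of link i (continuous decision)
    gain[j][i]: channel gain from transmitter j to receiver i
                (assumed identical on every subband for simplicity)
    noise:      receiver noise power (hypothetical default)
    """
    n = len(subband)
    total = 0.0
    for i in range(n):
        # Only links transmitting on the same subband interfere with link i.
        interference = sum(
            gain[j][i] * power[j]
            for j in range(n)
            if j != i and subband[j] == subband[i]
        )
        sinr = gain[i][i] * power[i] / (noise + interference)
        total += math.log2(1.0 + sinr)
    return total


if __name__ == "__main__":
    gain = [[1.0, 0.5], [0.5, 1.0]]
    power = [1.0, 1.0]
    print(sum_rate([0, 0], power, gain, noise=1.0))  # both links share a subband
    print(sum_rate([0, 1], power, gain, noise=1.0))  # links use separate subbands
```

In this toy setting, moving the two links onto separate subbands removes the mutual interference and raises the sum-rate, which is the kind of coupling between the discrete and continuous decisions that the allocation problem has to resolve.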
Pub Date : 2024-02-16 DOI: 10.1109/TETCI.2024.3360322
Lun Li;Xuebo Zhang;Chenxu Qian;Runhua Wang;Minghui Zhao
In this paper, we present a novel curriculum reinforcement learning method that can automatically generate a high-performance autopilot controller for a 6-degree-of-freedom (6-DOF) aircraft with an unknown dynamic model, which is difficult to handle with traditional control methods. In this method, a sigmoid-like learning curve is introduced to generate goals (the desired heading, altitude, and velocity) for the autopilot from easy to hard. The shape of the learning curve is adjusted adaptively to follow the training process of Proximal Policy Optimization (PPO). In addition, the conflict between multiple goals in autopilot training is resolved by designing an adaptive reward function. Furthermore, large oscillations in the control inputs are avoided by passing the outputs of PPO through a first-order filter to ensure smoothness. A series of simulation results show that the proposed method not only noticeably improves the success rate and stability of training but also achieves superior settling time and robustness compared with traditional PID control and a state-of-the-art (SOTA) method. Finally, applications of the controller, including navigation, pursuit-evasion, and dogfighting tasks, are demonstrated to show its applicability to multiple tasks.
{"title":"Autopilot Controller of Fixed-Wing Planes Based on Curriculum Reinforcement Learning Scheduled by Adaptive Learning Curve","authors":"Lun Li;Xuebo Zhang;Chenxu Qian;Runhua Wang;Minghui Zhao","doi":"10.1109/TETCI.2024.3360322","DOIUrl":"https://doi.org/10.1109/TETCI.2024.3360322","journal":{"name":"IEEE Transactions on Emerging Topics in Computational Intelligence"},"PeriodicalIF":5.3,"publicationDate":"2024-02-16","publicationTypes":"Journal Article","isOpenAccess":false,"platform":"Semanticscholar","RegionNum":3,"RegionCategory":"Computer Science"}
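The autopilot abstract mentions smoothing the PPO outputs with a first-order filter but does not give its form. One common discrete-time choice is exponential smoothing, sketched below as an assumption rather than as the authors' implementation. The function name, the smoothing factor `alpha`, and the initial state are hypothetical.

```python
def first_order_filter(raw_actions, alpha=0.8, initial=0.0):
    """Discrete first-order low-pass filter (exponential smoothing).

    raw_actions: sequence of raw controller outputs, e.g. from a policy
    alpha:       weight on the previous filtered value, 0 <= alpha < 1
                 (hypothetical default; larger means smoother but slower)
    initial:     filter state before the first sample (assumed 0.0)
    """
    smoothed = []
    previous = initial
    for action in raw_actions:
        # y[k] = alpha * y[k-1] + (1 - alpha) * u[k]
        previous = alpha * previous + (1.0 - alpha) * action
        smoothed.append(previous)
    return smoothed


if __name__ == "__main__":
    # A policy output that flips sign every step is strongly attenuated.
    print(first_order_filter([1.0, -1.0, 1.0, -1.0]))
```

A larger `alpha` suppresses oscillations more strongly at the price of a slower response to genuine command changes, so in practice it would be tuned against the settling-time requirement.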
Pub Date : 2024-02-16 DOI: 10.1109/TETCI.2024.3360316
Abhishek Moitra;Abhiroop Bhattacharjee;Youngeun Kim;Priyadarshini Panda
In practical cloud-edge scenarios, where a resource-constrained edge performs data acquisition and a cloud system (having sufficient resources) performs inference tasks with a deep neural network (DNN), adversarial robustness is critical for reliability and ubiquitous deployment. Adversarial detection is a prime adversarial defense technique used in prior literature. However, in prior detection works, the detector is attached to the classifier model, and the detector and classifier work in tandem to perform adversarial detection, which incurs a high computational overhead that the low-power edge cannot afford. Therefore, prior works can only perform adversarial detection at the cloud and not at the edge. This means that in case of adversarial attacks, the unfavourable adversarial samples must be communicated to the cloud, which leads to energy wastage at the edge device. Therefore, a low-power edge-friendly adversarial detection method is required to improve the energy efficiency of the edge and the robustness of the cloud-based classifier. To this end, RobustEdge proposes Quantization-enabled Energy Separation (QES) training with “early detection and exit” to perform edge-based low-cost adversarial detection. The QES-trained detector implemented at the edge blocks adversarial data transmission to the classifier model, thereby improving the adversarial robustness and energy efficiency of the Cloud-Edge system. Through extensive experiments on CIFAR10, CIFAR100 and TinyImagenet, we find that 16-bit and 12-bit quantized detectors achieve a high AUC score $>$