Pub Date : 2024-05-29 DOI: 10.1007/s11063-024-11641-w
Shaohua Dong, Xiaochao Fan, Xinchun Ma
Multimodal sentiment analysis is a downstream branch of sentiment analysis that currently attracts considerable attention. Previous work in multimodal sentiment analysis has focused on the representation and fusion of modalities, capturing the underlying semantic relationships between modalities by considering contextual information. While this approach is feasible for comments with simple context, more complex comments require the integration of external knowledge to obtain more accurate sentiment information. However, incorporating external knowledge into sentiment analysis to enhance information complementarity has not been thoroughly investigated. To address this, we propose a multichannel cross-modal feedback interaction model that incorporates a knowledge graph into multimodal sentiment analysis. The proposed model consists of two main components: a cross-modal feedback recurrent interaction module and an external knowledge module for capturing latent information. The cross-modal interaction module employs a self-feedback mechanism during network training: it extracts feature representations of each modality and uses them to mask sensory inputs, allowing the model to perform feedback-based feature masking. The external knowledge module captures latent semantic representations in the textual data through knowledge graph embedding. Finally, a global feature fusion module integrates the multichannel multimodal information. On two publicly available datasets, our method demonstrates good performance in terms of accuracy and F1 score compared with state-of-the-art models and several baselines.
Title: Multichannel Multimodal Emotion Analysis of Cross-Modal Feedback Interactions Based on Knowledge Graph. Journal: Neural Processing Letters.
Pub Date : 2024-05-29 DOI: 10.1007/s11063-024-11649-2
Ruikun Zhang, Shangyu Sang, Jingyuan Zhang, Xue Lin
This paper proposes a quantized model-free adaptive iterative learning control (MFAILC) algorithm to solve the bipartite containment tracking problem of unknown nonlinear multi-agent systems, where the interactions between agents are both cooperative and antagonistic. To design the controller, each agent's dynamics are transformed into a linear data model by the dynamic linearization method, and a quantized MFAILC algorithm is then established from the quantized values of the relative output measurements. The designed controller depends only on the input and output data of the agent. We prove that under the quantized MFAILC algorithm, the multi-agent system achieves bipartite containment, that is, the output trajectories of the followers converge to the convex hull formed by the leaders' trajectories and the leaders' symmetric trajectories. Finally, we provide simulations to illustrate the effectiveness of our theoretical results.
Title: Quantized Iterative Learning Bipartite Containment Tracking Control for Unknown Nonlinear Multi-agent Systems. Journal: Neural Processing Letters.
Pub Date : 2024-05-28 DOI: 10.1007/s11063-024-11640-x
Waleed Razzaq, Kashif Raza
The development of neural agents that make high-stakes decisions while interacting with complex environments, such as video games, is an important aspect of AI research with numerous potential applications. Reinforcement learning combined with deep learning architectures (DRL) has shown remarkable success in various genres of games. The performance of DRL depends heavily on the neural networks at its core. Although these algorithms perform well in offline testing, their performance deteriorates in noisy and sub-optimal conditions, creating safety and security issues. To address these issues, we propose a hybrid deep learning architecture that combines a traditional convolutional neural network with worm brain-inspired neural circuit policies. This allows the agent to learn key coherent features from the environment and interpret its dynamics. The resulting DRL agent not only reached an optimal policy quickly, but was also the most noise-resilient, with the highest success rate. Our research indicates that only 20 control neurons (12 inter-neurons and 8 command neurons) are sufficient to achieve competitive results. We implemented and analyzed the agent in the popular video game Doom, demonstrating its effectiveness in practical applications.
Title: Neural Circuit Policies for Virtual Character Control. Journal: Neural Processing Letters.
Pub Date : 2024-05-28 DOI: 10.1007/s11063-024-11629-6
Juan Yang, Guanghong Zhou, Ronggui Wang, Lixia Xue
Existing pre-trained models have yielded promising results in terms of computational time reduction. However, these models focus only on pruning simple sentences or less salient words, while neglecting the treatment of relatively complex sentences. It is frequently these sentences that cause the loss of model accuracy, which shows that the adaptation of existing models is one-sided. To address this issue, we propose a sample-adaptive training and inference model. Specifically, complex samples are extracted from the training datasets, and a dedicated data augmentation module is trained to extract their global and local semantic information. During inference, simple samples can exit the model early via the Sample Adaptive Exit Mechanism; normal samples pass through the whole backbone model before inference; and complex samples are processed by the Characteristic Enhancement Module after passing through the backbone model. In this way, all samples are processed adaptively. Our extensive experiments on natural language processing classification datasets demonstrate that our method enhances model accuracy and reduces inference time on multiple datasets. Moreover, our method is transferable and can be applied to multiple pre-trained models.
Title: Sample-Adaptive Classification Inference Network. Journal: Neural Processing Letters.
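The three-way routing described in the abstract above (early exit for simple samples, full backbone for normal samples, extra processing for complex samples) can be illustrated with a minimal sketch. The abstract does not state the exit criterion, so the entropy test, both threshold values, and the function names `entropy` and `route_sample` are assumptions made for illustration only, not the authors' method.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def route_sample(layer_probs, exit_threshold=0.3, complex_threshold=0.6):
    """Decide how far one sample travels through a stack of classifiers.

    layer_probs: class distributions produced by per-layer classifiers,
    ordered from the shallowest layer to the final backbone layer.
    Both thresholds are hypothetical values chosen for this sketch.
    Returns (route, exit_layer_index, predicted_class).
    """
    last = len(layer_probs) - 1
    # A confident (low-entropy) intermediate prediction lets the sample exit early.
    for i, probs in enumerate(layer_probs[:-1]):
        if entropy(probs) < exit_threshold:
            return "simple", i, probs.index(max(probs))
    # Otherwise the sample runs through the whole backbone.
    final = layer_probs[last]
    # A still-uncertain final prediction marks the sample as complex, which is
    # where an enhancement module would take over (not modelled here).
    route = "complex" if entropy(final) > complex_threshold else "normal"
    return route, last, final.index(max(final))


print(route_sample([[0.98, 0.02], [0.99, 0.01]]))
print(route_sample([[0.60, 0.40], [0.90, 0.10]]))
print(route_sample([[0.50, 0.50], [0.55, 0.45]]))
```

Lowering `exit_threshold` makes fewer samples exit early, trading inference time for accuracy; the right values would have to be tuned per dataset.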
Pub Date : 2024-05-28 DOI: 10.1007/s11063-024-11648-3
Lingyan Wang, Huaiqin Wu, Jinde Cao
This paper investigates finite-time topology identification and synchronization for fractional singularly perturbed complex networks (FSPCNs). Firstly, a convergence principle is developed for continuously differentiable functions. Secondly, a dynamic event-triggered mechanism (DETM) is designed to achieve network synchronization, and a topology observer is developed to identify the network topology. Thirdly, under the designed DETM, by constructing a Lyapunov functional and applying inequality analysis techniques, a finite-time topology identification and synchronization condition is established in the form of matrix inequalities. In addition, it is proved that Zeno behavior can be effectively excluded. Finally, the effectiveness of the main results is verified by an application example.
Title: An Observer-Based Topology Identification and Synchronization in Finite Time for Fractional Singularly Perturbed Complex Networks via Dynamic Event-Triggered Control. Journal: Neural Processing Letters.
Pub Date : 2024-05-28 DOI: 10.1007/s11063-024-11634-9
Qing Ye, Zihan Song, Yuqi Zhao, Yongmei Zhang
Video anomaly event detection is crucial for analyzing surveillance videos. Existing methods have limitations: frame-level detection fails to remove background interference, and object-level methods overlook object-environment interaction. To address these issues, this paper proposes a novel video anomaly event detection algorithm based on a dual-channel autoencoder with key region feature enhancement. The goal is to preserve valuable information in the global context while focusing on regions where anomalies occur frequently. Firstly, a key region extraction network is proposed to perform foreground segmentation on video frames, eliminating background redundancy. Secondly, a dual-channel autoencoder is designed to enhance the features of key regions, enabling the model to extract more representative features. Finally, channel attention modules are inserted between the deconvolution layers of the decoder to enhance the model’s perception and discrimination of valuable information. Compared with existing methods, our approach accurately locates and focuses on regions where anomalies occur frequently, improving the accuracy of anomaly event detection. Extensive experiments are conducted on the UCSD ped2, CUHK Avenue, and SHTech Campus datasets, and the results validate the effectiveness of the proposed method.
Title: Dual-Channel Autoencoder with Key Region Feature Enhancement for Video Anomalous Event Detection. Journal: Neural Processing Letters.
Pub Date : 2024-05-28 DOI: 10.1007/s11063-024-11633-w
Meiqi Wang, Kangyu Qiao, Shuyue Xing, Caixia Yuan, Xiaojie Wang
Document information selection is an essential part of document-grounded dialogue tasks, and more accurate information selection leads to more appropriate dialogue responses. Existing works have achieved excellent results by using dialogue history at multiple granularities, indicating the effectiveness of multi-level historical information. However, these works often focus on exploring the hierarchical information of the dialogue history while neglecting multi-granularity information in the response, which has an important impact on the decoding process. Therefore, this paper proposes a model for document information selection based on multi-granularity responses. By integrating the document selection results at the response word level and the semantic unit level, the model enhances its capability in knowledge selection and produces better responses. To divide the response into semantic units, we propose two division methods, static and dynamic. Experiments on two public datasets show that our models combining static or dynamic semantic unit levels significantly outperform baseline models.
Title: Enhancing Document Information Selection Through Multi-Granularity Responses for Dialogue Generation. Journal: Neural Processing Letters.
Pub Date : 2024-05-27 DOI: 10.1007/s11063-024-11618-9
Na Li, Sen Xu, Heyang Xu, Xiufang Xu, Naixuan Guo, Na Cai
Clustering ensembles can obtain superior final results by combining multiple different clustering results. The qualities of the points, clusters, and partitions play crucial roles in the consistency of the clustering process. However, existing methods mostly focus on one or two of these aspects, without considering all three comprehensively. This paper proposes a three-level weighted clustering ensemble algorithm, namely the unified point-cluster-partition algorithm (PCPA). The first step of the PCPA is to generate the adjacency matrix from the base clusterings. Then, the central step is to obtain the weighted adjacency matrix by successively weighting three layers, i.e., points, clusters, and partitions. Finally, the consensus clustering is obtained by the average link method. Three performance indexes, namely F, NMI, and ARI, are used to evaluate the accuracy of the proposed method. The experimental results show that, firstly, as expected, the proposed three-layer weighted clustering ensemble improves each evaluation index by an average of 22.07% compared with the direct clustering ensemble without weighting; and secondly, compared with seven other methods, PCPA achieves better clustering results, ranking first in 28 of 33 cases.
Title: A Point-Cluster-Partition Architecture for Weighted Clustering Ensemble. Journal: Neural Processing Letters.
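The pipeline in the abstract above (adjacency matrix from base clusterings, weighting, average-link consensus) can be sketched in a few lines. This is a simplified illustration rather than PCPA itself: it applies only an optional partition-level weight, whereas the paper also weights points and clusters in ways the abstract does not specify. The function names `co_association` and `average_link` and the toy data are assumptions.

```python
def co_association(partitions, weights=None):
    """Weighted fraction of base clusterings that place points i and j together."""
    n = len(partitions[0])
    weights = weights or [1.0] * len(partitions)
    total = sum(weights)
    matrix = [[0.0] * n for _ in range(n)]
    for labels, w in zip(partitions, weights):
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    matrix[i][j] += w / total
    return matrix


def average_link(similarity, k):
    """Merge the two clusters with the highest mean similarity until k remain."""
    clusters = [[i] for i in range(len(similarity))]
    while len(clusters) > k:
        best, pair = -1.0, None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                s = sum(similarity[i][j] for i in clusters[a] for j in clusters[b])
                s /= len(clusters[a]) * len(clusters[b])
                if s > best:
                    best, pair = s, (a, b)
        a, b = pair
        clusters[a] += clusters.pop(b)  # b > a, so index a stays valid
    return sorted(sorted(c) for c in clusters)


# Three base clusterings of six points; the second one disagrees about point 2.
base = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 1, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
]
consensus = average_link(co_association(base), k=2)
print(consensus)
```

Cluster labels are arbitrary per partition, which is why the third base clustering counts as agreeing with the first even though its labels are swapped.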
Pub Date : 2024-05-27 DOI: 10.1007/s11063-024-11632-x
Xiaozhu Gao, Jinhui Liu, Bo Wan, Lingling An
Hierarchical reinforcement learning (HRL) has achieved remarkable success and significant progress in complex and long-term decision-making problems. However, HRL training typically entails substantial computational costs and an enormous number of samples. One effective approach to tackle this challenge is hierarchical reinforcement learning from demonstrations (HRLfD), which leverages demonstrations to expedite the training process of HRL. The effectiveness of HRLfD is contingent upon the quality of the demonstrations; hence, suboptimal demonstrations may impede efficient learning. To address this issue, this paper proposes a reachability-based reward shaping (RbRS) method to alleviate the negative interference of suboptimal demonstrations for the HRL agent. The novel HRLfD algorithm based on RbRS is named HRLfD-RbRS, which incorporates the RbRS method to enhance the learning efficiency of HRLfD. Moreover, with the help of this method, the learning agent can explore better policies under the guidance of the suboptimal demonstration. We evaluate the proposed HRLfD-RbRS algorithm on various complex robotic tasks, and the experimental results demonstrate that our method outperforms current state-of-the-art HRLfD algorithms.
Title: Hierarchical Reinforcement Learning from Demonstration via Reachability-Based Reward Shaping. Journal: Neural Processing Letters.
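As a rough illustration of how reward shaping can use a demonstration without forcing the agent to copy it, the sketch below uses standard potential-based shaping, r + γΦ(s') − Φ(s), as a stand-in. The abstract above does not give the RbRS formula, so the potential `demo_progress_potential`, the `radius` parameter, and the 1-D state space are all hypothetical choices made for this example.

```python
def demo_progress_potential(state, demo_states, radius=1.0):
    """Hypothetical potential: how far along the demonstration the state is.

    Returns (index + 1) / len(demo_states) for the latest demonstration state
    within `radius` of `state`, or 0.0 if no demonstration state is that close.
    States are plain floats (a 1-D toy setting).
    """
    best = 0.0
    for idx, demo in enumerate(demo_states):
        if abs(state - demo) <= radius:
            best = (idx + 1) / len(demo_states)
    return best


def shaped_reward(reward, state, next_state, demo_states, gamma=0.99):
    """Potential-based shaping: reward + gamma * phi(next_state) - phi(state)."""
    phi_s = demo_progress_potential(state, demo_states)
    phi_next = demo_progress_potential(next_state, demo_states)
    return reward + gamma * phi_next - phi_s


demo = [0.0, 2.0, 4.0, 6.0]  # toy demonstration trajectory
print(shaped_reward(0.0, 0.5, 3.5, demo, gamma=1.0))   # moving along the demo
print(shaped_reward(0.0, 3.5, 0.5, demo, gamma=1.0))   # moving backwards
```

Because the bonus is a difference of potentials, it rewards progress toward states the demonstration reaches rather than exact imitation, which is one way a suboptimal demonstration can guide exploration without dominating the learned policy.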
Pub Date : 2024-05-13 DOI: 10.1007/s11063-024-11482-7
Jiawei Luo, Lei Yu, Bangshu Xiong
To solve a general time-variant Sylvester equation, two novel zeroing neural network (ZNN) solutions are designed and analyzed. In existing ZNN solutions, the design convergence parameters (CPs) that multiply the nonlinear activation functions are pivotal because they essentially determine the convergence speed. Nonetheless, the CPs are generally set to constants, which is unrealistic because CPs are generally time-variant under practical hardware conditions, particularly when external noise intrudes. Consequently, many variant-parameter ZNNs (VP-ZNNs) with time-variant CPs have been proposed. Compared with fixed-parameter ZNNs, these VP-ZNNs have been shown to converge better; the downside is that their CPs generally increase over time and may eventually become infinite. Infinitely large CPs make the ZNN schemes non-robust, which is unacceptable in practice when external noise is injected. Moreover, even though VP-ZNNs converge over time, the growth of the CPs wastes substantial computing resources. Motivated by these factors, this paper proposes two hyperbolic tangent-type variant-parameter robust ZNNs (HTVPR-ZNNs). Both the preassigned convergence time of the HTVPR-ZNNs and the upper bound of the CPs are theoretically investigated. Numerical simulations substantiate the validity of the HTVPR-ZNN solutions.
Title: Hyperbolic Tangent-Type Variant-Parameter and Robust ZNN Solutions for Resolving Time-Variant Sylvester Equation in Preassigned-Time. Journal: Neural Processing Letters.
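For orientation, the generic ZNN design that the abstract above builds on can be written out as follows. This is the standard construction under one common sign convention for the Sylvester equation, not the paper's specific HTVPR-ZNN model; the symbols A, B, C, X, E, Φ, γ, and γ₀ are notation assumed here, and the bounded parameter in the last line is a purely hypothetical example of a tanh-type CP.

```latex
% Time-variant Sylvester equation (one common form) and its error function
A(t)X(t) + X(t)B(t) = C(t), \qquad
E(t) = A(t)X(t) + X(t)B(t) - C(t)

% ZNN design formula: drive E(t) to zero through an activation \Phi
% scaled by a convergence parameter \gamma(t) > 0
\dot{E}(t) = -\gamma(t)\,\Phi\bigl(E(t)\bigr)

% Expanding \dot{E}(t) gives the implicit dynamics solved for \dot{X}(t)
A\dot{X} + \dot{X}B = -\dot{A}X - X\dot{B} + \dot{C}
                      - \gamma(t)\,\Phi\bigl(AX + XB - C\bigr)

% Hypothetical bounded, tanh-type convergence parameter (illustration only)
\gamma(t) = \gamma_0\bigl(1 + \tanh(t)\bigr) \le 2\gamma_0
```

A constant γ recovers a fixed-parameter ZNN, an unbounded increasing γ(t) corresponds to the VP-ZNNs criticized in the abstract, and a bounded γ(t) such as the last line avoids the unbounded growth.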