Path planning in complex dynamic environments is an active research topic in robotics. Existing methods often suffer from inadequate dynamic obstacle avoidance and low exploration efficiency. These issues arise primarily from inconsistencies caused by insufficient use of the environmental map during actual path planning. To address these challenges, we propose an algorithm that integrates an enhanced A* algorithm with an optimized dynamic window approach (DWA). The enhanced A* algorithm improves path smoothness and speeds up global exploration, while the optimized DWA improves local avoidance of both static and dynamic obstacles. We performed simulation experiments in MATLAB and further experiments in realistic dynamic environments simulated with Gazebo. Simulation results indicate that, compared with the traditional A* algorithm, our method reduces the number of traversed grid cells by 25% and planning time by 23% in global planning. In dynamic obstacle avoidance, our approach shortens path length by 2.7% and reduces time by 19.2% compared with the traditional DWA.
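For reference, the traditional A* baseline that the comparison above refers to can be sketched as follows. This is a minimal illustration of plain A* only, not the enhanced variant or the DWA stage, whose details are not given in the abstract. The 4-connected grid, the Manhattan heuristic, and the `expanded` counter (a stand-in for "traversed grid cells") are assumptions made for the example.

```python
import heapq


def astar(grid, start, goal):
    """Plain 4-connected A* on a grid of 0 (free) and 1 (obstacle).

    Returns (path, expanded): the list of cells from start to goal
    (or None if unreachable) and the number of cells expanded.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance is admissible for unit-cost 4-connected moves.
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]  # entries are (f, g, cell)
    came_from = {start: None}
    g_cost = {start: 0}
    closed = set()
    expanded = 0

    while open_heap:
        _, g, cell = heapq.heappop(open_heap)
        if cell in closed:
            continue  # stale heap entry for an already expanded cell
        closed.add(cell)
        expanded += 1
        if cell == goal:
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1], expanded
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < g_cost.get(nxt, float("inf")):
                    g_cost[nxt] = new_g
                    came_from[nxt] = cell
                    heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None, expanded
```

Comparing `expanded` between a baseline like this and a modified search is one simple way to quantify a reduction in traversed cells.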
{"title":"Dynamic path planning fusion algorithm with improved A* algorithm and dynamic window approach","authors":"Jianfeng Zhang, Jielong Guo, Daxin Zhu, Yufang Xie","doi":"10.1007/s13042-024-02377-z","DOIUrl":"https://doi.org/10.1007/s13042-024-02377-z","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-13","publicationTypes":"Journal Article"}
Pub Date: 2024-09-12 DOI: 10.1007/s13042-024-02378-y
Hongzhi Chen, Fu Zhang, Qinghui Li, Xiang Li, Yifan Ding, Daqing Zhang, Jingwei Cheng, Xing Wang
Commonsense knowledge is essential for inference and retrieval in many artificial intelligence applications, including natural language processing and expert systems. However, a large amount of valuable commonsense knowledge is only implicit in, or missing from, commonsense knowledge graphs (KGs). Commonsense knowledge graph completion (CKGC) addresses this incompleteness by inferring the missing parts of commonsense triples, e.g., (?, HasPrerequisite, turn computer on) or (get onto web, HasPrerequisite, ?). Some existing methods attempt to learn as much entity semantic information as possible by exploiting the structural and semantic context of entities to improve CKGC performance. However, we found that existing models attend only to the entities and relations of commonsense triples and ignore the confidence (weight) information attached to those triples. In this paper we introduce commonsense triple confidence into CKGC and propose a confidence-aware encoder–decoder CKGC model. In the encoding stage, we propose a method to incorporate triple confidence into RGCN (relational graph convolutional network), so that the encoder can learn a more accurate semantic representation of a triple under the triple confidence constraints. Moreover, commonsense KGs are usually sparse, because a large number of entities in the commonsense triples have an in-degree of 1. We therefore add a new relation (called a similar edge) between two similar entities to compensate for this sparsity. In the decoding stage, considering that entities in commonsense triples are sentence-level entities (e.g., the tail entity turn computer on mentioned above), we propose a joint decoding model that fuses the existing InteractE and ConvTransE models.
Experiments show that our new model outperforms previous competitive models. In particular, incorporating triple confidence brings significant improvements to CKGC.
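The abstract does not specify how triple confidence enters the RGCN encoder. One plausible reading, sketched here purely as an assumption, is to scale each relational message by its triple's confidence, normalized over the incoming triples of the same relation. The function names, the dictionary-based data layout, and the requirement that confidences be positive are all hypothetical choices for this illustration.

```python
def matvec(weight, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def confidence_weighted_layer(features, triples, rel_weights, self_weight):
    """One RGCN-style layer with confidence-normalized messages.

    features:    dict entity -> feature vector
    triples:     list of (head, relation, tail, confidence), confidence > 0
    rel_weights: dict relation -> weight matrix
    self_weight: weight matrix for the self-loop term
    Messages flow from head to tail (an assumption of this sketch).
    """
    # Total confidence per (tail, relation), used as the normalizer.
    norm = {}
    for _, rel, tail, conf in triples:
        norm[(tail, rel)] = norm.get((tail, rel), 0.0) + conf

    # Start from the self-loop contribution of every entity.
    out = {ent: matvec(self_weight, vec) for ent, vec in features.items()}

    # Add each neighbor message, scaled by its share of the confidence mass.
    for head, rel, tail, conf in triples:
        message = matvec(rel_weights[rel], features[head])
        scale = conf / norm[(tail, rel)]
        out[tail] = [o + scale * m for o, m in zip(out[tail], message)]

    # ReLU non-linearity.
    return {ent: [max(0.0, v) for v in vec] for ent, vec in out.items()}
```

With uniform confidences this reduces to the usual per-relation mean aggregation, so the confidence term only changes how much each neighbor contributes.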
{"title":"Triple confidence-aware encoder–decoder model for commonsense knowledge graph completion","authors":"Hongzhi Chen, Fu Zhang, Qinghui Li, Xiang Li, Yifan Ding, Daqing Zhang, Jingwei Cheng, Xing Wang","doi":"10.1007/s13042-024-02378-y","DOIUrl":"https://doi.org/10.1007/s13042-024-02378-y","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-12","publicationTypes":"Journal Article"}
Recent years have witnessed the great success of graph neural networks (GNNs) in various graph data mining tasks. However, studies demonstrate that GNNs are vulnerable to imperceptible structural perturbations: carefully crafted perturbations of only a few edges can significantly degrade their performance. Many defense methods have been developed to eliminate the impact of adversarial edges, but existing approaches ignore the mutual corroboration between structures and attributes, which can be used for graph augmentation. This paper presents GAF, a novel graph augmentation framework that defends GNNs against structural poisoning attacks via structure and attribute reconciliation. GAF first constructs two auxiliary graphs, an attributive neighborhood graph and a structural neighborhood graph, to augment the original one. We propose a novel graph purification scheme that prunes irrelevant edges and assigns the remaining edges different weights based on both node attributes and graph structure. This significantly mitigates the inconsistency between structural and attributive data, reducing the impact of adversarial and noisy edges. A joint graph convolutional network (GCN) model is then developed to encode the three graphs for representation learning. Experimental results show that GAF outperforms state-of-the-art approaches against various adversarial attacks, with a particularly large margin for attacks with high perturbation rates. Source code is available at: https://github.com/shaoyf9/GAF.
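The purification idea, pruning edges and weighting the rest using both attributes and structure, can be illustrated with a small sketch. The scoring rule below (a convex combination of attribute cosine similarity and Jaccard overlap of closed neighborhoods) and the parameters `threshold` and `alpha` are assumptions chosen for the example; the actual GAF scheme is defined in the paper and its repository, not here.

```python
import math


def cosine(u, v):
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def purify(edges, attrs, threshold=0.5, alpha=0.5):
    """Score each undirected edge and keep those at or above the threshold.

    edges: list of (u, v) pairs; attrs: dict node -> attribute vector.
    Returns a dict mapping each kept edge to its weight.
    """
    # Closed neighborhoods: each node counts as its own neighbor.
    nbrs = {node: {node} for node in attrs}
    for u, v in edges:
        nbrs[u].add(v)
        nbrs[v].add(u)

    kept = {}
    for u, v in edges:
        jaccard = len(nbrs[u] & nbrs[v]) / len(nbrs[u] | nbrs[v])
        score = alpha * cosine(attrs[u], attrs[v]) + (1 - alpha) * jaccard
        if score >= threshold:
            kept[(u, v)] = score
    return kept
```

An edge between nodes with dissimilar attributes and little shared neighborhood scores low and is dropped, which is the intuition behind removing adversarial edges.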
{"title":"Graph augmentation against structural poisoning attacks via structure and attribute reconciliation","authors":"Yumeng Dai, Yifan Shao, Chenxu Wang, Xiaohong Guan","doi":"10.1007/s13042-024-02380-4","DOIUrl":"https://doi.org/10.1007/s13042-024-02380-4","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-11","publicationTypes":"Journal Article"}
Pub Date: 2024-09-11 DOI: 10.1007/s13042-024-02341-x
Tianli Li, Mohammad Faidzul Nasrudin, Dawei Zhao, Fei Chen, Xing Peng, Hafiz Mohd Sarim
Multi-label learning has emerged as a prominent research area in machine learning, as each instance can be associated with multiple class labels. However, many multi-label learning algorithms assume that the label space is complete, whereas in real-world applications, we often only have access to partial label information. To address this issue, we propose a novel Multi-label Weak-label learning algorithm via Low-rank Label correlations (MW2L). First, we propagate the structural and semantic information from the feature space to the label space to effectively capture label-related information and recover lost labels. Second, we incorporate global and local low-rank label correlation information to ensure that the label-related matrix is informative. Last, we use label correlations to supplement the original weak-label matrix and form a unified learning framework. We evaluate the performance of our approach on several benchmark datasets and show that it outperforms state-of-the-art methods in terms of accuracy and robustness to weak-label noise. The proposed approach can effectively handle incomplete and noisy weak labels in multi-label learning and outperforms existing methods.
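The last step, using label correlations to supplement a weak-label matrix, can be illustrated in a much simplified form. The sketch below estimates correlations from plain co-occurrence counts and omits the low-rank constraints and feature-space propagation that MW2L relies on; the function names and the max-based supplementation rule are assumptions made for this example.

```python
def label_correlation(labels):
    """Estimate C[j][k] = P(label k | label j) from a 0/1 label matrix."""
    num_labels = len(labels[0])
    counts = [sum(row[j] for row in labels) for j in range(num_labels)]
    corr = [[0.0] * num_labels for _ in range(num_labels)]
    for j in range(num_labels):
        if counts[j] == 0:
            continue  # label never observed: leave its row at zero
        for k in range(num_labels):
            both = sum(1 for row in labels if row[j] and row[k])
            corr[j][k] = both / counts[j]
    return corr


def supplement(labels, corr):
    """Fill in soft scores for unobserved labels from observed ones.

    The score for label k is the strongest evidence from any observed
    label j, i.e. max_j labels[i][j] * corr[j][k]. Observed labels keep
    a score of 1 because corr[k][k] == 1 whenever label k occurs.
    """
    num_labels = len(corr)
    return [
        [max(row[j] * corr[j][k] for j in range(num_labels))
         for k in range(num_labels)]
        for row in labels
    ]
```

For instance, if two of the three instances tagged with label 0 also carry label 1, an instance observed with only label 0 receives a soft score of 2/3 for label 1.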
{"title":"Uncovering hidden patterns: low-rank label correlations for multi-label weak-label learning","authors":"Tianli Li, Mohammad Faidzul Nasrudin, Dawei Zhao, Fei Chen, Xing Peng, Hafiz Mohd Sarim","doi":"10.1007/s13042-024-02341-x","DOIUrl":"https://doi.org/10.1007/s13042-024-02341-x","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-11","publicationTypes":"Journal Article"}
Pub Date: 2024-09-10 DOI: 10.1007/s13042-024-02328-8
Zheng-Ang Su, Juan Zhang, Zhijun Fang, Yongbin Gao
The fusion of side information in sequential recommendation (SR) is a recommendation system technique that combines a user’s historical behavior sequence with additional side information to provide more accurate personalized recommendations. Recent methods are based on self-attention mechanisms, incorporating side information as part of the attention matrix to update item representations. We believe that the integration method via self-attention mechanisms does not fully utilize side information. Therefore, we designed a new Enhanced Side Information Fusion framework (ESIF) for sequential recommendations. Specifically, we have altered the fusion strategy by using an attention matrix to simultaneously update the representations of items and side information, thereby increasing the use of side information. The attention matrix serves to balance various features, ensuring effective utilization of side information throughout the fusion process. We designed a Gated Linear Representation Fusion Module, comprising linear transformations and gated units. The linear transformation processes the input data, while the gated unit dynamically adjusts the degree of information flow based on the input. This module then combines the updated item representation with the side information representation for more efficient use of side information. Additionally, user interaction behavior data inevitably contains noise. The presence of noise can disrupt the model’s performance, affecting the accuracy and reliability of the results. Therefore, we introduced a denoising module in ESIF to enhance recommendation accuracy by reducing noise. Our experimental results demonstrate that ESIF achieves superior performance across five real-world datasets, surpassing the current state-of-the-art side information fusion SR models.
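A gated fusion of an item representation with a side-information representation can be sketched as below. The abstract states only that the module combines linear transformations with gated units, so the specific form here (a sigmoid gate computed from the concatenated inputs, followed by a per-dimension convex combination) is an assumption, and the names `gated_fusion`, `gate_weight`, and `gate_bias` are hypothetical.

```python
import math


def linear(weight, bias, vec):
    """Affine map: weight (list of rows) times vec, plus bias."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weight, bias)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def gated_fusion(item, side, gate_weight, gate_bias):
    """Blend item and side vectors with an input-dependent gate.

    gate_weight has shape d x 2d because the gate sees the concatenation
    of both inputs; each gate value lies in (0, 1).
    """
    gate = [sigmoid(z) for z in linear(gate_weight, gate_bias, item + side)]
    return [g * i + (1.0 - g) * s for g, i, s in zip(gate, item, side)]
```

With all-zero gate parameters every gate value is 0.5 and the output is the plain average of the two inputs; learned parameters would shift that balance per dimension and per input.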
{"title":"Enhanced side information fusion framework for sequential recommendation","authors":"Zheng-Ang Su, Juan Zhang, Zhijun Fang, Yongbin Gao","doi":"10.1007/s13042-024-02328-8","DOIUrl":"https://doi.org/10.1007/s13042-024-02328-8","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-10","publicationTypes":"Journal Article"}
Pub Date: 2024-09-09 DOI: 10.1007/s13042-024-02333-x
Hai-Long Yang, Xu Liu, Zhi-Lian Guo
With the development and progress of technology, information becomes increasingly diverse, which poses higher demands on decision-making methods. Probabilistic linguistic term set (PLTS) is a tool that can more intuitively express the evaluations of decision makers (DMs). As a specialized form of PLTS with ignored probabilities, weak probabilistic linguistic term set (WPLTS) can describe incomplete or inaccurate evaluation information. Three-way decision (3WD) is an efficient decision-making method that reduces decision cost by adopting delayed decisions on the boundary domain. In this paper, we propose a novel 3WD method by combining 3WD with the complex proportional assessment (COPRAS) method under the WPLTS environment, named the WPLTS-3WD method. Firstly, we introduce the notion of the WPLTS information system. For a WPLTS information system, we propose a method of complementing the ignored probabilities and a new score function. Secondly, the objects are ranked by the COPRAS method. According to the ranking result, we define the dominance relation and dominance sets. Based on the dominance sets, the conditional probabilities can be estimated. By combining the conditional probabilities with relative loss functions, the expected losses will be obtained and the objects can be classified. Moreover, we propose two conversion functions that can convert real-valued and linguistic term evaluation information into PLTS evaluation information. Finally, we use the proposed WPLTS-3WD method to analyze the air quality of four cities. The rationality and advantages of our method are verified through experimental comparisons with other methods and parameter analysis.
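The classification step, combining a conditional probability with loss functions to obtain expected losses, follows the standard three-way decision rule and can be sketched directly. The sketch uses plain (not relative) losses and hypothetical loss values; the WPLTS-specific score function, the COPRAS ranking, and the dominance-based probability estimate are outside its scope.

```python
def three_way_decision(prob, loss):
    """Pick the action with the smallest expected loss.

    prob: conditional probability that the object belongs to the concept.
    loss: dict with keys "PP", "BP", "NP" (losses of accepting, deferring,
          rejecting when the object belongs) and "PN", "BN", "NN" (the
          same three actions when it does not belong).
    Returns (action, risks).
    """
    risks = {
        "accept": loss["PP"] * prob + loss["PN"] * (1.0 - prob),
        "defer": loss["BP"] * prob + loss["BN"] * (1.0 - prob),
        "reject": loss["NP"] * prob + loss["NN"] * (1.0 - prob),
    }
    return min(risks, key=risks.get), risks
```

With losses such as PP=0, PN=10, BP=2, BN=3, NP=8, NN=0, a high probability leads to acceptance, a low one to rejection, and intermediate values fall into the boundary region where the decision is deferred.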
{"title":"A three-way decision method based on COPRAS in the weak probabilistic linguistic term set information systems","authors":"Hai-Long Yang, Xu Liu, Zhi-Lian Guo","doi":"10.1007/s13042-024-02333-x","DOIUrl":"https://doi.org/10.1007/s13042-024-02333-x","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-09","publicationTypes":"Journal Article"}
Pub Date: 2024-09-08 DOI: 10.1007/s13042-024-02346-6
Yiqiu Sun, Dongming Zhou, Kaixiang Yan
The introduction of the one-stream one-stage framework has led to remarkable advances in visual object tracking. Most existing one-stream one-stage tracking pipelines achieve a reasonable balance between accuracy and speed. However, they focus solely on integrating feature learning and relational modelling. In complex scenes, tracking performance often falls short due to confounding factors such as changes in target scale, occlusion, and fast motion. In these cases, many trackers cannot sufficiently exploit the target feature information and suffer from information loss. To address these challenges, we propose a screening enrichment method for transformer-based tracking. Our method incorporates a screening enrichment module as an additional processing step in the integration of feature learning and relational modelling. The module effectively distinguishes target areas within the search region and enriches the associations between tokens carrying target area information. In addition, we introduce a box validation module, which uses the target position from the previous frame to validate and revise the target position in the current frame, enabling more accurate target localization. The resulting tracker is both powerful and efficient. It achieves state-of-the-art performance on six benchmark datasets: GOT-10K, LaSOT, TrackingNet, UAV123, TNL2K and VOT2020. Specifically, on the GOT-10K benchmark, our proposed tracker reaches a Success Rate ($SR_{0.5}$) of 85.4 and an Average Overlap (AO) of 75.3. Experimental results show that our proposed tracker outperforms other state-of-the-art trackers in terms of tracking accuracy.
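One simple way to validate a current-frame box against the previous frame is an overlap check. The sketch below computes intersection over union (IoU) and falls back to the previous box when the overlap is too small. The fallback rule and the `min_iou` threshold are hypothetical; the abstract does not describe how the box validation module actually revises the position.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def validate_box(prev_box, cur_box, min_iou=0.3):
    """Accept the current box if it overlaps the previous one enough.

    Otherwise keep the previous box, on the assumption that a target
    rarely jumps far between consecutive frames.
    """
    return cur_box if iou(prev_box, cur_box) >= min_iou else prev_box
```

A check of this kind guards against sudden jumps to distractors, at the cost of lagging behind genuinely fast motion when the threshold is set too high.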
{"title":"Visual tracking with screening region enrichment and target validation","authors":"Yiqiu Sun, Dongming Zhou, Kaixiang Yan","doi":"10.1007/s13042-024-02346-6","DOIUrl":"https://doi.org/10.1007/s13042-024-02346-6","journal":{"name":"International Journal of Machine Learning and Cybernetics"},"publicationDate":"2024-09-08","publicationTypes":"Journal Article"}
Pub Date: 2024-09-06 DOI: 10.1007/s13042-024-02372-4
Xinyu Liang, Guannan Si, Jianxin Li, Zhaoliang An, Pengxin Tian, Fengyu Zhou, Xiaoliang Wang
Inductive link prediction (ILP) predicts missing triplets involving unseen entities in knowledge graphs (KGs). Existing ILP research mainly addresses seen-unseen entities in the original KG (semi-inductive link prediction) and unseen-unseen entities in emerging KGs (fully-inductive link prediction). Bridging-inductive link prediction, which focuses on unseen entities that carry evolutionary information from the original KG to the emerging KG, has not been extensively studied so far. This study introduces a novel model called GSELI (integrating global semantics and enhanced local subgraph for inductive link prediction), which comprises three components. (1) The contrastive learning-based global semantic features (CLSF) module extracts relation-specific semantic features between the original and emerging KGs and employs semantic-aware contrastive learning to optimize these features. (2) The GNN-based enhanced local subgraph (GELS) module employs personalized PageRank (PPR)-based local clustering to sample tightly-related subgraphs and incorporates complete neighboring relations to enhance the topological information of subgraphs. (3) Joint contrastive learning and supervised learning training. Experimental results on various benchmark datasets demonstrate that GSELI outperforms the baseline models in both fully-inductive and bridging-inductive link predictions.
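Personalized PageRank (PPR), which the GELS module uses for local clustering, can be sketched with plain power iteration. Selecting the top-k nodes by PPR score is used here as a simple stand-in for the subgraph sampling step; the restart probability `alpha`, the iteration count, and the top-k rule are assumptions for illustration rather than the settings used in GSELI.

```python
def personalized_pagerank(adj, seed, alpha=0.15, iters=100):
    """Power-iteration PPR with restart probability alpha at the seed.

    adj: dict node -> list of out-neighbors (every node must be a key).
    Returns a dict node -> score; scores sum to 1.
    """
    rank = {node: 1.0 if node == seed else 0.0 for node in adj}
    for _ in range(iters):
        nxt = {node: (alpha if node == seed else 0.0) for node in adj}
        for node, out in adj.items():
            if not out:
                # Dangling node: send its mass back to the seed.
                nxt[seed] += (1.0 - alpha) * rank[node]
                continue
            share = (1.0 - alpha) * rank[node] / len(out)
            for nbr in out:
                nxt[nbr] += share
        rank = nxt
    return rank


def local_subgraph(adj, seed, k):
    """Return the k nodes with the highest PPR score for the seed."""
    rank = personalized_pagerank(adj, seed)
    return sorted(rank, key=rank.get, reverse=True)[:k]
```

Nodes that are well connected to the seed accumulate more probability mass, so the top-scoring set tends to form a tightly related neighborhood around the target entity.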
{"title":"Integrating global semantics and enhanced local subgraph for inductive link prediction","authors":"Xinyu Liang, Guannan Si, Jianxin Li, Zhaoliang An, Pengxin Tian, Fengyu Zhou, Xiaoliang Wang","doi":"10.1007/s13042-024-02372-4","DOIUrl":"https://doi.org/10.1007/s13042-024-02372-4","url":null,"PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"54 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142209077","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-06 DOI: 10.1007/s13042-024-02371-5
Yajun Liu, Kefeng Fan, Wenju Zhou
Convolutional neural networks (CNNs) have been successfully applied to various computer vision tasks. However, these achievements come with high memory and computation costs, which hinder the deployment of CNNs on resource-constrained mobile devices. Filter pruning is an effective way to address these problems. In this paper, we propose an iterative filter pruning method that combines feature map properties and knowledge distillation. The method calculates the information capacity and feature relevance of each feature map and prunes according to the set criteria, so that the important feature information (e.g., spatial features) in the feature maps is preserved as far as possible. The pruned network then learns the complete feature information of the standard CNN architecture in order to quickly and completely recover the lost accuracy before the next pruning operation. Alternating pruning and knowledge distillation in this way compresses the network effectively and comprehensively. Experiments on image classification datasets with mainstream CNN architectures indicate the effectiveness of our approach. For example, on CIFAR-10, our method reduces floating point operations (FLOPs) by 71.8% and parameters by 71.0% while improving accuracy by 0.24% over the ResNet-110 benchmark. On ImageNet, our method achieves a 55.6% reduction in FLOPs and a 52.5% reduction in model memory at the cost of losing only 0.17% Top-5 accuracy on ResNet-50.
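The two alternating steps can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the authors' implementation: `select_filters` keeps the filters with the highest importance scores (the scores are given as plain numbers here, since the abstract does not specify how information capacity and feature relevance are combined), and `distillation_loss` is the commonly used temperature-softened KL divergence between teacher and student outputs, which may differ from the loss used in the paper. All names, the prune ratio, and the temperature are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax with a temperature."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=4.0):
    """KL(teacher || student) on softened outputs, scaled by temperature squared."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return kl * temperature ** 2


def select_filters(scores, prune_ratio):
    """Return the indices of the filters to keep, dropping the lowest-scoring ones."""
    n_prune = int(len(scores) * prune_ratio)
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    pruned = set(order[:n_prune])
    return [i for i in range(len(scores)) if i not in pruned]


# Hypothetical importance scores for four filters in one layer.
kept = select_filters([0.9, 0.1, 0.5, 0.3], prune_ratio=0.5)
print(kept)

# Distillation signal between an unpruned teacher and a pruned student.
print(distillation_loss([1.0, 0.2, -0.5], [2.0, 0.1, -1.0]))
```

In an iterative scheme, one round would call `select_filters` on each layer, rebuild the smaller network, and then train it against the teacher with `distillation_loss` before the next pruning round.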
{"title":"Iterative filter pruning with combined feature maps and knowledge distillation","authors":"Yajun Liu, Kefeng Fan, Wenju Zhou","doi":"10.1007/s13042-024-02371-5","DOIUrl":"https://doi.org/10.1007/s13042-024-02371-5","url":null,"PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"73 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-09-06","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142209079","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Pub Date : 2024-09-05 DOI: 10.1007/s13042-024-02373-3
Huan-Yu Ke, Yang-Jie Chen, Ming Li, Jian-Ning Li
For nonlinear multilateral teleoperation systems, unreliable communication channels and actuator constraints are the main obstacles to achieving stability and the required performance. In this paper, a novel fault-tolerant control algorithm is proposed for a class of multi-degree-of-freedom nonlinear multilateral teleoperation systems subject to these problems and to unknown environmental forces. The unreliable communication channels are described by time-varying delays and packet dropouts, and the systems under consideration are modeled as T-S fuzzy systems with multiple time-varying delays. For the actuator constraints, both actuator failures and unknown control directions are considered; a novel fault-tolerant control scheme is designed so that the failures and control directions can be estimated simultaneously. Next, a radial basis function neural network (RBFNN) is introduced to estimate the unknown environmental force. The estimates are incorporated into the controller design, and the mean-square stability of the closed-loop system with a disturbance attenuation level is guaranteed. Finally, a numerical simulation example is given to show the effectiveness of the proposed method.
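To make the RBFNN estimation step concrete, the sketch below approximates an unknown scalar function with Gaussian radial basis features and a normalized gradient-type weight update. It only illustrates the function approximator: the class name `RBFNetwork`, the centers, the width, the learning rate, and the stand-in force `2 * sin(x)` are assumptions for the example, and the update rule is a plain normalized LMS step rather than the adaptive law derived in the paper.

```python
import math


class RBFNetwork:
    """Single-input RBF network: output = sum_i w_i * exp(-(x - c_i)^2 / (2 * width^2))."""

    def __init__(self, centers, width):
        self.centers = list(centers)
        self.width = width
        self.weights = [0.0] * len(self.centers)

    def features(self, x):
        """Gaussian basis functions evaluated at x."""
        return [math.exp(-((x - c) ** 2) / (2 * self.width ** 2)) for c in self.centers]

    def predict(self, x):
        return sum(w * p for w, p in zip(self.weights, self.features(x)))

    def update(self, x, target, lr=0.5):
        """One normalized LMS step; returns the estimation error before the update."""
        phi = self.features(x)
        error = target - sum(w * p for w, p in zip(self.weights, phi))
        norm = sum(p * p for p in phi) + 1e-12  # guards against division by zero
        self.weights = [w + lr * error * p / norm for w, p in zip(self.weights, phi)]
        return error


def unknown_force(x):
    """Hypothetical stand-in for the unknown environmental force."""
    return 2.0 * math.sin(x)


net = RBFNetwork(centers=[-1.0, -0.5, 0.0, 0.5, 1.0], width=0.5)
samples = [i / 10 - 1.0 for i in range(21)]  # grid on [-1, 1]
for _ in range(200):
    for x in samples:
        net.update(x, unknown_force(x))
print(net.predict(0.3), unknown_force(0.3))
```

In a controller, the network output would stand in for the force term that cannot be measured directly; here the weights are simply fitted offline to show how the estimate is formed.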
{"title":"Fault-tolerant control design for nonlinear multilateral teleoperation system with unreliable communication channels and actuator constraints","authors":"Huan-Yu Ke, Yang-Jie Chen, Ming Li, Jian-Ning Li","doi":"10.1007/s13042-024-02373-3","DOIUrl":"https://doi.org/10.1007/s13042-024-02373-3","url":null,"PeriodicalId":51327,"journal":{"name":"International Journal of Machine Learning and Cybernetics","volume":"43 1","pages":""},"PeriodicalIF":5.6,"publicationDate":"2024-09-05","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"142209084","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}