CAAI Transactions on Intelligence Technology: Latest Articles (Page 2)

A UAV Air Combat Trajectory Prediction Method Based on QCNet

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-10-11 DOI: 10.1049/cit2.70068

Zhang Jiahui, Meng Zhijun, He Jiazheng

Unmanned aerial vehicle (UAV) air combat trajectory prediction supports strategic pre-planning by predicting UAV flight trajectories with high accuracy, which helps mitigate risk and secure an advantage in complex aerial scenarios. Existing datasets are often limited in scale and scenario diversity; this study addresses that limitation with a UAV air combat trajectory prediction method based on QCNet. First, a UAV air combat dynamics model is developed to synthesise air combat trajectories, which form the basis of a trajectory prediction dataset. Next, a trajectory prediction framework built on QCNet is designed and trained. A parameter impact analysis then assesses how key algorithm parameters affect efficiency. It indicates that increasing the number of encoder layers and the decoder's recurrent steps generally improves performance, although too many recurrent steps may reduce efficiency. Finally, the proposed algorithm is compared with traditional time-series prediction algorithms and shows better performance. The effectiveness experiment indicates that the algorithm can predict UAV flight trajectories under different manoeuvres and provide the corresponding probabilities.
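To make the trajectory-synthesis step concrete, the following is a minimal sketch of how a dynamics model can generate trajectories from control sequences. It assumes a three-degree-of-freedom point-mass model driven by tangential overload `nx`, normal overload `nz` and roll angle `mu`, integrated with explicit Euler steps. The abstract does not specify the paper's actual dynamics model, so the equations, the `State` class and the functions `step` and `simulate` are all illustrative assumptions.

```python
import math
from dataclasses import dataclass

G = 9.81  # gravitational acceleration, m/s^2


@dataclass
class State:
    x: float      # north position, m
    y: float      # east position, m
    z: float      # altitude, m
    v: float      # airspeed, m/s
    gamma: float  # flight-path angle, rad
    psi: float    # heading angle, rad


def step(s, nx, nz, mu, dt):
    """Advance the state by one explicit Euler step of dt seconds.

    Assumed point-mass model (not taken from the paper):
      nx = tangential overload, nz = normal overload, mu = roll angle.
    """
    dv = G * (nx - math.sin(s.gamma))
    dgamma = G / s.v * (nz * math.cos(mu) - math.cos(s.gamma))
    dpsi = G * nz * math.sin(mu) / (s.v * math.cos(s.gamma))
    return State(
        x=s.x + s.v * math.cos(s.gamma) * math.cos(s.psi) * dt,
        y=s.y + s.v * math.cos(s.gamma) * math.sin(s.psi) * dt,
        z=s.z + s.v * math.sin(s.gamma) * dt,
        v=s.v + dv * dt,
        gamma=s.gamma + dgamma * dt,
        psi=s.psi + dpsi * dt,
    )


def simulate(s0, controls, dt=0.1):
    """Roll out a trajectory from s0 under a list of (nx, nz, mu) commands."""
    traj = [s0]
    for nx, nz, mu in controls:
        traj.append(step(traj[-1], nx, nz, mu, dt))
    return traj


start = State(x=0.0, y=0.0, z=1000.0, v=200.0, gamma=0.0, psi=0.0)
# Straight and level flight: nz = 1 balances gravity, no roll.
level = simulate(start, [(0.0, 1.0, 0.0)] * 100)
# Level turn: nz * cos(mu) = 1 holds altitude while the heading changes.
turn = simulate(start, [(0.0, 2.0, math.pi / 3)] * 100)
```

Varying the control sequences (climbs, dives, turns of different load) is one plausible way to produce a dataset with diverse manoeuvres; a production model would likely add actuator limits and a higher-order integrator.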

Vol. 10, No. 6, pp. 1661-1674 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70068

Citations: 0

Multi-View Seizure Classification Based on Attention-Based Adaptive Graph ProbSparse Hybrid Network

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-10-08 DOI: 10.1049/cit2.70059

Changxu Dong, Yanqing Liu, Dengdi Sun

Epilepsy is a neurological disorder characterised by recurrent seizures due to abnormal neuronal discharges. Seizure detection via EEG signals has progressed, but two main challenges are still encountered. First, EEG data can be distorted by physiological factors and external variables, resulting in noisy brain networks. Static adjacency matrices are typically used in current mainstream methods, which neglect the need for dynamic updates and feature refinement. The second challenge stems from the strong reliance on long-range dependencies through self-attention in current methods, which can introduce redundant noise and increase computational complexity, especially in long-duration data. To address these challenges, the Attention-based Adaptive Graph ProbSparse Hybrid Network (AA-GPHN) is proposed. Brain network structures are dynamically optimised using variational inference and the information bottleneck principle, refining the adjacency matrix for improved epilepsy classification. A Linear Graph Convolutional Network (LGCN) is incorporated to focus on first-order neighbours, minimising the aggregation of distant information. Furthermore, a ProbSparse attention-based Informer (PAT) is introduced to adaptively filter long-range dependencies, enhancing efficiency. A joint optimisation loss function is applied to improve robustness in noisy environments. Experimental results on both patient-specific and cross-subject datasets demonstrate that AA-GPHN outperforms existing methods in seizure detection, showing superior effectiveness and generalisation.
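The idea of filtering long-range dependencies with ProbSparse attention can be sketched as follows. This is a simplified illustration based on the ProbSparse formulation from the Informer model: each query gets a sparsity score (maximum minus mean of its scaled dot products with the keys), only the top `u` queries receive full softmax attention, and the remaining queries fall back to the mean of the values. The function names are hypothetical, key sampling is omitted for brevity, and nothing here is taken from the AA-GPHN implementation.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparsity_scores(Q, K):
    """Max-minus-mean of the scaled dot products, one score per query."""
    d = len(Q[0])
    scores = []
    for q in Q:
        s = [dot(q, k) / math.sqrt(d) for k in K]
        scores.append(max(s) - sum(s) / len(s))
    return scores


def probsparse_attention(Q, K, V, u):
    """Full attention for the u most 'active' queries, mean(V) for the rest."""
    d = len(Q[0])
    scores = sparsity_scores(Q, K)
    ranked = sorted(range(len(Q)), key=lambda i: scores[i], reverse=True)
    active = set(ranked[:u])
    mean_v = [sum(col) / len(V) for col in zip(*V)]
    out = []
    for i, q in enumerate(Q):
        if i in active:
            w = softmax([dot(q, k) / math.sqrt(d) for k in K])
            out.append([sum(wj * v[c] for wj, v in zip(w, V))
                        for c in range(len(V[0]))])
        else:
            out.append(list(mean_v))
    return out


Q = [[0.0, 0.0], [5.0, 0.0]]   # the second query is far more selective
K = [[1.0, 0.0], [-1.0, 0.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
out = probsparse_attention(Q, K, V, u=1)
```

In this toy input the first query attends uniformly, so it is treated as uninformative and receives the value mean, while the second query keeps its sharply peaked attention. Restricting full attention to a few queries is what reduces the cost on long sequences.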

Vol. 10, No. 6, pp. 1783-1798 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70059

Citations: 0

PointGeo: Geometry Transformer for Point Cloud Analysis

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-10-08 DOI: 10.1049/cit2.70062

Li An, Pengbo Zhou, Mingquan Zhou, Yong Wang, Guohua Geng, Yangyang Liu

Point cloud processing plays a crucial role in tasks such as point cloud classification, part segmentation and semantic segmentation. However, existing processing frameworks face several challenges: recognising features in irregular and complex spatial structures, large attention parameter volumes, and limited generalisation across different scenes. To address these concerns, we propose a geometry transformer method for point cloud analysis, PointGeo. The method uses a geometry transformation network to process point cloud data, capturing both local and global features and improving the modelling of irregular structures. We test the method extensively on multiple datasets: ModelNet and ScanObjectNN for point cloud classification, ShapeNet for point cloud part segmentation, and S3DIS and SemanticKITTI for point cloud semantic segmentation. Experimental results show that our approach delivers outstanding performance across all tasks, validating its effectiveness and generalisation capability in handling point cloud data.

Vol. 10, No. 6, pp. 1880-1892 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70062

Citations: 0

LA-YOLO: Location Refinement and Adjacent Feature Fusion-Based Infrared Small Target Detection

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-10-07 DOI: 10.1049/cit2.70070

Shijie Liu, Chenqi Luo, Kang Yan, Feiwei Qin, Ruiquan Ge, Yong Peng, Jie Huang, Nenggan Zheng, Yongquan Zhang, Changmiao Wang

In the field of infrared small target detection (ISTD), the ability to detect targets in dim environments is critical, as it improves the performance of target recognition in nighttime and harsh weather conditions. The blurry contour, small size and sparse distribution of infrared small targets increase the difficulty of identifying such targets in cluttered backgrounds. Existing methodologies fall short of satisfying the requisites for the detection and categorisation of infrared small targets. To address these challenges and to enhance the precision of small object detection and classification, this paper introduces an innovative approach called location refinement and adjacent feature fusion YOLO (LA-YOLO), which enhances feature extraction by integrating a multi-head self-attention mechanism (MSA). We have improved the feature fusion method to merge adjacent features, to enhance information utilisation in the path aggregation network (PAN). Lastly, we introduce supervision on the target centre points in the detection network. Empirical results on publicly available datasets demonstrate that LA-YOLO achieves an impressive average precision (AP) of 92.46% on IST-A and a mean average precision (mAP) of 84.82% on FLIR. The results surpass those of contemporary state-of-the-art detectors, striking a balance between precision and speed. LA-YOLO emerges as a viable and efficacious solution for ISTD, making a substantial contribution to the progression of infrared imagery analysis. The code is available at https://github.com/liusjo/LA-YOLO.

Vol. 10, No. 6, pp. 1893-1903 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70070

Citations: 0

DNA Encoding Optimisation Based on Thermodynamics

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-09-13 DOI: 10.1049/cit2.70055

Xianhang Luo, Kai Zhang, Enqiang Zhu, Jin Xu

Owing to their exceptional programmability, DNA molecules are widely used to design molecular circuits for applications such as DNA computing, DNA storage, and cancer diagnosis and treatment. The quality of the DNA sequences directly determines the reliability of these circuits. However, existing DNA encoding algorithms have limitations, such as reliance on Hamming distance and conflicts among multiple objectives, which leave the generated sequences insufficiently stable. To address these issues, this paper proposes a thermodynamics-based multi-objective evolutionary optimisation algorithm (TEMOA). Its core innovations are as follows. First, a thermodynamics-based DNA encoding modelling strategy (TDEMS) is introduced, which simplifies the encoding process and significantly improves sequence quality by incorporating thermodynamic stability constraints. Second, two diversity optimisation strategies, the diversity assessment strategy (DAS) and the front equalisation nondominated sorting (FENS) strategy, are designed to enhance the algorithm's global search capability. Finally, a flexible fitness function design is incorporated to accommodate diverse user requirements. Experimental results demonstrate that TEMOA is more effective than state-of-the-art methods on challenging multi-objective optimisation problems, and that the DNA sequences it generates are more reliable than those produced by traditional DNA encoding algorithms.
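For readers unfamiliar with nondominated sorting, the sketch below shows the plain Pareto version on which strategies such as FENS presumably build. It is not the paper's FENS strategy, whose front-equalisation details the abstract does not describe. The sketch assumes every objective is minimised, and the function names and sample objective vectors are illustrative only.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on one.

    All objectives are assumed to be minimised.
    """
    return (all(x <= y for x, y in zip(a, b))
            and any(x < y for x, y in zip(a, b)))


def nondominated_sort(points):
    """Split candidate indices into successive Pareto fronts."""
    remaining = list(range(len(points)))
    fronts = []
    while remaining:
        front = [
            i for i in remaining
            if not any(dominates(points[j], points[i])
                       for j in remaining if j != i)
        ]
        fronts.append(front)
        remaining = [i for i in remaining if i not in front]
    return fronts


# Hypothetical two-objective scores for five candidate sequences.
candidates = [(1, 5), (2, 2), (5, 1), (3, 3), (6, 6)]
fronts = nondominated_sort(candidates)
```

Here the first front holds the three mutually non-dominated candidates, `(3, 3)` forms the second front because `(2, 2)` dominates it, and `(6, 6)` is last. This quadratic-time loop favours clarity over speed; evolutionary algorithms usually use a faster bookkeeping variant.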

Vol. 10, No. 6, pp. 1829-1843 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70055

Citations: 0

A Keypoint-Guided Feature Partition Network for Occluded Person Re-Identification

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-09-09 DOI: 10.1049/cit2.70057

Die Dai, Xu Zhang, Zhiguang Wu, Hongying Meng, Zuyu Zhang

Existing occluded person re-identification methods employ hard or soft partition strategies to explore fine-grained information. However, the hard partition strategy, which extracts region-level features, may impair the semantic connectivity of correlated human body parts. A pose-guided soft partition establishes correlations among human keypoints, but the pixel-level embeddings it generates may lose the surrounding semantic information. In this paper, we propose a keypoint-guided feature partition (KGFP) method that consists of a feature extractor, a hard partition branch, and a soft partition branch. Specifically, we adopt a vision transformer and a pose estimator to extract features and keypoint information. In the hard partition branch, we partition features into distinct groups and classify them into nonoccluded, semi-occluded, and occluded features to obtain region-level features and filter out occlusions. Furthermore, we design a dissimilarity loss to reduce the similarity between semi-occluded and occluded features. In the soft partition branch, we introduce a graph attention network and treat global and keypoint embeddings as nodes of a graph to discover their interrelationships. Additionally, we formulate image alignment as a graph matching problem and propose a feature alignment-based graph to reduce position misalignment. Extensive experiments demonstrate that the proposed method outperforms state-of-the-art methods on Occluded-DukeMTMC, Market1501, and DukeMTMC-reID.
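One plausible shape for a loss that pushes semi-occluded features away from occluded ones is a penalty on their positive cosine similarity. The sketch below is purely an assumption for illustration: the abstract does not give the form of the KGFP dissimilarity loss, and `dissimilarity_loss` and its hinge-at-zero choice are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def dissimilarity_loss(semi_occluded, occluded):
    """Mean positive cosine similarity over all cross-group feature pairs.

    Hypothetical formulation: the loss is zero once every semi-occluded
    feature is orthogonal to (or points away from) every occluded feature.
    """
    pairs = [(s, o) for s in semi_occluded for o in occluded]
    return sum(max(0.0, cosine(s, o)) for s, o in pairs) / len(pairs)


identical = dissimilarity_loss([[1.0, 0.0]], [[1.0, 0.0]])
orthogonal = dissimilarity_loss([[1.0, 0.0]], [[0.0, 1.0]])
opposite = dissimilarity_loss([[1.0, 0.0]], [[-1.0, 0.0]])
```

Minimising such a term alongside the identification loss would discourage semi-occluded features from absorbing occluder appearance; the actual method may weight or normalise the term differently.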

Vol. 10, No. 5, pp. 1535-1547 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70057

Citations: 0

A Paradigm of Temporal-Weather-Aware Transition Pattern for POI Recommendation

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-09-06 DOI: 10.1049/cit2.70054

Junyang Chen, Jingcai Guo, Huan Wang, Zhihui Lai, Qin Zhang, Kaishun Wu, Liang-Jie Zhang

Point of interest (POI) recommendation analyses user preferences through historical check-in data. However, existing POI recommendation methods often overlook the influence of weather information and face the challenge of sparse historical data for individual users. To address these issues, this paper proposes a new paradigm, namely temporal-weather-aware transition pattern for POI recommendation (TWTransNet). This paradigm is designed to capture user transition patterns under different times and weather conditions. Additionally, we introduce the construction of a user-POI interaction graph to alleviate the problem of sparse historical data for individual users. Furthermore, when predicting user interests by aggregating graph information, some POIs may not be suitable for visitation under current weather conditions. To account for this, we propose an attention mechanism to filter POI neighbours when aggregating information from the graph, considering the impact of weather and time. Empirical results on two real-world datasets demonstrate the superior performance of our proposed method, showing a substantial improvement of 6.91%–23.31% in terms of prediction accuracy.

Vol. 10, No. 6, pp. 1675-1687 | Journal Article | PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70054

Citations: 0

Arbitrary-Scale Point Cloud Upsampling via Enhanced Geometric Spatial Consistency

IF 7.3 | Zone 2, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-08-28 DOI: 10.1049/cit2.70052

Xianjing Cheng, Lintai Wu, Junhui Hou, Zhijun Hu, Jie Wen, Yong Xu

Point cloud upsampling is an essential yet challenging task in various 3D computer vision and graphics applications. Existing methods often suffer from limitations such as the generation of outliers or shrinkage artifacts. Additionally, these methods usually ignore the overall spatial structure of point clouds, leading to suboptimal results. To tackle these challenges, we propose a novel framework that enhances geometric spatial consistency in upsampled point clouds through a dual-supervision mechanism and enables the generation of high-fidelity results with precise geometric structures. Specifically, we first design a tailored feature extractor that iteratively extracts comprehensive and distinctive features by integrating both fine-grained local geometric details and global structure information. Then, our network predicts the point-to-point distances and Chamfer distances of upsampled points to accurately capture the spatial relations among them. To enhance spatial consistency, we formulate a joint loss function that enables our model to perceive the spatial relations between points through both indirect and direct supervision. This ensures precise alignment between upsampled points and ground truth during training. Furthermore, we propose a coordinate reconstruction step that iteratively generates higher-quality upsampled points. We conduct extensive experiments across multiple benchmark datasets and downstream tasks. The results demonstrate that our method achieves state-of-the-art performance and exhibits superior generalisation capabilities.
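
The abstract names the Chamfer distance as one of the quantities the network predicts. As a reading aid, the following is a minimal sketch of one common definition of that distance (the sum of the mean nearest-neighbour distances in both directions). The function name, the sample points and the choice of unsquared Euclidean distances are assumptions made for illustration; the abstract does not state which variant the authors use.

```python
import math


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between two non-empty point sets.

    Assumed variant: mean nearest-neighbour Euclidean distance from a to b
    plus the same quantity from b to a (some papers use squared distances).
    """
    def one_way(src, dst):
        # For each source point, find the closest destination point.
        return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)

    return one_way(a, b) + one_way(b, a)


# Hypothetical toy data: three upsampled points against two reference points.
upsampled = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.1, 0.0)]
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(chamfer_distance(upsampled, ground_truth))
```

This brute-force version costs O(|a| x |b|) distance evaluations, which is acceptable for a toy example but not for large point clouds, where a spatial index would normally be used.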

Volume 10, Issue 5, Pages 1291-1305 PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70052

Citations: 0

V-UNet: Medical Image Segmentation Based on Variational Attention Mechanism

IF 7.3 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-08-28 DOI: 10.1049/cit2.70053

Yang Zhang, Qiang Yang, Tian Li, Fanghong Zhang, Yu Ren, Yinhao Li, Chuanyun Xu

Accurate medical image segmentation plays a crucial role in improving the precision of computer-aided diagnosis. However, complex boundary shapes, low contrast and blurred anatomical structures make fine-grained segmentation a challenging task. Variational Bayesian inference quantifies uncertainty through probability distributions and can construct robust probabilistic models for the boundaries of ambiguous organs and tissues. In this paper, we apply variational Bayesian inference to medical image segmentation and propose variational attention to model the uncertainty of low-contrast and blurry tissue and organ boundaries. This enhances the model's ability to perceive segmentation boundaries, improving robustness and segmentation accuracy. Variational attention first estimates the parameters of the probability distribution of latent representations based on input features. Then, it samples latent representations from the learnt distribution to generate attention weights that optimise the interaction between global features and ambiguous boundaries. We integrate variational attention into the U-Net model by replacing its skip connections, constructing a multi-scale variational attention segmentation model (V-UNet). Experiments on the ISBI 2012 and MoNuSeg 2018 datasets show that our method achieves Dice scores of 95.89% and 82.18%, respectively. Moreover, we integrate V-UNet into the Mask R-CNN framework by replacing the FPN feature extraction head and propose a two-stage segmentation method. Compared to the original Mask R-CNN, our method improves the Dice score by 0.81%, mAP by 8.06% and F1 score by 0.51%.
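
The results above are reported as Dice scores. For readers unfamiliar with the metric, here is a minimal sketch of the standard Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks. The function name, the flat 0/1 list representation and the empty-mask convention are assumptions for illustration and are not taken from the paper.

```python
def dice_score(pred, target):
    """Dice coefficient for two equally sized binary masks (flat 0/1 sequences)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if total == 0 else 2.0 * intersection / total


# Hypothetical 4-pixel masks: one shared foreground pixel out of three in total.
print(dice_score([1, 1, 0, 0], [1, 0, 0, 0]))
```

The score ranges from 0 (no overlap) to 1 (identical masks), so the percentages quoted in the abstract correspond to this value multiplied by 100.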

Volume 10, Issue 5, Pages 1350-1362 PDF: https://ietresearch.onlinelibrary.wiley.com/doi/epdf/10.1049/cit2.70053

Citations: 0

Density Peaks Clustering Based on Weighted Density Estimating and Multicluster Merging

IF 7.3 Zone 2 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE

CAAI Transactions on Intelligence Technology

Pub Date : 2025-08-28 DOI: 10.1049/cit2.70050

Xinran Zhou, Qinghua Zhang, Chengying Wu, Qin Xie, Guoyin Wang

Density peaks clustering (DPC) is a density-based clustering algorithm that identifies cluster centres by constructing a decision graph and then allocates the remaining noncluster-centre points to clusters. Although DPC does not require the number of clusters to be specified in advance, its local density is affected by the distribution of the data points. Meanwhile, the allocation of noncluster-centre points is prone to chains of consecutive assignment errors. Hence, a novel DPC algorithm based on weighted density estimating and multicluster merging (DPC-WDMM) is proposed. Firstly, a novel local density is defined using the nearest-neighbour relationship. Then, to avoid incorrect selection of cluster centres, all data points with relatively high local density within a local range are marked. Finally, these marked points are used to represent microclusters, which are merged to obtain the final clustering result. Experimental results on several datasets demonstrate the performance of the algorithm.
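
To make the decision-graph idea concrete, the sketch below computes the two quantities used by the classic DPC baseline: a cut-off local density `rho` and the distance `delta` to the nearest point of higher density. It illustrates the baseline only, not the weighted density or the microcluster merging of DPC-WDMM, whose details the abstract does not give. The function name, the cut-off `d_c` and the toy points are assumptions.

```python
import math


def dpc_decision_values(points, d_c):
    """Return (rho, delta) for classic density peaks clustering."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    # Local density: number of other points closer than the cut-off d_c.
    rho = [sum(1 for j in range(n) if j != i and dist[i][j] < d_c)
           for i in range(n)]
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        # A point with no denser neighbour gets its largest distance instead.
        delta.append(min(higher) if higher else max(dist[i]))
    return rho, delta


# Hypothetical data: a small group near (0, 0) and a denser group near (5, 5).
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
          (5.0, 5.0), (5.1, 5.0), (5.0, 5.1), (4.9, 5.0), (5.0, 4.9)]
rho, delta = dpc_decision_values(points, d_c=0.12)
# Points with both high rho and high delta stand out in the decision graph.
gamma = [r * d for r, d in zip(rho, delta)]
centres = sorted(range(len(points)), key=lambda i: gamma[i], reverse=True)[:2]
print(centres)
```

Because `rho` here depends on a single global cut-off, groups of different density are hard to treat fairly, which is the sensitivity to the data distribution that the abstract points to.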

Citations: 0