
Applied Intelligence: Latest Articles

SFM_MF: A streamflow forecasting model based on model fusion for small-sample data in small and medium-sized rivers
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-26 DOI: 10.1007/s10489-025-07074-0
Suhe Zhang, Yufeng Yu, Qingqing Chen, Qun Zhao, Ziyang Chen

Traditional streamflow forecasting models based on machine learning often struggle to determine optimal parameters and achieve accurate predictions when applied to small and medium-sized rivers with limited data availability. To address this issue, this study proposes an Attention-based Bidirectional Long Short-Term Memory (A_BiLSTM) and, on this basis, develops a streamflow forecasting model based on model fusion, referred to as SFM_MF. The SFM_MF model employs Bayesian Linear Regression (BLR) to learn from small-sample data in the target basin, while leveraging the A_BiLSTM to model data transferred from hydrologically similar source basins. The final streamflow forecast is generated by integrating the outputs of BLR and A_BiLSTM through a weighted averaging method. Experimental results from the Jiulong River Basin and the Qinhuai River Basin demonstrate that A_BiLSTM outperforms baseline models, and that the SFM_MF model significantly surpasses its component models in terms of both predictive accuracy and generalization capability. Compared to the individual models, SFM_MF achieves reductions in root mean square error (RMSE) of 3.98% and 19.33%, respectively. These findings indicate that the SFM_MF model delivers superior forecasting performance for small and medium-sized basins with limited data resources.
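
The weighted-averaging step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names `rmse` and `fuse`, the weight `w_blr`, and all numbers are hypothetical, and the abstract does not say how the fusion weight is chosen.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred):
        raise ValueError("sequences must have the same length")
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fuse(pred_blr, pred_lstm, w_blr):
    """Weighted average of two forecasts; w_blr is the assumed weight on BLR."""
    if not 0.0 <= w_blr <= 1.0:
        raise ValueError("w_blr must lie in [0, 1]")
    return [w_blr * a + (1.0 - w_blr) * b for a, b in zip(pred_blr, pred_lstm)]


# Toy numbers, invented purely for illustration.
observed = [10.0, 12.0, 15.0]
pred_blr = [11.0, 13.0, 16.0]   # over-predicts by 1
pred_lstm = [9.0, 11.0, 14.0]   # under-predicts by 1
fused = fuse(pred_blr, pred_lstm, w_blr=0.5)
```

In this toy case the two component errors cancel, so the fused forecast has a lower RMSE than either component. In practice the weight would presumably be tuned on held-out data from the target basin.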

Citations: 0
A novel membrane-inspired evolutionary algorithm framework for VRPTW
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-24 DOI: 10.1007/s10489-025-07068-y
Zhonghai Bai, Václav Snášel, Seyedali Mirjalili, Bay Vo, Lingping Kong, Xiaopeng Wang

The vehicle routing problem with time windows (VRPTW) has attracted considerable attention because of its wide application in operations research and logistics. VRPTW is NP-hard, so computing an optimal solution is costly. Many methods have been proposed to find near-optimal solutions, including exact algorithms, heuristics, and metaheuristics. Exact algorithms are limited to small-scale problems, while heuristics and metaheuristics scale to larger problems but often converge to local optima. This paper proposes a novel membrane-inspired evolutionary algorithm framework (MEAF) consisting of isolated evolutionary rules, communication output rules, communication input rules, a fusion-exchange information operation, and membrane dissolution rules. By combining the strengths of multiple metaheuristic algorithms and avoiding the pitfalls of local optima, MEAF offers a promising approach to complex problems. The effectiveness of MEAF is verified by applying three classical metaheuristics, namely Genetic Algorithm (GA), Ant Colony System (ACS), and Particle Swarm Optimization (PSO), to the VRPTW. The experiments are run on the 56 Solomon benchmark instances with 100 customers. An evaluation of the experimental results, using mean and standard deviation values, shows that the algorithm performs better in 54 of the 56 instances, demonstrating its effectiveness and stability.
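
To make the VRPTW constraints concrete, the sketch below checks whether a single vehicle route respects capacity and time windows. It is a generic illustration of the problem definition, not part of MEAF: the function name, the argument layout, and the toy instance are all assumptions.

```python
def route_is_feasible(route, travel_time, windows, service_time, demand, capacity):
    """Check one route (customer ids, depot 0 excluded) against capacity and time windows."""
    if sum(demand[c] for c in route) > capacity:
        return False
    time, prev = 0.0, 0  # the vehicle leaves the depot (node 0) at time 0
    for c in route:
        time += travel_time[prev][c]
        earliest, latest = windows[c]
        if time > latest:
            return False  # arrived after the window closed
        time = max(time, earliest) + service_time[c]  # wait if early, then serve
        prev = c
    # The vehicle must be back before the depot's window closes.
    return time + travel_time[prev][0] <= windows[0][1]


# Hypothetical three-node instance: node 0 is the depot.
travel_time = [[0, 2, 4], [2, 0, 3], [4, 3, 0]]
windows = [(0, 20), (0, 5), (6, 10)]
service_time = [0, 1, 1]
demand = [0, 3, 4]

ok = route_is_feasible([1, 2], travel_time, windows, service_time, demand, 10)
late = route_is_feasible([2, 1], travel_time, windows, service_time, demand, 10)
```

Visiting customer 1 first is feasible, while the reverse order reaches customer 1 after its window has closed. Any metaheuristic applied to VRPTW needs a check of this kind when evaluating candidate routes.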

Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-07068-y.pdf
Citations: 0
AFFNet: Adaptive feature fusion network for defect detection of industrial product surface
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-24 DOI: 10.1007/s10489-026-07099-z
Zhicheng Jia, Shaoqing Wang, Jinghua Zheng, Xiaobo Han, Yongwei Tang, Fuzhen Sun

Industrial product surface defect detection plays a crucial role in guaranteeing the quality of industrial products. Several primary challenges remain in achieving efficient and accurate automatic detection: (1) Industrial product surface defects demonstrate irregular morphological features. (2) Significant size disparities between different defects result in the risk of information loss during feature fusion in Feature Pyramid Networks (FPN) typically employed in the neck component. (3) Visual similarity among different defect categories means that slight deviations in predicted box locations can result in misclassification errors. This research initially employs a DCN-C3k2 module design, incorporating DCNv3 to improve the model’s sensitivity to varied morphological information of objects. Furthermore, we design an Adaptive Focusing Diffusion Network (AFDN) that aggregates multi-scale features and implements adaptive channel selection and fusion, then propagates critical features through upsampling and downsampling processes to improve defect detection precision. Lastly, we introduce a Task Dynamic Interactive Detection Head (TDIDH). The TDIDH constructs detection head architecture according to the specific properties and variations of different detection tasks, with the objective of optimizing detection performance through enhanced inter-task dynamic interaction. Experiments are performed on public datasets GC10-DET and NEU-DET. The proposed method achieves AP, AP50, and AP75 scores of 41.7%, 77.1%, and 40.8% respectively on NEU-DET, demonstrating improvements of 3.7%, 4.5%, and 4.1% compared to YOLOv11. Results on GC10-DET show scores of 44.3%, 81.5%, and 39.8%, with improvements of 3.3%, 0.9%, and 3.3% respectively compared to YOLOv11. Additionally, we achieve a 50% reduction in both parameter count and computational load through model pruning while preserving detection accuracy equivalent to the original AFFNet model, facilitating deployment on edge devices. The experimental results validate the effectiveness of our proposed method. The code is released at: https://github.com/yifansdut/affnet.
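
The two thresholded AP scores reported here are easier to read with the underlying overlap measure in view. The sketch below assumes the common COCO-style convention, in which the subscript is the intersection-over-union (IoU) threshold (0.50 or 0.75) a predicted box must reach to count as a match. The abstract does not define the metrics, and the boxes are invented.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: the prediction is shifted 2 units to the right.
ground_truth = (0.0, 0.0, 10.0, 10.0)
predicted = (2.0, 0.0, 12.0, 10.0)

score = iou(ground_truth, predicted)   # 80 / 120
matches_at_50 = score >= 0.50
matches_at_75 = score >= 0.75
```

A shift of one fifth of the box width already drops the IoU to about 0.67, so the prediction counts at the 0.50 threshold but not at 0.75. This is why small localization errors weigh more heavily on the stricter metric.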

Citations: 0
Collaborative deep learning framework based on adaptive feature fusion for malignancy prediction of lung nodules
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-24 DOI: 10.1007/s10489-025-07073-1
Ruoyu Wu, Changyu Liang, Yuan Li, Qijuan Tan, Jiuquan Zhang, Hong Huang

Malignancy prediction of lung nodules on computed tomography (CT) is a task of significant clinical importance. The malignancy of lung nodules on CT is not only reflected in some local details but also closely related to several global attributes. However, previous studies often ignore this vital fact, resulting in a bottleneck of performance improvement. In this paper, a collaborative deep learning framework based on adaptive feature fusion (CDLF-AFF) is proposed to improve the malignancy prediction performance. In the CDLF-AFF method, a multi-view input strategy is designed to drive the pre-trained models to capture the abundant spatial information of nodule CT images. Furthermore, a dual-branch architecture is developed to simultaneously learn the local structure features and long-range dependency relations. To improve the feature fusion efficiency, a feature aggregation module is constructed to adaptively fuse the feature maps produced by two different styles of learning branches. The CDLF-AFF method is evaluated on three different datasets. On the benchmark dataset LIDC-IDRI, it achieved an AUC of 97.25% (95% CI: 96.97%-97.54%), representing improvements of 1.47% and 0.86% over the corresponding unimodal models, ResNet-Model and ViT-Model, respectively. On the clinical dataset CQUCH-LND, it achieved an AUC of 94.09% (95% CI: 93.34%-94.84%), showing improvements of 3.41% and 1.39% over the corresponding ResNet-Model and ViT-Model, respectively. On the competition dataset LUNGx, it achieved an AUC of 79.13% (95% CI: 78.07%-80.19%), surpassing the best-performing algorithm listed on the competition leaderboard by 11.13%. These results demonstrate that the CDLF-AFF can effectively predict the malignancy of lung nodules.
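
Since every result in this abstract is an AUC, a short sketch of what that number measures may help. The function below computes AUC as the fraction of (malignant, benign) pairs in which the malignant case receives the higher score, with ties counted as one half. The function name and the toy scores are illustrative only and unrelated to the paper's data.

```python
def auc(labels, scores):
    """AUC via pairwise comparison: P(score of a positive > score of a negative)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Invented example: 1 = malignant, 0 = benign.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
value = auc(labels, scores)  # 3 of the 4 pairs are ranked correctly
```

The pairwise form is quadratic in the number of cases, which is fine for illustration. Library implementations typically use a rank-based computation instead.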

Citations: 0
Scheduling all-scale multi-point manufacturing problems with a single neural model
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-24 DOI: 10.1007/s10489-026-07091-7
Jie Liu, Hwa Jen Yap, Anis Salwa Mohd Khairuddin

This research presents a novel energy-efficient task sequencing method for manufacturing operations involving multiple processing points, such as precision drilling and contact welding. The problem is formulated as a multi-dimensional weighted variant of the Traveling Salesman Problem (TSP), and solved using a Multi-gate Mixture of Experts (MMOE) neural architecture. Unlike previous approaches that require separate models for each TSP size, our method employs a single neural network to handle TSPs of all sizes, significantly improving scalability and reducing training overhead. With an uncertainty-based loss weighting strategy, the model effectively balances multiple learning objectives. Experiments show that MMOE-9 achieves performance comparable to state-of-the-art methods with only one-third of the parameters of NAR4TSP, and its training time is similar to that of a single TSP100 model. Further, we extend the model to cover 91 TSP sizes (from 10 to 100) within the same unified framework, demonstrating strong generalization across scales.
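
The abstract does not give the formula for its uncertainty-based loss weighting. As a rough sketch, one commonly used form assigns each task a learnable log-variance s_i and minimizes the sum of exp(-s_i) * L_i + s_i. Whether the paper uses exactly this form is an assumption, and the function name below is hypothetical.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum(exp(-s_i) * L_i + s_i).

    s_i is a per-task log-variance. A larger s_i down-weights task i,
    while the additive s_i term penalizes inflating it without bound.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("one log-variance is required per task loss")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


# With all log-variances at zero, the result is the plain sum of losses.
plain = uncertainty_weighted_loss([1.0, 2.0], [0.0, 0.0])

# Raising the second task's log-variance to ln(2) halves its weight.
reweighted = uncertainty_weighted_loss([1.0, 2.0], [0.0, math.log(2.0)])
```

In a training loop the log-variances would be parameters updated by the optimizer alongside the network weights, so the balance between objectives is learned rather than hand-tuned.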

Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-026-07091-7.pdf
Citations: 0
Multi-level supervised and fine-grained feature enhancement for person search
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-24 DOI: 10.1007/s10489-025-06996-z
Xuyang Zhang, Sijie Yang, Hongkun Liu, Heyan Jin, Guangqiang Yin, Ye Li

The person search task addresses person detection and person re-identification (re-id) simultaneously, integrating the two tasks into a unified objective. Person search is commonly used in surveillance and security. In surveillance scenarios, it currently faces severe challenges such as scale variation and occlusion caused by cameras. Existing approaches often overlook the discrepancies between multi-scale features and typically perform direct feature fusion. Most methods that address occlusion rely on feature completion techniques, without fully utilizing the inherent fine-grained information in the original images. This paper proposes a Multi-level Supervised and Fine-grained Feature Enhancement for Person Search (MFPS) to mitigate these issues. MFPS employs cascaded encoders and decoders to extract person detection features from the backbone network. To generate re-id features robust to scale variations, MFPS introduces a Multi-Level Supervision (MLS) method, which aggregates features of different scales and levels, enriching the semantic information of person features. Furthermore, to address missing re-id features caused by occlusion, this paper proposes a deformable fine-grained attention module, which extracts fine-grained re-id features with accurate semantic information through sampling point offset operations. Finally, fine-grained and multi-scale features are fused, and the resulting re-id features significantly improve recognition accuracy for person search in surveillance scenarios. Experimental results show that, compared to the state-of-the-art method on the PRW dataset, MFPS improves mAP by 0.8 and top-1 accuracy by 1.8, demonstrating its advantage in complex environments. The source code is available at https://github.com/FengHua0208/MFPS.
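
The "sampling point offset operations" mentioned above generally mean reading a feature map at a reference location plus a learned, fractional offset, which requires interpolation. The sketch below shows only that interpolation step on a tiny single-channel map. It is an assumption about the general mechanism of deformable sampling rather than the MFPS module itself, and the offset is hard-coded where a network would predict it.

```python
import math


def bilinear_sample(feature, x, y):
    """Sample a 2D feature map (list of rows) at fractional (x, y), clamped to its borders."""
    h, w = len(feature), len(feature[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    dx, dy = x - x0, y - y0
    top = feature[y0][x0] * (1 - dx) + feature[y0][x1] * dx
    bottom = feature[y1][x0] * (1 - dx) + feature[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


feature = [[0.0, 1.0],
           [2.0, 3.0]]
reference = (0.0, 0.0)
offset = (0.5, 0.5)  # hypothetical; a deformable module would predict this

value = bilinear_sample(feature, reference[0] + offset[0], reference[1] + offset[1])
```

Because the interpolation is differentiable with respect to the sampling position, offsets of this kind can be learned end to end, which lets the sampling points drift away from occluded regions.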

Citations: 0
Conditional guided diffusion model in latent space for social recommendation
IF 3.5 Zone 2 (Computer Science) Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-22 DOI: 10.1007/s10489-026-07094-4
Yijun Hu, Rui Tang, Xian Mo

Social recommendation utilizes social connections as auxiliary information to deeply mine user preferences, thereby improving the recommendation performance. Existing methods often employ graph neural networks to encode social graphs. However, average aggregation may lead to node distortion, and diffusion models may reconstruct embeddings of multi-faceted interests and attributes that are misaligned with task-relevant directions. To address these limitations, this paper introduces a Conditional Guided Diffusion architecture in Latent Space for social recommendation (CGDLS). Specifically, CGDLS first leverages singular value decomposition (SVD) to encode the social connection graph and the co-interacted items graph among users into the low-dimensional latent space. To alleviate noise distortion, CGDLS captures its key features by leveraging SVD to encode the social connection graph. During the reverse process, CGDLS incorporates the embedding of co-interacted items among users as conditional guidance. It guides the reverse process to reconstruct a highly task-relevant social connection embedding. Extensive experiments conducted on three social datasets demonstrate that CGDLS and its components outperform various state-of-the-art methods.
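
For readers unfamiliar with diffusion models, the sketch below shows the standard Gaussian forward (noising) step that a reverse process learns to undo, applied to a latent embedding vector. The abstract does not state CGDLS's noise schedule or parameterization, so the DDPM-style formula, the function names, and the beta values here are all assumptions.

```python
import math


def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) over the first t steps of the schedule."""
    prod = 1.0
    for beta in betas[:t]:
        prod *= 1.0 - beta
    return prod


def forward_noise(x0, a_bar, eps):
    """x_t = sqrt(a_bar) * x0 + sqrt(1 - a_bar) * eps, applied elementwise."""
    keep, noise = math.sqrt(a_bar), math.sqrt(1.0 - a_bar)
    return [keep * x + noise * e for x, e in zip(x0, eps)]


# Hypothetical two-step schedule and a one-dimensional "embedding".
betas = [0.1, 0.2]
a_bar_2 = alpha_bar(betas, 2)           # 0.9 * 0.8
x_t = forward_noise([2.0], a_bar_2, [0.0])
```

The noise `eps` is passed in explicitly so that the example is deterministic; in practice it would be drawn from a standard normal distribution. Conditional guidance, as described in the abstract, would enter in the reverse step, where the denoiser also receives the co-interaction embedding.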

Citations: 0
DG-Morph: dense convolutional and gated feature extraction network for multimodal 3D prostate MRI registration
IF 3.5 Zone 2 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-22 DOI: 10.1007/s10489-025-07078-w
Mengxing Huang, Zhihao Huang, Zehao Ni, Yu Zhang, Nana Liu, Uzair Aslam Bhatti, Jing Chen, Gang Wang, Zhiming Bai

Among men globally, prostate cancer ranks as one of the most frequently occurring types of cancer. Timely identification and accurate treatment rely on high-quality medical image registration, especially the precise registration of multimodal images such as diffusion-weighted imaging (DWI) and T2-weighted MRI. However, due to the low resolution, high noise and blurred boundaries of DWI, the registration with T2-weighted images is extremely challenging. The significant differences between the two imaging modalities make the multimodal registration task complex, which directly affects the accuracy of diagnosis and treatment. To address these problems, this paper proposes a dense convolutional and gated feature extraction network (DG-Morph), which leverages the strengths of Convolutional Neural Networks (CNN) and Transformer models to enhance the efficiency of feature representation and extraction. The gated residual fusion module (GRFM) dynamically fuses features, the Multi-scale Transformer module (MST) improves multi-scale feature extraction, and the dense 3D-convolutional (Dense Conv3D) block optimizes feature reuse. Experimental results show that DG-Morph significantly outperforms traditional methods regarding the Dice Similarity Coefficient (DSC) and deformation field smoothness, showing excellent performance in complex multimodal medical image registration. These results show that DG-Morph has significant advantages in accuracy and robustness in multimodal registration, especially for complex multimodal medical image registration tasks.
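For readers unfamiliar with the Dice Similarity Coefficient used as the evaluation metric above, the sketch below computes the standard definition, DSC = 2|A ∩ B| / (|A| + |B|), on binary masks. The function name `dice_coefficient`, the toy 4×4×4 volumes, and the convention of returning 1.0 when both masks are empty are assumptions for illustration; the paper's evaluation code is not shown in this listing.

```python
import numpy as np

def dice_coefficient(mask_a, mask_b) -> float:
    """Dice Similarity Coefficient between two binary masks of equal shape."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    total = a.sum() + b.sum()
    if total == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return float(2.0 * np.logical_and(a, b).sum() / total)

# Toy 3D segmentations (illustrative only): two 8-voxel cubes sharing 4 voxels.
fixed = np.zeros((4, 4, 4), dtype=bool)
fixed[1:3, 1:3, 1:3] = True
warped = np.zeros((4, 4, 4), dtype=bool)
warped[1:3, 1:3, 2:4] = True

dsc = dice_coefficient(fixed, warped)
print(dsc)  # 2 * 4 / (8 + 8) = 0.5
```

In a registration setting the two masks would typically be the segmentation of the fixed image and the warped segmentation of the moving image, so a higher DSC indicates better anatomical overlap after alignment.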

Citations: 0
Hybrid stock price forecasting model using LSTM of attention mechanism combining noise removal and neighborhood rough set
IF 3.5 Zone 2 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-21 DOI: 10.1007/s10489-025-07048-2
Yuqi Guo, Bingzhen Sun, Juncheng Bai, Weiping Ding

Financial markets are highly non-linear and noisy, which complicates stock price forecasting. Traditional machine learning models often fail to appropriately generalize noisy data. Additionally, incorporating redundant attributes increases training complexity and overfitting risk. In this study, a hybrid model is proposed that combines noise reduction and feature selection techniques to improve forecasting accuracy and efficiency. Specifically, we propose a fusion noise identification and attribute reduction method based on independent component analysis (ICA), signal-to-noise ratio (SNR) theory, and neighborhood rough set (NRS) theory to address the aforementioned problems. First, technical indicators are established and used as alternative inputs for ICA. The aim is to determine the optimal ICA input dimension and noise sequence by maximizing the SNR while minimizing the reconstructed similarity of the series. Second, we calculate the attribute significance using the neighborhood rough set and obtain an attribute subset that satisfies the threshold filter condition. Finally, an attention mechanism-based (AM) long short-term memory (LSTM) neural network is used to build a forecasting model for the noise-reduced close price and reduced set of attributes. The research findings indicate that for time series denoising based on ICA, optimal results are achieved with an appropriate input dimension. Excessive input sequences introduce noise or redundant information. Attribute reduction based on NRS effectively reduces both temporal and spatial computational complexity without sacrificing generalization accuracy.
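The abstract selects the ICA input dimension by maximizing the signal-to-noise ratio. As a rough illustration of that criterion only, the sketch below computes SNR in decibels as 10·log10(signal power / noise power). The abstract does not give the paper's exact SNR formula, so this textbook definition, the function name `snr_db`, and the toy arrays are assumptions.

```python
import numpy as np

def snr_db(signal, noise) -> float:
    """Signal-to-noise ratio in decibels: 10 * log10(sum(signal^2) / sum(noise^2))."""
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    return float(10.0 * np.log10(np.sum(signal ** 2) / np.sum(noise ** 2)))

# Toy example (illustrative values): a denoised series and the removed noise component.
denoised = np.array([6.0, 8.0])   # power 36 + 64 = 100
removed = np.array([0.6, 0.8])    # power 0.36 + 0.64 = 1

ratio = snr_db(denoised, removed)
print(round(ratio, 6))  # approximately 20 dB
```

A selection loop in this spirit would compute such a score for each candidate input dimension and keep the one with the highest value, subject to whatever similarity constraint the method imposes.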

Citations: 0
SiamDMCF: a dynamic multi-order context fusion siamese network for robust visual tracking
IF 3.5 Zone 2 Computer Science Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-20 DOI: 10.1007/s10489-025-07077-x
Yu-e Lin, Xingyuan Ge, Xingzhu Liang, Jinliang Zhang

Siamese-based tracking approaches have demonstrated notable efficacy in visual object tracking recently. Nevertheless, existing Siamese networks suffer from core limitations in their feature extraction and interaction mechanisms: namely, a lack of adaptive modeling capability for multi-scale context and low sensitivity to target instance variations. These shortcomings cause trackers to be prone to drifting when encountering challenges such as occlusion, scale variations, and fast motion. To address this problem, we develop an original dynamic multi-order context fusion siamese network for object tracking, constructing a three-level progressive functional coupling architecture. By introducing the Multi-Order Feature Gated Fusion module, the Adaptive Fine-grained Channel Cross-correlation module, and the Spatial-Channel Coordinated Attention module, we effectively enhance discriminative representation learning, cross-correlation matching accuracy, and target activation. We conducted extensive experiments on four benchmark datasets: OTB100, UAV123, GOT-10K, and LaSOT, to confirm our tracker’s superior performance. Our code is available at: https://github.com/JSJ515-Group/SiamDMCF.

Citations: 0