Applied Intelligence: Latest Articles

Research on improved fast-RCNN target detection algorithm based on Kolmogorov-Arnold network
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-28 | DOI: 10.1007/s10489-025-06817-3
Zhigang Ren, Xiangjun Tang, Guoquan Ren, Dinghai Wu

To address the dual challenges of high parameter complexity and lack of interpretability in deep neural networks, this study proposes KAN-RCNN—a novel object detection framework based on the mathematical formulation of Kolmogorov-Arnold Networks (KANs). By integrating KANs with conventional CNN architectures, comparative experiments on the PASCAL VOC 2012 benchmark dataset demonstrate that KAN-RCNN achieves: 1) 13.6% parameter reduction compared to the original Faster R-CNN baseline; 2) 1.3% improvement in detection accuracy; 3) enhanced model interpretability. Through systematic validation with 1D synthetic signals, MNIST grayscale images, and multimodal data from PASCAL VOC 2012, the experimental results confirm that KAN-RCNN maintains competitive detection performance while attaining superior computational efficiency. This research provides new methodological insights for developing efficient and interpretable computer vision models.
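As a rough illustration of the Kolmogorov-Arnold idea the abstract relies on, the sketch below builds a single layer in which every input-output edge carries its own learnable univariate function and each output is the sum of its incoming edge functions. It is not the authors' KAN-RCNN: the class name `KANLayer`, the Gaussian basis (standing in for the B-splines usually used in KANs), the grid range, and all sizes are assumptions chosen for brevity.

```python
import math
import random


class KANLayer:
    """Toy Kolmogorov-Arnold layer: y_j = sum_i phi_{j,i}(x_i).

    Each edge function phi_{j,i} is a weighted sum of fixed Gaussian bumps.
    The Gaussian basis is an assumption made for brevity, not the paper's choice.
    """

    def __init__(self, n_in, n_out, n_basis=5, lo=-1.0, hi=1.0, seed=0):
        rng = random.Random(seed)
        self.n_in, self.n_out = n_in, n_out
        # Evenly spaced basis centres on [lo, hi]; width equals the grid spacing.
        self.centres = [lo + k * (hi - lo) / (n_basis - 1) for k in range(n_basis)]
        self.width = (hi - lo) / (n_basis - 1)
        # coef[j][i][k]: weight of basis k on the edge from input i to output j.
        self.coef = [[[rng.uniform(-0.1, 0.1) for _ in range(n_basis)]
                      for _ in range(n_in)] for _ in range(n_out)]

    def basis(self, x):
        """Evaluate every Gaussian bump at the scalar x."""
        return [math.exp(-((x - c) / self.width) ** 2) for c in self.centres]

    def edge(self, j, i, x):
        """Univariate function on the edge from input i to output j."""
        return sum(w * b for w, b in zip(self.coef[j][i], self.basis(x)))

    def forward(self, xs):
        """Sum the edge functions arriving at each output node."""
        return [sum(self.edge(j, i, xs[i]) for i in range(self.n_in))
                for j in range(self.n_out)]

    def num_params(self):
        """Learnable coefficients: one per (output, input, basis) triple."""
        return self.n_out * self.n_in * len(self.centres)


layer = KANLayer(n_in=3, n_out=2, n_basis=5)
outputs = layer.forward([0.1, -0.2, 0.3])
print(outputs, layer.num_params())
```

Because each edge function depends on a single input, it can be plotted on its own, which is the usual argument for the interpretability the abstract mentions.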

Citations: 0
Spill-free liquid container handling using deep reinforcement learning agents in feedback control
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-28 | DOI: 10.1007/s10489-025-07041-9
Ashish Kumar Shakya, Mike Fogel, Gopinatha Pillai, Laurent Burlion, Sohom Chakrabarty

Liquid sloshing in moving open containers poses significant risks in various industrial and engineering applications, often leading to spillage, contamination, and reduced operational safety. Effective control of sloshing is therefore critical for ensuring product integrity and preventing losses during transportation. This paper presents three novel Deep Reinforcement Learning (DRL)-based feedback control frameworks for automatic motion planning of an open cylindrical liquid container moving along a straight-line trajectory. The sloshing dynamics are modeled as a nonlinear underactuated system (specifically, a simple pendulum mounted on a moving cart) to capture the essential fluid-structure interaction while enabling control design in a simulation environment. Each proposed framework employs a DRL agent trained using the Deep Deterministic Policy Gradient (DDPG) algorithm to generate optimal control actions that minimize sloshing and reduce overall travel time. The agents are trained in a closed-loop feedback setting using the pendulum-cart model to ensure robustness and adaptability to dynamic disturbances induced by the sloshing liquid. The performance of the proposed DRL-based frameworks is evaluated and benchmarked against several conventional control strategies, including Super Twisting Control (STC), Linear Quadratic Regulator (LQR), and adaptive Sliding Mode Control (ASMC), under disturbance conditions. Furthermore, to validate the practical applicability of the learned policies, the DRL-generated trajectories are tested in open-loop simulations using FLOW-3D computational fluid dynamics (CFD) software. This dual-layered validation approach demonstrates the effectiveness and robustness of the proposed methods in achieving efficient, spill-free transport in liquid handling systems.
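The pendulum-on-a-cart surrogate described above can be sketched in a few lines. The example below is only an illustration of that kind of model, not the paper's simulator or its DDPG agent: the cart acceleration is prescribed by hand, and the function name `simulate_slosh`, the pendulum length, the damping coefficient, and the two acceleration profiles are all assumed values. The angle obeys θ'' = -(g/l) sin θ - (a/l) cos θ - c θ', where a is the cart acceleration.

```python
import math


def simulate_slosh(accel, duration=2.0, dt=1e-3, length=0.1, g=9.81, damping=0.5):
    """Integrate a damped pendulum whose pivot accelerates horizontally.

    accel: function t -> cart acceleration in m/s^2 (hypothetical input profile).
    Returns the peak absolute swing angle in radians, a proxy for sloshing.
    """
    theta, omega, peak = 0.0, 0.0, 0.0
    for n in range(int(round(duration / dt))):
        a = accel(n * dt)
        # Angular acceleration: gravity, cart-induced forcing, viscous damping.
        alpha = (-(g / length) * math.sin(theta)
                 - (a / length) * math.cos(theta)
                 - damping * omega)
        # Semi-implicit Euler step: update velocity first, then position.
        omega += alpha * dt
        theta += omega * dt
        peak = max(peak, abs(theta))
    return peak


# Two hand-written acceleration pulses of different strength.
abrupt = simulate_slosh(lambda t: 1.0 if t < 0.5 else 0.0)
gentle = simulate_slosh(lambda t: 0.2 if t < 0.5 else 0.0)
print(f"peak angle, abrupt start: {abrupt:.3f} rad")
print(f"peak angle, gentle start: {gentle:.3f} rad")
```

A learned controller would replace the fixed `accel` profile with a policy that observes the angle and cart state at every step; the peak angle and the travel time would then feed the reward.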

Citations: 0
SevenNet: rethinking convolutional neural networks with a formula-based architecture
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-27 | DOI: 10.1007/s10489-026-07084-6
Amira Bendaoud, Fella Hachouf

Convolutional neural networks (CNNs) are a powerful tool for image-related applications because they learn image features hierarchically. However, even after more than a decade, CNNs still present many challenges. One of them is the arbitrary choice of parameters, which makes CNN design a difficult task. This work presents a new CNN model, SevenNet, for classifying tomato leaf diseases from the PlantVillage dataset. SevenNet's architecture was built from scratch using a formulation derived through extensive experimentation. Its main advantages are the large number of extracted feature maps, fast convergence, and a reduced overall number of learnable parameters. A detailed study explored training the network on different data partitions, ranging from a standard partition to a cross-validation split and other non-standard partitions. SevenNet was validated against several state-of-the-art models, with all networks trained from scratch. The results were comparable to those of leading models, and SevenNet converged more rapidly in terms of both accuracy and loss. Additionally, the highest overall accuracy was achieved when testing with an unusual partition (10% training, 10% validation, 80% test). The proposed CNN was also found to be superior in execution speed and convergence, reinforcing SevenNet's advantages over existing approaches.
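To make the unusual 10% / 10% / 80% partition concrete, the following sketch shuffles sample indices and cuts them into the three subsets. It is an illustrative helper only: the name `split_indices`, the fixed seed, and the plain (non-stratified) shuffle are assumptions, and the abstract does not say how the authors drew their splits.

```python
import random


def split_indices(n_samples, train_frac=0.1, val_frac=0.1, seed=0):
    """Shuffle indices 0..n_samples-1 and cut them into train/val/test lists.

    Everything not assigned to training or validation goes to the test set.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    n_train = round(n_samples * train_frac)
    n_val = round(n_samples * val_frac)
    train = indices[:n_train]
    val = indices[n_train:n_train + n_val]
    test = indices[n_train + n_val:]
    return train, val, test


train, val, test = split_indices(1000)
print(len(train), len(val), len(test))
```

For an imbalanced disease dataset, a stratified variant that applies the same cut within each class would usually be preferable.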

Citations: 0
SFM_MF: A streamflow forecasting model based on model fusion for small-sample data in small and medium-sized rivers
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-26 | DOI: 10.1007/s10489-025-07074-0
Suhe Zhang, Yufeng Yu, Qingqing Chen, Qun Zhao, Ziyang Chen

Traditional streamflow forecasting models based on machine learning often struggle to determine optimal parameters and achieve accurate predictions when applied to small and medium-sized rivers with limited data availability. To address this issue, this study proposes an Attention-based Bidirectional Long Short-Term Memory network (A_BiLSTM) and, on this basis, develops a streamflow forecasting model based on model fusion, referred to as SFM_MF. The SFM_MF model employs Bayesian Linear Regression (BLR) to learn from small-sample data in the target basin, while using A_BiLSTM to model data transferred from hydrologically similar source basins. The final streamflow forecast is generated by combining the outputs of BLR and A_BiLSTM through weighted averaging. Experimental results from the Jiulong River Basin and the Qinhuai River Basin demonstrate that A_BiLSTM outperforms baseline models, and that the SFM_MF model significantly surpasses its component models in both predictive accuracy and generalization capability. Compared to the individual models, SFM_MF achieves reductions in root mean square error (RMSE) of 3.98% and 19.33%, respectively. These findings indicate that the SFM_MF model delivers superior forecasting performance for small and medium-sized basins with limited data resources.
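The fusion step described above, a weighted average of two model outputs scored by RMSE, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the toy numbers, and the grid search used to pick the weight are hypothetical, and the abstract does not say how the authors set their weights.

```python
import math


def rmse(pred, obs):
    """Root mean square error between two equal-length sequences."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def fuse(blr_pred, lstm_pred, weight):
    """Weighted average: weight on the BLR forecast, (1 - weight) on the LSTM one."""
    return [weight * b + (1.0 - weight) * l for b, l in zip(blr_pred, lstm_pred)]


def best_weight(blr_pred, lstm_pred, obs, steps=100):
    """Pick the weight in [0, 1] with the lowest RMSE on held-out observations.

    Grid search is an assumption made for this sketch, not the paper's procedure.
    """
    grid = [k / steps for k in range(steps + 1)]
    return min(grid, key=lambda w: rmse(fuse(blr_pred, lstm_pred, w), obs))


# Toy validation data (made up): one model under-predicts, the other over-predicts.
observed = [10.0, 12.0, 14.0]
blr_forecast = [9.0, 11.0, 13.0]
lstm_forecast = [11.0, 13.0, 15.0]

w = best_weight(blr_forecast, lstm_forecast, observed)
fused = fuse(blr_forecast, lstm_forecast, w)
print(w, rmse(fused, observed))
```

When the two component models err in opposite directions, as in this toy data, the fused forecast can beat both of them, which is the effect the RMSE reductions in the abstract point to.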

Citations: 0
A novel membrane-inspired evolutionary algorithm framework for VRPTW
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1007/s10489-025-07068-y
Zhonghai Bai, Václav Snášel, Seyedali Mirjalili, Bay Vo, Lingping Kong, Xiaopeng Wang

The vehicle routing problem with time windows (VRPTW) has gained much attention recently due to its wide application in operations research and logistics. VRPTW is NP-hard, so computing an optimal solution is costly. Scholars have proposed many methods, such as exact algorithms, heuristics, and metaheuristics, to find near-optimal solutions for the VRPTW. Exact algorithms are limited to small-scale problems, while heuristics and metaheuristics apply to larger-scale problems but often converge to locally optimal solutions. This paper proposes a novel membrane-inspired evolutionary algorithm framework (MEAF) consisting of isolated evolutionary rules, communication output rules, communication input rules, a fusion-exchange information operation, and membrane dissolution rules. By leveraging the advantages of multiple metaheuristic algorithms and avoiding the pitfalls of local optima, MEAF offers a promising way to address complex problems. The effectiveness of the proposed MEAF is verified by applying three classical metaheuristics, namely Genetic Algorithm (GA), Ant Colony System (ACS), and Particle Swarm Algorithm (PSO), to solve the VRPTW. The experiments are run on the 56 Solomon benchmark instances with 100 customers. Evaluation of the experimental results, based on mean and standard deviation values, shows that the algorithm performs better in 54 out of 56 instances, demonstrating the effectiveness and stability of the proposed algorithm.
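Whatever metaheuristic is used, a VRPTW solver has to check that each candidate route respects vehicle capacity and customer time windows. The sketch below shows one common form of that check; it is not taken from the paper, and the function name `route_feasible`, the data layout, and the tiny instance are assumptions made for illustration. Node 0 is the depot, a vehicle that arrives early waits until the window opens, and arriving after the window closes makes the route infeasible.

```python
def route_feasible(route, travel, windows, service, demand, capacity):
    """Check capacity and time windows for one route that starts and ends at depot 0.

    route:    customer ids in visiting order (depot excluded)
    travel:   travel[i][j] is the travel time from node i to node j
    windows:  windows[i] = (ready_time, due_time) for node i
    service:  service[i] is the service duration at node i
    demand:   demand[i] is the load picked up at node i
    """
    if sum(demand[c] for c in route) > capacity:
        return False
    time, previous = windows[0][0], 0
    for node in list(route) + [0]:          # visit customers, then return to depot
        time += travel[previous][node]
        ready, due = windows[node]
        if time > due:                      # arrived after the window closed
            return False
        time = max(time, ready) + service[node]   # wait if early, then serve
        previous = node
    return True


# Tiny made-up instance: depot 0 and two customers.
travel = [[0, 2, 4],
          [2, 0, 3],
          [4, 3, 0]]
windows = [(0, 20), (0, 5), (6, 10)]
service = [0, 1, 1]
demand = [0, 3, 4]

print(route_feasible([1, 2], travel, windows, service, demand, capacity=10))
print(route_feasible([2, 1], travel, windows, service, demand, capacity=10))
```

In this instance the visiting order alone decides feasibility: serving customer 2 first forces a wait that makes the vehicle miss customer 1's window.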

Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-025-07068-y.pdf
Citations: 0
AFFNet: Adaptive feature fusion network for defect detection of industrial product surface
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1007/s10489-026-07099-z
Zhicheng Jia, Shaoqing Wang, Jinghua Zheng, Xiaobo Han, Yongwei Tang, Fuzhen Sun

Industrial product surface defect detection plays a crucial role in guaranteeing the quality of industrial products. Several primary challenges remain in achieving efficient and accurate automatic detection: (1) Industrial product surface defects have irregular morphological features. (2) Significant size disparities between different defects create a risk of information loss during feature fusion in the Feature Pyramid Networks (FPN) typically employed in the neck component. (3) Visual similarity among different defect categories means that slight deviations in predicted box locations can result in misclassification errors. This research first employs a DCN-C3k2 module design, incorporating DCNv3 to improve the model's sensitivity to varied morphological information of objects. Furthermore, we design an Adaptive Focusing Diffusion Network (AFDN) that aggregates multi-scale features and implements adaptive channel selection and fusion, then propagates critical features through upsampling and downsampling processes to improve defect detection precision. Lastly, we introduce a Task Dynamic Interactive Detection Head (TDIDH). The TDIDH constructs the detection head architecture according to the specific properties and variations of different detection tasks, with the objective of optimizing detection performance through enhanced inter-task dynamic interaction. Experiments are performed on the public datasets GC10-DET and NEU-DET. The proposed method achieves AP, AP$_{50}$, and AP$_{75}$ scores of 41.7%, 77.1%, and 40.8% respectively on NEU-DET, improvements of 3.7%, 4.5%, and 4.1% over YOLOv11. Results on GC10-DET show scores of 44.3%, 81.5%, and 39.8%, improvements of 3.3%, 0.9%, and 3.3% respectively over YOLOv11. Additionally, we achieve a 50% reduction in both parameter count and computational load through model pruning while preserving detection accuracy equivalent to the original AFFNet model, facilitating deployment on edge devices. The experimental results validate the effectiveness of our proposed method. The code is released at: https://github.com/yifansdut/affnet.
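The subscripts in AP$_{50}$ and AP$_{75}$ conventionally refer to the intersection-over-union (IoU) threshold at which a predicted box counts as a correct detection (0.50 and 0.75). The sketch below illustrates only that matching criterion, not the full AP computation and not the authors' evaluation code; the function names and the example boxes are assumptions. It also shows why a small localization shift matters: the same prediction can pass the looser threshold and fail the stricter one.

```python
def box_area(box):
    """Area of an axis-aligned box given as (x1, y1, x2, y2)."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = box_area(box_a) + box_area(box_b) - inter
    return inter / union if union > 0 else 0.0


def is_hit(pred_box, gt_box, threshold):
    """A prediction matches the ground truth if its IoU reaches the threshold."""
    return iou(pred_box, gt_box) >= threshold


# Made-up example: the prediction is shifted 2 units to the right of the defect.
ground_truth = (0.0, 0.0, 10.0, 10.0)
prediction = (2.0, 0.0, 12.0, 10.0)

print(round(iou(prediction, ground_truth), 3))
print(is_hit(prediction, ground_truth, 0.50), is_hit(prediction, ground_truth, 0.75))
```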

Citations: 0
Collaborative deep learning framework based on adaptive feature fusion for malignancy prediction of lung nodules
IF 3.5 | Zone 2, Computer Science | Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2026-01-24 | DOI: 10.1007/s10489-025-07073-1
Ruoyu Wu, Changyu Liang, Yuan Li, Qijuan Tan, Jiuquan Zhang, Hong Huang

Malignancy prediction of lung nodules on computed tomography (CT) is a task of significant clinical importance. The malignancy of lung nodules on CT is not only reflected in some local details but also closely related to several global attributes. However, previous studies often ignore this vital fact, resulting in a bottleneck of performance improvement. In this paper, a collaborative deep learning framework based on adaptive feature fusion (CDLF-AFF) is proposed to improve the malignancy prediction performance. In the CDLF-AFF method, a multi-view input strategy is designed to drive the pre-trained models to capture the abundant spatial information of nodule CT images. Furthermore, a dual-branch architecture is developed to simultaneously learn the local structure features and long-range dependency relations. To improve the feature fusion efficiency, a feature aggregation module is constructed to adaptively fuse the feature maps produced by two different styles of learning branches. The CDLF-AFF method is evaluated on three different datasets. On the benchmark dataset LIDC-IDRI, it achieved an AUC of 97.25% (95% CI: 96.97%-97.54%), representing improvements of 1.47% and 0.86% over the corresponding unimodal models, ResNet-Model and ViT-Model, respectively. On the clinical dataset CQUCH-LND, it achieved an AUC of 94.09% (95% CI: 93.34%-94.84%), showing improvements of 3.41% and 1.39% over the corresponding ResNet-Model and ViT-Model, respectively. On the competition dataset LUNGx, it achieved an AUC of 79.13% (95% CI: 78.07%-80.19%), surpassing the best-performing algorithm listed on the competition leaderboard by 11.13%. These results demonstrate that the CDLF-AFF can effectively predict the malignancy of lung nodules.
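The AUC figures quoted above can be read as the probability that a randomly chosen malignant nodule receives a higher score than a randomly chosen benign one. The sketch below computes AUC directly from that definition; it is an illustration only, with a hypothetical function name and made-up scores, and it is not the authors' evaluation code (the confidence intervals in the abstract would need an additional resampling step not shown here).

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores: predicted malignancy scores; labels: 1 = malignant, 0 = benign.
    Ties between a positive and a negative score count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for four nodules: two malignant, two benign.
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
print(auc(scores, labels))
```

The pairwise loop is quadratic in the number of samples; for large test sets a rank-based formulation gives the same value more efficiently.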

{"title":"Collaborative deep learning framework based on adaptive feature fusion for malignancy prediction of lung nodules","authors":"Ruoyu Wu,&nbsp;Changyu Liang,&nbsp;Yuan Li,&nbsp;Qijuan Tan,&nbsp;Jiuquan Zhang,&nbsp;Hong Huang","doi":"10.1007/s10489-025-07073-1","DOIUrl":"10.1007/s10489-025-07073-1","url":null,"abstract":"<div>\u0000 \u0000 <p>Malignancy prediction of lung nodules on computed tomography (CT) is a task of significant clinical importance. The malignancy of lung nodules on CT is not only reflected in some local details but also closely related to several global attributes. However, previous studies often ignore this vital fact, resulting in a bottleneck of performance improvement. In this paper, a collaborative deep learning framework based on adaptive feature fusion (CDLF-AFF) is proposed to improve the malignancy prediction performance. In the CDLF-AFF method, a multi-view input strategy is designed to drive the pre-trained models to capture the abundant spatial information of nodule CT images. Furthermore, a dual-branch architecture is developed to simultaneously learn the local structure features and long-range dependency relations. To improve the feature fusion efficiency, a feature aggregation module is constructed to adaptively fuse the feature maps produced by two different styles of learning branches. The CDLF-AFF method is evaluated on three different datasets. On the benchmark dataset LIDC-IDRI, it achieved an AUC of 97.25% (95% CI: 96.97%-97.54%), representing improvements of 1.47% and 0.86% over the corresponding unimodal models, ResNet-Model and ViT-Model, respectively. On the clinical dataset CQUCH-LND, it achieved an AUC of 94.09% (95% CI: 93.34%-94.84%), showing improvements of 3.41% and 1.39% over the corresponding ResNet-Model and ViT-Model, respectively. 
On the competition dataset LUNGx, it achieved an AUC of 79.13% (95% CI: 78.07%-80.19%), surpassing the best-performing algorithm listed on the competition leaderboard by 11.13%. These results demonstrate that the CDLF-AFF can effectively predict the malignancy of lung nodules.</p>\u0000 </div>","PeriodicalId":8041,"journal":{"name":"Applied Intelligence","volume":"56 2","pages":""},"PeriodicalIF":3.5,"publicationDate":"2026-01-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"146027457","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":2,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Scheduling all-scale multi-point manufacturing problems with a single neural model
IF 3.5 | Zone 2, Computer Science | Q2 Computer Science, Artificial Intelligence | Pub Date: 2026-01-24 | DOI: 10.1007/s10489-026-07091-7
Jie Liu, Hwa Jen Yap, Anis Salwa Mohd Khairuddin

This research presents a novel energy-efficient task sequencing method for manufacturing operations involving multiple processing points, such as precision drilling and contact welding. The problem is formulated as a multi-dimensional weighted variant of the Traveling Salesman Problem (TSP), and solved using a Multi-gate Mixture of Experts (MMOE) neural architecture. Unlike previous approaches that require separate models for each TSP size, our method employs a single neural network to handle TSPs of all sizes, significantly improving scalability and reducing training overhead. With an uncertainty-based loss weighting strategy, the model effectively balances multiple learning objectives. Experiments show that MMOE-9 achieves performance comparable to state-of-the-art methods with only one-third of the parameters of NAR4TSP, and its training time is similar to that of a single TSP100 model. Further, we extend the model to cover 91 TSP sizes (from 10 to 100) within the same unified framework, demonstrating strong generalization across scales.
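
The abstract mentions an uncertainty-based loss weighting strategy but gives no formula. A minimal sketch of one common formulation is shown below, in which each task i carries a learnable log-variance s_i and the combined loss is the sum of exp(-s_i) * L_i + s_i. Whether the paper uses exactly this form is an assumption, and the loss values are placeholders rather than results.

```python
import math


def uncertainty_weighted_loss(task_losses, log_vars):
    """Combine per-task losses as sum_i exp(-s_i) * L_i + s_i.

    s_i = log(sigma_i^2) is a learnable scalar per task. A large s_i
    down-weights a noisy task, and the additive s_i term keeps the
    optimizer from driving every weight to zero.
    """
    if len(task_losses) != len(log_vars):
        raise ValueError("need exactly one log-variance per task")
    return sum(math.exp(-s) * loss + s for loss, s in zip(task_losses, log_vars))


# One task per TSP size, 10 through 100 inclusive, gives the 91 sizes
# mentioned in the abstract.
tsp_sizes = list(range(10, 101))
log_vars = [0.0] * len(tsp_sizes)    # s_i = 0: every task starts with weight 1
toy_losses = [1.0] * len(tsp_sizes)  # placeholder losses, not real results
total = uncertainty_weighted_loss(toy_losses, log_vars)
```

In a training loop the `log_vars` would be trainable parameters updated together with the network weights, so the balance between problem sizes is learned rather than hand-tuned.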

Open-access PDF: https://link.springer.com/content/pdf/10.1007/s10489-026-07091-7.pdf
Citations: 0
Multi-level supervised and fine-grained feature enhancement for person search
IF 3.5 | Zone 2, Computer Science | Q2 Computer Science, Artificial Intelligence | Pub Date: 2026-01-24 | DOI: 10.1007/s10489-025-06996-z
Xuyang Zhang, Sijie Yang, Hongkun Liu, Heyan Jin, Guangqiang Yin, Ye Li

Person search addresses person detection and person re-identification (re-id) jointly, integrating the two tasks into a unified objective, and is widely used in surveillance and security. In surveillance scenarios the task faces severe challenges, such as scale variation and occlusion caused by cameras. Existing approaches often overlook the discrepancies between multi-scale features and fuse them directly. Most methods that address occlusion rely on feature completion and do not fully exploit the fine-grained information in the original images. This paper proposes Multi-level Supervised and Fine-grained Feature Enhancement for Person Search (MFPS) to mitigate these issues. MFPS employs cascaded encoders and decoders to extract person detection features from the backbone network. To generate re-id features that are robust to scale variation, MFPS introduces a Multi-Level Supervision (MLS) method, which aggregates features of different scales and levels and enriches the semantic information of person features. To address re-id features lost to occlusion, the paper proposes a deformable fine-grained attention module that extracts fine-grained re-id features with accurate semantic information through sampling-point offset operations. Finally, the fine-grained and multi-scale features are fused, and the resulting re-id features significantly improve recognition accuracy for person search in surveillance scenarios. Experiments on the PRW dataset show that MFPS improves mAP by 0.8 and top-1 by 1.8 over the state-of-the-art method, indicating its advantage in complex environments. The source code is available at https://github.com/FengHua0208/MFPS.
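
The "sampling-point offset operations" of a deformable attention module can be pictured with the sketch below. It assumes the general deformable-attention idea: features are read at a reference point plus a few offsets using bilinear interpolation, then combined with attention weights. It is not the MFPS implementation; in a real module the offsets and weights are predicted by the network, whereas here they are passed in as plain arguments, and all names are hypothetical.

```python
import math


def bilinear_sample(feature_map, y, x):
    """Bilinearly interpolate a 2D grid (list of rows) at fractional (y, x).

    Coordinates outside the grid are clamped to the border.
    """
    h, w = len(feature_map), len(feature_map[0])
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(math.floor(y)), int(math.floor(x))
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = (1 - dx) * feature_map[y0][x0] + dx * feature_map[y0][x1]
    bottom = (1 - dx) * feature_map[y1][x0] + dx * feature_map[y1][x1]
    return (1 - dy) * top + dy * bottom


def deformable_aggregate(feature_map, ref_point, offsets, attn_weights):
    """Weighted sum of features sampled at ref_point + offset.

    offsets is a list of (dy, dx) pairs and attn_weights a list of
    scalars of the same length; both are supplied by the caller here.
    """
    ry, rx = ref_point
    return sum(
        w * bilinear_sample(feature_map, ry + oy, rx + ox)
        for (oy, ox), w in zip(offsets, attn_weights)
    )


# Toy 2x2 single-channel feature map.
fmap = [[0.0, 1.0], [2.0, 3.0]]
center_value = bilinear_sample(fmap, 0.5, 0.5)
pooled = deformable_aggregate(fmap, (0.0, 0.0), [(0.0, 0.0), (1.0, 1.0)], [0.25, 0.75])
```

Because the sampling locations are not tied to a fixed grid, the module can in principle move its sampling points toward the visible parts of a partially occluded person, which is the motivation the abstract gives for the design.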

Citations: 0
Conditional guided diffusion model in latent space for social recommendation
IF 3.5 | Zone 2, Computer Science | Q2 Computer Science, Artificial Intelligence | Pub Date: 2026-01-22 | DOI: 10.1007/s10489-026-07094-4
Yijun Hu, Rui Tang, Xian Mo

Social recommendation uses social connections as auxiliary information to mine user preferences more deeply, thereby improving recommendation performance. Existing methods often employ graph neural networks to encode social graphs. However, average aggregation may lead to node distortion, and diffusion models may reconstruct embeddings of multi-faceted interests and attributes that are misaligned with task-relevant directions. To address these limitations, this paper introduces a Conditional Guided Diffusion architecture in Latent Space for social recommendation (CGDLS). Specifically, CGDLS first leverages singular value decomposition (SVD) to encode the social connection graph and the graph of co-interacted items among users into a low-dimensional latent space. To alleviate noise distortion, the SVD encoding retains the key features of the social connection graph. During the reverse process, CGDLS incorporates the embedding of co-interacted items among users as conditional guidance, steering the reverse process to reconstruct a highly task-relevant social connection embedding. Extensive experiments on three social datasets demonstrate that CGDLS and its components outperform various state-of-the-art methods.
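
The SVD encoding step can be illustrated with a small truncated-SVD sketch. It uses power iteration with deflation in plain Python so that it stays self-contained; practical code would more likely call `numpy.linalg.svd` or `scipy.sparse.linalg.svds`. Scaling each coordinate by the square root of the singular value is one common convention and an assumption here, as are all function names. The diffusion and conditional guidance stages of CGDLS are not covered.

```python
import math


def matvec(mat, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * v for a, v in zip(row, vec)) for row in mat]


def top_singular_triplet(mat, iters=200):
    """Leading singular value and vectors via power iteration on A^T A."""
    at = [list(col) for col in zip(*mat)]
    v = [1.0 / (i + 1) for i in range(len(at))]  # deterministic non-uniform start
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iters):
        w = matvec(at, matvec(mat, v))
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    av = matvec(mat, v)
    sigma = math.sqrt(sum(x * x for x in av))
    u = [x / sigma for x in av] if sigma > 0.0 else av
    return sigma, u, v


def truncated_svd_embed(adj, k, iters=200):
    """Return k-dimensional user embeddings u_i * sqrt(sigma) and the top-k sigmas."""
    residual = [[float(x) for x in row] for row in adj]
    embeddings = [[] for _ in adj]
    sigmas = []
    for _ in range(k):
        sigma, u, v = top_singular_triplet(residual, iters)
        sigmas.append(sigma)
        for i, ui in enumerate(u):
            embeddings[i].append(ui * math.sqrt(sigma))
        # Deflate: remove the component just found before the next pass.
        residual = [
            [residual[i][j] - sigma * u[i] * v[j] for j in range(len(v))]
            for i in range(len(u))
        ]
    return embeddings, sigmas


# Toy social graph: three users who are all connected to one another.
social_adj = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
user_embeddings, singular_values = truncated_svd_embed(social_adj, k=2)
```

Keeping only the largest singular components discards low-energy directions of the adjacency matrix, which is the usual argument for why a low-rank encoding suppresses noisy edges.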

Citations: 0