Pub Date: 2025-12-29 | DOI: 10.1007/s40747-025-02166-3
Li Tan, Haixia Zhao
Deep reinforcement learning shows broad promise in multi-unmanned aerial vehicle (UAV) collaborative search and rescue tasks. However, its performance is vulnerable to high-dimensional collaborative decision-making spaces and limited computing resources. This paper proposes a deep deterministic policy gradient method based on linear attention. By introducing a linear attention mechanism based on random feature mapping, the method effectively models the interactions among UAVs while significantly reducing the computational and storage overheads that grow with the number of UAVs. Furthermore, combining smooth experience replay with an adaptive importance sampling mechanism further improves training efficiency and policy stability. Simulation experiments on both post-disaster response search and dynamic containment tasks demonstrate that the proposed algorithm consistently outperforms existing methods. In small-scale scenarios it maintains nearly perfect success rates, while in medium- and large-scale settings it achieves up to 90.6% and 85.2% success rates in the post-disaster response search task and up to 90.1% and 80.2% in the containment task, corresponding to relative improvements of 15–21% over baselines. These results highlight both the robustness of the method in simple cases and its clear advantage under more challenging multi-UAV conditions.
Title: A multi-UAV rapid post-disaster search and rescue method based on deep reinforcement learning (Complex & Intelligent Systems)
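The abstract does not give the authors' exact formulation, so as a rough illustration of the underlying idea only, here is a minimal Performer-style sketch of linear attention with positive random feature maps. All names and constants here (`linear_attention`, `num_features`, the d^(1/4) query/key scaling) are illustrative assumptions, not the paper's method:

```python
import numpy as np

def random_feature_map(x, W):
    # Positive random features: phi(x) = exp(W x - ||x||^2 / 2) / sqrt(m),
    # an unbiased approximation of the softmax kernel exp(q . k).
    m = W.shape[0]
    proj = x @ W.T                                        # (n, m)
    norm = 0.5 * np.sum(x ** 2, axis=-1, keepdims=True)   # (n, 1)
    return np.exp(proj - norm) / np.sqrt(m)

def linear_attention(Q, K, V, num_features=64, seed=0):
    """Approximate softmax attention with O(n) cost in the number of agents n."""
    rng = np.random.default_rng(seed)
    d = Q.shape[-1]
    W = rng.standard_normal((num_features, d))
    # Scale queries/keys by d^(1/4) so q . k / sqrt(d) matches softmax attention.
    q = random_feature_map(Q / d ** 0.25, W)              # (n, m)
    k = random_feature_map(K / d ** 0.25, W)              # (n, m)
    kv = k.T @ V                                          # (m, d_v): summarize keys/values once
    z = q @ k.sum(axis=0)                                 # (n,): per-query normalizer
    return (q @ kv) / z[:, None]                          # (n, d_v)
```

Because the keys and values are aggregated once into an m-by-d_v summary, the cost grows linearly rather than quadratically with the number of interacting agents, which is the scalability property the abstract attributes to its linear attention mechanism.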
Pub Date: 2025-12-28 | DOI: 10.1007/s40747-025-02208-w
Juan Lei, Jiangpeng Tian, Xiong You, Zhiwei He
Scene graph generation (SGG), which involves jointly detecting entities and inferring their relationships from images, plays a critical role in high-level visual scene understanding and reasoning tasks. Most existing SGG methods primarily focus on learning dependencies within individual triplets and follow a unidirectional reasoning paradigm, thereby overlooking the reverse constraints from predicates to entities. Moreover, they generally fail to capture inter-relationship dependencies, resulting in isolated predictions that ignore the global contextual information formed by shared entities or semantic associations. To address these limitations, this paper proposes I²D-SGG, a novel framework that jointly models both Intra- and Inter-relationship Dependencies to improve the accuracy and efficacy of SGG. First, we introduce a triple-decoder architecture with dedicated modules for decoding the subject, object, and predicate, connected through a prior-enhanced sparse relation matrix. Second, decoupled conditional queries comprising position queries and content queries are strengthened via cross-layer fusion and bidirectional attention, facilitating deeper geometric and semantic interaction within each triplet. Third, a global correlation graph-based reasoning module is employed to model inter-relationship dependencies across triplets. This module utilizes Graph Convolutional Networks (GCNs) to enable cross-triplet message passing and dynamic feature aggregation, thereby supporting global context-aware relational reasoning beyond isolated triplets. Experiments on the VG-150 dataset demonstrate that I²D-SGG achieves a mean Recall@100 (mR@100) of 35.41%, outperforming the state-of-the-art one-stage method by 1.57%. Qualitative analyses further confirm its superior capability in fine-grained scene understanding. Ablation studies validate the effectiveness and generalizability of the proposed dual dependency modeling mechanism.
I²D-SGG enhances the model's capacity to comprehend both intra- and inter-relationship dependencies, overcoming the limitations of unidirectional propagation, entangled query design, and isolated triplet reasoning in conventional approaches, thereby offering a new perspective for fine-grained relational modeling in complex visual scenes.
Title: I2D-SGG: scene graph generation via joint modeling of intra- and inter-relationship dependencies
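The abstract's cross-triplet message passing builds on standard GCN propagation. As a generic illustration of that building block (not the paper's actual module; the adjacency here would come from triplets sharing entities, and `gcn_layer` is an assumed name), a single symmetric-normalized GCN layer looks like:

```python
import numpy as np

def gcn_layer(A, X, W):
    """One GCN propagation step: H = ReLU(D^{-1/2} (A + I) D^{-1/2} X W).

    A: (n, n) symmetric adjacency over triplet nodes (1 if two triplets
       share an entity or semantic association, illustratively).
    X: (n, f) node features; W: (f, h) learnable weights.
    """
    A_hat = A + np.eye(A.shape[0])          # add self-loops so each node keeps its own features
    d = A_hat.sum(axis=1)                   # degrees (>= 1 thanks to self-loops)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # symmetric normalization
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)
```

Each application of such a layer lets a triplet's representation absorb context from neighboring triplets, which is the mechanism behind the "global context-aware relational reasoning" the abstract describes.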
Pub Date: 2025-12-26 | DOI: 10.1007/s40747-025-02190-3
Fangtao Qin, Bin Fang, Yi Wang
Pedestrian trajectory prediction from egocentric monocular video is hindered by camera motion, intermittent occlusions, and complex social interactions. We present NIM-STGCN, a unified framework whose core contribution is a differentiable view normalization module (GVN) that couples an enhanced differentiable PnP layer (ED-PnP) with an SE(3) warp to align past observations into a single virtual static camera frame. Because GVN is trained end-to-end, forecasting losses back-propagate to pose estimation, yielding geometrically cleaner inputs. On the normalized histories, a lightweight Gated Convolutional Imputation Module (GCIM) recovers missing bounding-box measurements while preserving observed entries, and an efficient spatio-temporal GCN encodes agent dynamics and interactions (optionally augmented by a physics-guided kinematics–interaction prior, PKIM). A Gaussian-mixture predictor produces multi-modal futures and is optimized with a sequence-level negative log-likelihood together with a time-weighted position loss. Extensive experiments on the JAAD and PIE benchmarks show that NIM-STGCN reduces Average Displacement Error (ADE) and Final Displacement Error (FDE) by 12–18% compared to state-of-the-art methods. Code is available at https://github.com/fantot/NIM-STGCN.
Title: NIM-STGCN: Differentiable motion decomposition for egocentric pedestrian trajectory prediction
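An SE(3) warp, as used in the abstract's view normalization, is a rigid rotation-plus-translation applied to 3-D observations. As a minimal self-contained sketch (the function names and the Rodrigues parameterization are illustrative, not taken from the paper's code), the warp and the rotation exponential map can be written as:

```python
import numpy as np

def so3_exp(w):
    """Rodrigues' formula: rotation matrix from an axis-angle vector w (3,)."""
    theta = np.linalg.norm(w)
    if theta < 1e-12:
        return np.eye(3)                 # near-zero rotation: identity
    K = np.array([[0.0, -w[2], w[1]],
                  [w[2], 0.0, -w[0]],
                  [-w[1], w[0], 0.0]]) / theta   # skew matrix of the unit axis
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def se3_warp(points, R, t):
    """Apply the rigid transform p' = R p + t to an (n, 3) array of points."""
    return points @ R.T + t
```

Warping every past observation by the inverse of the estimated camera pose at its timestamp expresses all histories in one virtual static frame; since both `so3_exp` and `se3_warp` are smooth in their inputs, gradients from the forecasting loss can flow back into pose estimation, matching the end-to-end training the abstract describes.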
Pub Date: 2025-12-26 | DOI: 10.1007/s40747-025-02192-1
Chuanhe Shen, Wenjing Pan, Xu Shen
Title: Credit risk prediction and heterogeneity analysis for SMEs based on large language models and multimodal data fusion
Pub Date: 2025-12-26 | DOI: 10.1007/s40747-025-02204-0
Jiyan Salim Mahmud, Zakarya Farou, Imre Lendák
Evaluating unsupervised anomaly detection presents significant challenges due to the absence of ground truth labels and the complex nature of anomaly distributions. In this study, we introduce two novel intrinsic evaluation metrics: the Anomaly Separation Index (ASI) and the Anomaly Separation and Overlap Index (ASOI), designed to overcome the limitations of traditional metrics, which cannot assess model performance without labels. ASI quantifies the degree of separation between detected anomalies and normal distributions, while ASOI incorporates both separation and distributional overlap between them, providing an innovative evaluation approach for anomaly detection models, enabling performance assessment even in the absence of ground truth labels. Extensive experiments through precision degradation tests and unsupervised anomaly detection algorithms were conducted on multiple datasets. The results indicate that the metrics consistently correlate with traditional metrics, such as the F1 score, in various benchmark datasets characterized by complex feature interactions and varying levels of anomaly contamination. ASOI showed a higher correlation with the F1 score compared to ASI and several other classical intrinsic metrics. Furthermore, the findings underscore the utility of ASOI as an internal validation measure for model optimization in unsupervised anomaly tasks. The proposed metrics are computationally efficient, scalable, and adaptable to a variety of anomaly detection scenarios, making them practical for real-world applications across industries such as cybersecurity, fraud detection, and predictive maintenance.
Title: ASOI: anomaly separation and overlap index, an internal evaluation metric for unsupervised anomaly detection
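The abstract does not state the formulas for ASI or ASOI, so the following is only an illustrative pair of label-free statistics in the same spirit: a separation score between the anomaly-score distributions of points flagged normal vs. anomalous, and a histogram-overlap coefficient. Both function names, the pooled-standard-deviation normalization, and the bin count are assumptions for the sketch, not the paper's definitions:

```python
import numpy as np

def separation_index(scores_normal, scores_anom):
    """Illustrative separation: gap between mean scores in pooled-std units."""
    mu_n, mu_a = scores_normal.mean(), scores_anom.mean()
    pooled = np.sqrt(0.5 * (scores_normal.var() + scores_anom.var()) + 1e-12)
    return abs(mu_a - mu_n) / pooled

def overlap_coefficient(scores_normal, scores_anom, bins=50):
    """Histogram overlap of the two score distributions (0 = disjoint, 1 = identical)."""
    lo = min(scores_normal.min(), scores_anom.min())
    hi = max(scores_normal.max(), scores_anom.max())
    h_n, _ = np.histogram(scores_normal, bins=bins, range=(lo, hi))
    h_a, _ = np.histogram(scores_anom, bins=bins, range=(lo, hi))
    p_n = h_n / h_n.sum()                     # normalize counts to probabilities
    p_a = h_a / h_a.sum()
    return np.minimum(p_n, p_a).sum()         # shared probability mass
```

A detector that cleanly isolates anomalies yields a high separation score and a low overlap, and neither quantity requires ground-truth labels, which is the property that makes intrinsic metrics of this kind usable for model selection in unsupervised settings.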