Expert Systems with Applications: Latest Articles

Investigating spatial-temporal bias of LLMs
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 131542 Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131542
Zijun Li
Large Language Models (LLMs) are emerging as powerful knowledge and expert systems with notable capabilities in understanding and reasoning about a wide range of tasks. However, their spatiotemporal cognition biases remain largely underexplored, even though these biases are highly consequential for using LLMs effectively in applications that involve understanding, explaining, and forecasting. This paper therefore investigates the presence and patterns of spatiotemporal bias in LLMs. Specifically, it first constructs two datasets from the perspectives of economic and social forecasting, each paired with the values predicted by four different LLMs for the same spatiotemporal scope. A novel autocorrelation measurement approach is then introduced, alongside a set of quantification methods, to jointly evaluate correlation in biases across both space and time. The results show notable variation in performance and bias across models and tasks: uncommon and more sensitive tasks exhibit worse performance, and certain LLMs produce regionally clustered errors while others exhibit near-random error distributions. Among the prompt modifications examined, incorporating temporal context significantly improves predictive accuracy, particularly for volatile or low-frequency events. Overall, these findings highlight the partial but inconsistent internalization of real-world spatiotemporal patterns in LLMs, and the proposed methods provide tools for quantifying and interpreting spatiotemporal bias, thereby offering guidance for designing fairer and more reliable LLM-based expert systems and applications.
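The abstract does not spell out its autocorrelation measure, so the sketch below uses global Moran's I, a standard spatial autocorrelation statistic, as a stand-in rather than the paper's method. It shows how regionally clustered errors can be told apart from scattered ones. The function name `morans_i`, the adjacency matrix `CHAIN`, and the error values are all hypothetical.

```python
def morans_i(errors, weights):
    """Global Moran's I for per-region prediction errors.

    errors  -- list of signed prediction errors, one per region
    weights -- square matrix; weights[i][j] > 0 when regions i and j are neighbours
    """
    n = len(errors)
    mean = sum(errors) / n
    dev = [e - mean for e in errors]
    s0 = sum(sum(row) for row in weights)  # total weight
    num = sum(weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n))
    den = sum(d * d for d in dev)
    return (n / s0) * (num / den)


# Hypothetical toy layout: four regions along a line, each adjacent to the next.
CHAIN = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

clustered = morans_i([1.0, 1.0, -1.0, -1.0], CHAIN)    # similar errors sit together
alternating = morans_i([1.0, -1.0, 1.0, -1.0], CHAIN)  # neighbours disagree
```

A positive value indicates that neighbouring regions share similar errors, and a negative value indicates that they alternate. One plausible way to cover the time dimension, again an assumption, is to reuse the same statistic with weights that link consecutive time steps.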
Citations: 0
On ground track design of unmanned fixed-wing drone aided relaying in windy environments
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 131539 Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131539
Xuan Zhu, Xiaodong Ji, Ansheng Yin
This article studies unmanned fixed-wing drone (UFD)-aided relaying in windy environments, where the UFD serves as a full-duplex amplify-and-forward relay that forwards a desired amount of data for two ground terminals. Based on aerodynamics and the wind triangle, the engine power the UFD requires to fly at a constant airspeed along a circular ground track in a three-dimensional uniform wind is analyzed, and a corresponding closed-form expression is derived. The analysis shows that the UFD’s engine power depends on its airspeed and bank angle as well as on the wind speed and the corresponding vertical angle. On this basis, an optimization problem for the UFD’s ground track design is investigated. Using the block coordinate descent technique, the initial problem is decomposed into two sub-problems, which are addressed by four algorithms (Algorithms 1–4). This leads to an iterative algorithm (Algorithm 5) that optimizes the UFD’s airspeed and adjusts its flight parameters (e.g., time, radius, and the angles of pitch, course, crab, heading, and bank) to follow the desired ground track. Computer simulation results verify that the proposed algorithm achieves the best energy-saving performance and generates a small bank angle with minimal variation during flight. This characteristic alleviates the demand for fast bank angle command following when adjusting the UFD’s flight parameters in windy environments.
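The wind triangle mentioned in the abstract relates airspeed, wind, and ground velocity. The sketch below is a simplified horizontal (2D) version, not the paper's three-dimensional closed-form analysis. It computes the crab angle needed to hold a course and the resulting ground speed. The function name, the angle convention, and the numbers are assumptions made for illustration.

```python
import math


def wind_triangle(airspeed, course, wind_speed, wind_to):
    """Return (crab_angle, ground_speed) for flight along a fixed course.

    All angles are in radians, measured counter-clockwise from the x-axis.
    wind_to is the direction the wind blows toward (an assumed convention).
    """
    # The cross-track wind component must be cancelled by turning the nose into the wind.
    cross = wind_speed * math.sin(wind_to - course)
    if abs(cross) > airspeed:
        raise ValueError("wind too strong to hold this course")
    crab = -math.asin(cross / airspeed)
    # Along-track components of air velocity and wind add up to the ground speed.
    ground_speed = airspeed * math.cos(crab) + wind_speed * math.cos(wind_to - course)
    return crab, ground_speed


# Hypothetical numbers: 20 m/s airspeed, course along +x, 5 m/s wind blowing toward +y.
crab, ground_speed = wind_triangle(20.0, 0.0, 5.0, math.pi / 2)
heading = 0.0 + crab  # heading = course + crab angle
```

On a circular ground track the course changes continuously, so the crab angle and ground speed computed this way vary around the circle even when the airspeed is constant.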
Citations: 0
Improved visibility for monocular camera-based perception in autonomous driving systems under rain with fog: An efficient vision transformer approach
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 130661 Pub Date: 2026-06-01 Epub Date: 2025-12-04 DOI: 10.1016/j.eswa.2025.130661
Yao-Jiun Huang, Kang Li
Adverse weather conditions degrade image quality and reduce the reliability of monocular camera-based perception systems in autonomous driving. Most existing deraining methods focus on removing rain streaks while overlooking fog effects, which further obscure scene details and hinder downstream perception tasks. This paper presents an efficient Vision Transformer (ViT)-based restoration framework that addresses both rain streaks and fog without compromising perception accuracy. The method incorporates a Depth-Guided Spatial Feature Transform (DG-SFT) block, which leverages depth information predicted by a lightweight CNN-based decoder. The DG-SFT is designed based on a mathematical rain model to effectively remove rain and distance-dependent haze. A semantic loss function is introduced to constrain the segmentation output discrepancy between original and restored images to within 4 %. Experiments on the RainCityscapes dataset and real-world rainy images demonstrate improvements in PSNR and SSIM over existing ViT- and CNN-based approaches, with an inference latency of 7.86 ms, supporting its deployment in latency-critical autonomous vehicle platforms.
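The abstract refers to distance-dependent haze but does not give the rain model itself. The sketch below therefore uses the widely used atmospheric scattering model, I = J·t + A·(1 − t) with transmission t = exp(−β·d), as an assumed stand-in. It shows why depth information helps restoration: once depth is known, the fog term can be inverted per pixel. Rain streaks are not modelled, and the function names and parameter values are hypothetical.

```python
import math


def add_fog(clear, depth, beta, airlight):
    """Apply I = J * t + A * (1 - t) with t = exp(-beta * d) to each pixel."""
    out = []
    for j, d in zip(clear, depth):
        t = math.exp(-beta * d)
        out.append(j * t + airlight * (1.0 - t))
    return out


def remove_fog(hazy, depth, beta, airlight, t_min=0.05):
    """Invert the model; t is clamped so very distant pixels do not blow up."""
    out = []
    for i, d in zip(hazy, depth):
        t = max(math.exp(-beta * d), t_min)
        out.append((i - airlight) / t + airlight)
    return out


# Hypothetical one-row "image": intensities in [0, 1] and depths in metres.
clear = [0.2, 0.5, 0.9]
depth = [5.0, 20.0, 60.0]
hazy = add_fog(clear, depth, beta=0.02, airlight=0.8)
restored = remove_fog(hazy, depth, beta=0.02, airlight=0.8)
```

In a learned pipeline the depth would come from a predicted depth map rather than from ground truth, so the inversion is only as good as the depth estimate.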
Citations: 0
Three decades of differential evolution: a bibliometric analysis (1995-2025)
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 131451 Pub Date: 2026-06-01 Epub Date: 2026-02-12 DOI: 10.1016/j.eswa.2026.131451
Pooja Verma, Reshu Chaudhary, Hina Gupta, Rohit Salgotra
Since its introduction in 1995, Differential Evolution (DE) has emerged as a foundational algorithm in the domain of computational intelligence and metaheuristic optimization. This paper presents a comprehensive bibliometric and thematic review of DE research over three decades (1995–2025), based on more than 9,900 publications retrieved from Scopus, Web of Science (WoS), and IEEE Xplore. Using advanced visualization tools, including Sankey diagrams (to trace institution-country-keyword flows), choropleth maps (to reveal global research distribution), citation and co-authorship networks, and heatmaps (to assess cross-domain influence), the review uncovers major contributors, thematic concentrations, and emerging frontiers. The analysis spans publication trajectories, prolific authors and institutions, core research directions, and domain-specific applications in 12 prominent fields such as engineering optimization, artificial intelligence, bioinformatics, energy systems, and control systems. The study emphasizes the evolution of DE, its increasing interdisciplinary integration, and the growing dominance of Asia, particularly China, India, and Iran, as a key center of DE innovation. Through a synthesis of keyword co-occurrence, collaborative clustering, and citation dynamics, this review maps the landscape of DE research and outlines pressing challenges and promising avenues for future inquiry.
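Keyword co-occurrence, one of the analyses named in the abstract, reduces to counting how often two keywords appear on the same publication. The sketch below illustrates that counting step only. The function name and the toy records are hypothetical and are not drawn from the reviewed corpus.

```python
from collections import Counter
from itertools import combinations


def keyword_cooccurrence(records):
    """Count how often each unordered keyword pair appears in the same record."""
    pairs = Counter()
    for keywords in records:
        # Normalise case and drop duplicates so a pair is counted once per paper.
        unique = sorted({k.strip().lower() for k in keywords})
        pairs.update(combinations(unique, 2))
    return pairs


# Hypothetical keyword lists, not taken from the reviewed corpus.
toy_records = [
    ["Differential Evolution", "Optimization", "Mutation"],
    ["differential evolution", "optimization"],
    ["Optimization", "Energy Systems"],
]
counts = keyword_cooccurrence(toy_records)
```

The resulting pair counts can serve as edge weights in a co-occurrence network, which is the usual input for the clustering and mapping steps of a bibliometric study.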
Citations: 0
Surveillance camera authentication system based on PRNU
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 131548 Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131548
Jian Li, Lisheng Yan, Bin Ma, Xiaolong Li, Zhenxing Qian
In this paper, we propose a camera authentication scheme to enhance access security for front-end devices in surveillance networks. The scheme leverages the Photo-Response Non-Uniformity (PRNU) pattern noise of camera sensors and combines it with traditional encryption techniques to strengthen system security. During registration, the camera captures images to extract the PRNU and generates a compressed device fingerprint, which is stored on the server as a root key. For authentication, the server sends a challenge sequence randomly generated from the root key to the front-end device, which captures a new image to generate an approximation of the root key for its response. To prevent attackers from extracting device fingerprints from public images, the scheme incorporates anonymization through a proposed DWT-based PRNU anonymization algorithm, which improves PSNR by 8.08 dB and SSIM by 0.08 on average compared to previous methods. Security analysis and experimental results show high authentication accuracy and security: the scheme effectively resists replay and man-in-the-middle attacks, providing a robust solution for surveillance network devices.
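The registration and challenge-response flow described above can be sketched in a few lines. This is only an illustration under strong assumptions: the fingerprint is treated as a bit string, the challenge is a random set of bit positions, and the server accepts when the mismatch rate stays below a threshold. None of these details, nor the function names, come from the paper, and the encryption layer is omitted.

```python
import random


def make_challenge(fingerprint_len, size, rng):
    """Server side: pick random bit positions of the stored fingerprint."""
    return rng.sample(range(fingerprint_len), size)


def respond(device_bits, challenge):
    """Device side: answer with its freshly estimated bits at those positions."""
    return [device_bits[i] for i in challenge]


def verify(root_key, challenge, response, max_mismatch=0.2):
    """Accept when the fraction of differing bits stays under the threshold."""
    wrong = sum(root_key[i] != r for i, r in zip(challenge, response))
    return wrong / len(challenge) <= max_mismatch


rng = random.Random(0)
root_key = [rng.randint(0, 1) for _ in range(1024)]      # enrolled fingerprint
genuine = [b ^ (rng.random() < 0.05) for b in root_key]  # same sensor, about 5% noisy bits
impostor = [rng.randint(0, 1) for _ in range(1024)]      # unrelated sensor

challenge = make_challenge(len(root_key), 256, rng)
genuine_ok = verify(root_key, challenge, respond(genuine, challenge))
impostor_ok = verify(root_key, challenge, respond(impostor, challenge))
```

Because a fresh challenge is drawn for every session, a recorded response is of little use for a later session, which is the intuition behind resistance to replay.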
Citations: 0
Object search strategy for service robots with knowledge-based viewpoint selection and hierarchical action decisions
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 131538 Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131538
Yuhao Wang, Guohui Tian
Service robots are frequently tasked with searching for target objects relevant to specific operations. However, the dynamic nature of object locations poses significant challenges for precise localization and tracking. To address this, we propose a unified framework for efficient object search and navigation that integrates viewpoint selection, dynamic map construction, and adaptive hierarchical planning. Our method constructs a visual-topological map (VTMap) that fuses prior knowledge, object-room and object-object co-occurrence statistics, and spatial probability distributions modeled via Gaussian Mixture Models (GMM). The robot continuously generates and updates a room-level probability map, enabling systematic selection of optimal viewpoints. This process maximizes the likelihood of target detection while minimizing travel distance through a utility-based strategy. Multimodal sensory observations are represented as graph nodes, with navigation actions encoded as edges, supporting accurate localization and action planning. To complement global planning, we introduce a hierarchical search strategy that unifies long-term exploration objectives with adaptive local exploration informed by imitation learning. The agent dynamically adjusts its search direction by integrating prior experiences with real-time sensory cues. Local exploration is formulated as a partially observable Markov decision process (POMDP), guided by spatial memory and semantic targets. Furthermore, action cost modeling and an auxiliary inflection point prediction task refine the local exploration process, enabling the system to flexibly transition between global and local search strategies. Collectively, these components facilitate robust and efficient object-oriented navigation in complex and dynamic environments.
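The utility-based strategy in the abstract trades detection likelihood against travel distance. The sketch below shows one simple way such a trade-off could be scored. The linear utility, the weight, and the candidate viewpoints are assumptions for illustration, not the paper's formulation.

```python
import math


def select_viewpoint(robot_xy, candidates, distance_weight=0.1):
    """Pick the candidate with the best probability-versus-distance trade-off.

    candidates -- list of (name, (x, y), detection_probability)
    The linear utility p - distance_weight * d is an assumed form.
    """
    def utility(candidate):
        _, xy, prob = candidate
        return prob - distance_weight * math.dist(robot_xy, xy)

    return max(candidates, key=utility)


# Hypothetical room-level candidates with made-up probabilities.
candidates = [
    ("kitchen_counter", (8.0, 0.0), 0.9),
    ("living_room_table", (2.0, 0.0), 0.6),
    ("bedroom_shelf", (5.0, 5.0), 0.3),
]
best = select_viewpoint((0.0, 0.0), candidates)
greedy = select_viewpoint((0.0, 0.0), candidates, distance_weight=0.0)
```

With the distance penalty the nearby table wins even though the counter is more likely to hold the object. Setting the weight to zero recovers a purely probability-driven choice.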
Citations: 0
Inverse identification of unsteady disturbance sources in mine ventilation systems
IF 7.5 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Volume: 313 Article: 131603 Pub Date: 2026-06-01 Epub Date: 2026-02-09 DOI: 10.1016/j.eswa.2026.131603
Yonghong Liu, Ziming Wang, Yupeng Xie, De Huang
Maintaining stable airflow within mine ventilation systems is essential for ensuring safe and continuous underground operations. However, unsteady airflow disturbances induced by the intermittent movement of mine cars or hoisting cages generate transient, low-amplitude perturbations that are dynamically coupled across the ventilation network. These disturbances are superimposed on the steady mechanical ventilation field, producing unsteady signals that conventional steady-state models cannot effectively decouple or localize, which leads to discrepancies between monitoring data and actual ventilation conditions. To address this challenge, a mathematical model was developed to characterize unsteady airflow disturbances in underground tunnels, and the dynamic effects of mine car movement on ventilation airflow were systematically analyzed. A network-based algorithm was further designed to solve the unsteady disturbance field, and a simulation platform was constructed to reproduce dynamic airflow behavior, showing minimal deviation from theoretical predictions. Building on this foundation, a hybrid Maximum Information Coefficient-Long Short-Term Memory (MIC-LSTM) neural network model was proposed for the inverse identification of unsteady disturbance sources. The Maximum Information Coefficient (MIC) was used to extract informative features from airflow velocity time-series data, while the LSTM network identified disturbance sources from temporal dependencies. Experimental results demonstrate that when the disturbance threshold is 0.1 and the monitoring coverage ratio is 0.3, all evaluation metrics reach approximately 90%. Validation in an operational mine ventilation system further confirms the model’s accuracy, robustness, and generalizability. This study establishes an artificial intelligence-driven framework for intelligent monitoring and control of unsteady disturbances, providing actionable insights toward safer and more efficient mine ventilation management.
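To make the feature-screening idea concrete, the sketch below computes a simplified grid-based dependence score in the spirit of MIC: the largest normalised mutual information over a handful of equal-width grids. It is not the full MIC algorithm, which searches over many more grid placements, and it is not the paper's implementation. The function names and grid sizes are assumptions.

```python
import math
from collections import Counter


def _bin(values, bins):
    """Equal-width bin index for each value (all zeros if values are constant)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    return [min(int((v - lo) / (hi - lo) * bins), bins - 1) for v in values]


def _mutual_information(xb, yb):
    """Mutual information (in nats) between two sequences of bin indices."""
    n = len(xb)
    pxy, px, py = Counter(zip(xb, yb)), Counter(xb), Counter(yb)
    return sum(
        (c / n) * math.log(c * n / (px[a] * py[b])) for (a, b), c in pxy.items()
    )


def grid_dependence_score(x, y, grid_sizes=(2, 3, 4, 5)):
    """Largest normalised mutual information over a few equal-width grids."""
    best = 0.0
    for bx in grid_sizes:
        for by in grid_sizes:
            mi = _mutual_information(_bin(x, bx), _bin(y, by))
            best = max(best, mi / math.log(min(bx, by)))
    return best


# Hypothetical signals: one sensor perfectly tracks the source, one is flat.
source = [i / 100 for i in range(100)]
tracking = grid_dependence_score(source, source)
flat = grid_dependence_score(source, [3.0] * 100)
```

Sensors whose velocity series score high against a candidate disturbance signal would be kept as inputs to the sequence model, while those scoring near zero would be dropped.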
Citations: 0
LDM-DTI: A multimodal framework integrating pretrained language models and geometric graph networks for interpretable drug-target interaction prediction
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-06-01 Epub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131485
Yuanyuan Ji , Zhuo Chen , Zhihan Liu , Xiaofeng Man , Junwei Du , Bin Yu
Accurate prediction of drug-target interactions (DTIs) is essential for speeding up the discovery of new therapeutics. Although significant progress has been made with deep learning-based approaches, considerable challenges remain in learning informative molecular representations and modeling the intricate nature of drug-target associations. To overcome these limitations, an end-to-end predictive architecture, termed LDM-DTI, is proposed. In this framework, drug and protein sequences are encoded via pretrained large language models. Specifically, ChemBERTa is utilized to derive high-dimensional semantic and structural features from SMILES strings, while ProtBERT is employed to extract contextual representations from amino acid sequences. To further incorporate spatial molecular information, a three-layer Graph Convolutional Network (GCN) and an Equivariant Graph Neural Network (EGNN) are integrated to capture both 2D topological and 3D geometric characteristics of drug molecules. Protein-level features are refined through dynamic convolutional operations and multi-head self-attention mechanisms. These representations are then fused via a Dynamic Interactive Attention Module (DIAM) to model cross-modal dependencies between drugs and targets. The proposed framework demonstrates superior predictive performance and generalizability across four public benchmark datasets, consistently surpassing ten state-of-the-art baselines. Ablation experiments are conducted to quantify the contributions of individual components, and protein-level attention maps are visualized to enhance interpretability. Overall, LDM-DTI offers a robust and interpretable solution for DTI prediction, with strong potential for accelerating structure-informed drug discovery.
Citations: 0
RVFormer: Keypoint-based fusion of 4D radar and vision for 3D object detection in autonomous driving
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-25 Epub Date: 2026-02-03 DOI: 10.1016/j.eswa.2026.131497
Xin Bi , Caien Weng , Panpan Tong , Arno Eichberger , Lu Xiong
Multi-modal fusion is crucial in autonomous driving perception, enhancing reliability, completeness, and accuracy, which extends the performance limits of perception systems. Specifically, large-scale perception through 4D radar and vision fusion has become a key research focus aimed at improving driving safety, enhancing complex scene understanding, and supporting fine-grained local planning and control. However, existing 3D object detection methods typically rely on fixed-voxel representations to maintain detection accuracy. As the perception range increases, these methods incur considerable computational overhead. While transformer-based query methods show strong potential in capturing dependencies over large receptive fields in image-domain tasks, their application in radar-vision fusion is limited due to radar point cloud sparsity and cross-modal alignment challenges. To address these limitations, we propose RVFormer, a dual-branch feature-level fusion network that uses a sparse keypoint-based query strategy to integrate features from both modalities, thereby mitigating the impact of large-scale scenes on inference speed. Additionally, we introduce clustered voxel query initialization (CVQI) to accelerate convergence and enhance object localization. By incorporating the radar voxel painter (RVP), radar-image cross-attention (RICA), and gated adaptive fusion (GAF) modules, our framework enables deep and adaptive fusion of radar and visual features, effectively mitigating issues caused by point cloud sparsity and modality inconsistency. Compared to existing radar-vision fusion models, RVFormer demonstrates competitive performance, with an inference speed of approximately 15.2 frames per second. It delivers accuracy comparable to CNN-based approaches, while outperforming baseline methods by at least 4.72% in 3D mean average precision and 5.82% in bird’s-eye view mean average precision.
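The abstract names a gated adaptive fusion (GAF) module but does not specify it. As a rough illustration of what gated fusion of two modalities generally means, the sketch below blends a radar feature vector and an image feature vector with a per-channel sigmoid gate. The function `gated_fusion`, its weight layout, and the sample vectors are all assumptions made for illustration, not the paper's implementation.

```python
import math


def sigmoid(x):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(radar_feat, image_feat, weights, bias):
    """Blend two equal-length feature vectors with a per-channel gate.

    Hypothetical form (not taken from the paper):
        gate_i  = sigmoid(w_r[i] * radar_i + w_v[i] * image_i + bias[i])
        fused_i = gate_i * radar_i + (1 - gate_i) * image_i
    """
    w_r, w_v = weights
    fused = []
    for r, v, wr, wv, b in zip(radar_feat, image_feat, w_r, w_v, bias):
        gate = sigmoid(wr * r + wv * v + b)
        fused.append(gate * r + (1.0 - gate) * v)
    return fused


# Made-up three-channel features for each modality.
radar = [0.0, 1.0, 0.5]
image = [1.0, 0.0, 0.5]
zero_w = ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0])

# Zero weights and zero bias give gate = 0.5: a plain average of the two inputs.
fused_equal = gated_fusion(radar, image, zero_w, [0.0, 0.0, 0.0])

# A large positive bias pushes the gate toward 1, so the radar branch dominates.
fused_radar = gated_fusion(radar, image, zero_w, [50.0, 50.0, 50.0])
```

In a trained network the weights and bias would be learned, so the gate can favor whichever modality is more reliable per channel, for example the image branch where the radar point cloud is sparse.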
Citations: 0
Amplitude-phase decomposition-based latent diffusion model for underwater image enhancement
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-05-25 Epub Date: 2026-01-29 DOI: 10.1016/j.eswa.2026.131296
Hansen Zhang , Miao Yang , Can Pan , Leyuan Wang , Jiaju Tao
Underwater images commonly suffer from blurring, low contrast, and color distortion due to light scattering effects and wavelength-dependent absorption. From a frequency-domain perspective, these degradations significantly impair the spectral representation: they not only attenuate high-frequency amplitudes (causing detail loss) but also weaken specific color channel energies (inducing color deviation) while reducing overall spectral energy (leading to contrast deterioration). To address these challenges, this paper proposes an innovative two-stage underwater image enhancement framework, termed APD-LDM. In the first stage, we design an Amplitude-Phase Decomposition Network (APDNet) that performs end-to-end learning on paired underwater image data to preliminarily recover amplitude information degraded by absorption and scattering effects. The second stage employs a conditional diffusion model for refined reconstruction, where latent representations of degraded images serve as conditional constraints to guide the diffusion process toward more realistic underwater image features. Additionally, we introduce a self-constrained consistency loss function to further optimize network training. Extensive experiments demonstrate that the proposed method achieves superior effectiveness and robustness in both subjective visual quality and objective metrics. The code is available at https://github.com/JOU-UIP/APD-LDM.
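The frequency-domain view in this abstract rests on splitting a Fourier spectrum into amplitude and phase. The sketch below shows that split on a one-dimensional signal using only the Python standard library. It is a toy illustration: the helper names (`dft`, `idft`, `decompose`, `recombine`) and the sample row are assumptions, and APDNet itself is a learned network operating on images, not this code.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns complex samples."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def decompose(signal):
    """Split the spectrum of a signal into amplitude and phase lists."""
    spectrum = dft(signal)
    amplitude = [abs(c) for c in spectrum]
    phase = [cmath.phase(c) for c in spectrum]
    return amplitude, phase


def recombine(amplitude, phase):
    """Rebuild a real signal from per-frequency amplitude and phase."""
    spectrum = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
    return [c.real for c in idft(spectrum)]


# Hypothetical scanline standing in for one row of an image channel.
row = [0.2, 0.4, 0.9, 0.5, 0.3, 0.1, 0.6, 0.8]
amplitude, phase = decompose(row)

# Unchanged amplitude and phase reproduce the input (up to rounding).
restored = recombine(amplitude, phase)

# Doubling every amplitude while keeping the phase doubles the signal:
# a toy stand-in for restoring attenuated spectral energy.
boosted = recombine([2 * a for a in amplitude], phase)
```

Because the phase is left untouched, the structure of the signal stays in place while its energy changes, which is why amplitude is a natural target when degradation mostly attenuates spectral energy.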
Citations: 0