
Expert Systems with Applications: Latest Articles

Conflict-aware semi-supervised mutual learning for medical image segmentation
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131544
Wenlong Hang, Beijing Wang, Shuang Liang, Qingfeng Zhang, Qiang Wu, Yukun Jin, Qiong Wang, Jing Qin
Semi-supervised learning (SSL) has shown promising performance in medical image segmentation by effectively utilizing extensive unlabeled images. However, inaccurate predictions on unlabeled images can significantly impair the segmentation performance of SSL models. Furthermore, most current SSL methods lack mechanisms to handle cognitive bias, which causes the model to overfit easily to inaccurate predictions and makes self-correction challenging. In this work, we propose a conflict-aware semi-supervised mutual learning framework (CSSML), which integrates two different subnetworks and selectively utilizes conflicting pseudo-labels for mutual supervision to address these challenges. Specifically, we introduce two subnetworks with different architectures, incorporating a conflict-aware distinct feature learning (CDFL) regularization to avoid the homogenization of the subnetworks while promoting diversified predictions. To handle potentially inaccurate predictions, we introduce a geometry-aware mutual pseudo supervision (GMPS) regularization to determine the reliability of conflicting pseudo-labels of unlabeled images, and selectively leverage the more reliable pseudo-labels of the two subnetworks to supervise the other one. The synergy between the CDFL and GMPS regularizations during training helps each subnetwork selectively incorporate reliable knowledge from the other, thereby helping the model overcome cognitive bias. Extensive experiments on three public medical image datasets demonstrate that the proposed CSSML achieves an average of 80.65% DSC, 87.83% Precision, and 14.48 mm 95HD using only 20% labeled data, highlighting its superior performance. The code is available at: https://github.com/Mwnic-AI/CSSML.
Expert Systems with Applications, Volume 313, Article 131544.
Citations: 0
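The selective mutual supervision described in the CSSML abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how GMPS scores reliability, so the sketch substitutes the maximum class probability as a stand-in reliability score, and the names `argmax` and `conflict_targets` are hypothetical.

```python
from typing import List, Optional, Tuple

Probs = List[List[float]]  # one list of class probabilities per pixel


def argmax(values: List[float]) -> int:
    """Index of the largest value (the first one wins on ties)."""
    return max(range(len(values)), key=lambda k: values[k])


def conflict_targets(
    probs_a: Probs, probs_b: Probs
) -> Tuple[List[Optional[int]], List[Optional[int]]]:
    """Return (targets for subnetwork A, targets for subnetwork B).

    Only pixels where the two subnetworks disagree receive a target;
    every other entry is None.
    """
    targets_a: List[Optional[int]] = []
    targets_b: List[Optional[int]] = []
    for pa, pb in zip(probs_a, probs_b):
        label_a, label_b = argmax(pa), argmax(pb)
        if label_a == label_b:
            # No conflict: this sketch leaves the pixel unsupervised.
            targets_a.append(None)
            targets_b.append(None)
        elif max(pa) >= max(pb):
            # ASSUMPTION: max probability stands in for the paper's
            # geometry-aware reliability. A is trusted, so A teaches B.
            targets_a.append(None)
            targets_b.append(label_a)
        else:
            # B is trusted, so B teaches A.
            targets_a.append(label_b)
            targets_b.append(None)
    return targets_a, targets_b


# Three pixels, two classes: agreement, B more confident, A more confident.
probs_a = [[0.9, 0.1], [0.6, 0.4], [0.3, 0.7]]
probs_b = [[0.8, 0.2], [0.2, 0.8], [0.55, 0.45]]
targets_a, targets_b = conflict_targets(probs_a, probs_b)
print(targets_a, targets_b)
```

In this toy input the first pixel is skipped because the subnetworks agree, the second passes B's label to A, and the third passes A's label to B.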
Towards robust brain tumor segmentation under modality incompleteness: A contribution-optimized edge-enhanced network
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131396
Yanfeng He, Fangning Hu, Guoxiang Tong
Multimodal medical image segmentation plays a crucial role in disease diagnosis, as different MRI modalities provide complementary structural and lesion information. However, in clinical practice, the absence of certain modalities often leads to a significant decline in segmentation performance, limiting the application of multimodal methods. To address this issue, we propose a multimodal segmentation model called MECS-Net, which combines modality contribution optimization, edge enhancement, and efficient feature fusion. Building on four MRI modalities (Flair, T1ce, T1, T2), we further introduce edge features as auxiliary modalities to enhance the perception of critical structural boundaries. The model incorporates a modality contribution measurement mechanism that quantifies the actual predictive value of each modality at the sample level, and it performs resampling training on low-contribution modalities to mitigate the performance degradation caused by missing modalities. The feature fusion module combines multi-head cross-attention and state space modeling (Mamba): the former enhances fine-grained interactions between modalities and the latter models cross-modal global dependencies, which together improve semantic alignment and fusion. Extensive experiments on the BraTS 2020 dataset demonstrate that MECS-Net achieves outstanding performance under both complete and incomplete modality conditions. The Dice coefficients for WT (whole tumor area) and TC (tumor core area) reach 91.8% and 86.4%, respectively, under complete modality conditions, and average 86.7% and 79.1%, respectively, under incomplete modality conditions.
Expert Systems with Applications, Volume 313, Article 131396.
Citations: 0
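The MECS-Net results above are reported as Dice coefficients. For readers unfamiliar with the metric, the standard definition is Dice = 2|P ∩ T| / (|P| + |T|) for a predicted mask P and a ground-truth mask T. The sketch below computes it on flattened binary masks; the function name `dice_coefficient` and the choice to return 1.0 when both masks are empty are assumptions made here, not details taken from the paper.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], truth: Sequence[int]) -> float:
    """Dice overlap between two flattened binary (0/1) masks."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    # Voxels labelled foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    total = sum(pred) + sum(truth)
    if total == 0:
        # ASSUMPTION: two empty masks count as a perfect match.
        return 1.0
    return 2.0 * intersection / total


# One overlapping voxel, two foreground voxels in each mask.
print(dice_coefficient([1, 1, 0, 0], [1, 0, 1, 0]))
```

Because the denominator sums both mask sizes, the score penalizes over-segmentation and under-segmentation symmetrically.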
Federated learning with dynamics-aware loss for label noise
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131523
Chengtian Ouyang, Jihong Mao, Zhiquan Liu, Donglin Zhu, Changjun Zhou, Gangqiang Hu, Taiyong Li
In the Internet of Things domain, federated learning is gradually becoming a key technology for the safe and efficient deployment of artificial intelligence. Through distributed collaboration mechanisms, it enables edge intelligence while protecting data privacy and reducing communication costs. In real federated learning systems, clients usually exhibit varying levels of label noise, and local training tends to overfit that noise, which reduces the generalization performance of the model. Although data heterogeneity has been studied extensively, the resulting methods are not effective at dealing with label noise. Tackling the label noise problem is therefore one of the keys to advancing federated learning. In this work, an adaptive framework, FedDAL, is proposed to combat label noise in federated learning. In the pre-training stage, the server identifies noisy clients by their unreliability scores, and a module named distance-sensitive truncation is designed to improve identification accuracy. In the federated learning stage, noisy clients train local models with a dynamics-aware loss to mitigate the adverse effects of label noise. Finally, the server carries out loss normalization and weight-adjusted aggregation, taking into account the data volume and the aggregate class mean loss. Experimental results on multiple datasets demonstrate that FedDAL effectively addresses label noise overfitting, improves model generalization performance, and outperforms state-of-the-art methods across multiple distributions of label noise. Our code is available at https://github.com/Donglin0730/FedDAL.
Expert Systems with Applications, Volume 312, Article 131523.
Citations: 0
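The final aggregation step in the FedDAL abstract above weighs clients by data volume and by loss. The abstract does not give the formula, so the sketch below is purely illustrative: it assumes a weight proportional to n_k * exp(-l_k / mean loss), where n_k is the client's sample count and l_k its mean loss. The functions `aggregation_weights` and `aggregate` and this weighting rule are hypothetical; with equal losses the rule reduces to ordinary sample-count weighting as in FedAvg.

```python
import math
from typing import List


def aggregation_weights(num_samples: List[int], mean_losses: List[float]) -> List[float]:
    """Client weights that grow with data volume and shrink with loss.

    ASSUMPTION: weight_k is proportional to n_k * exp(-l_k / mean loss).
    This rule is illustrative only and is not taken from the paper.
    """
    avg_loss = sum(mean_losses) / len(mean_losses)
    # Loss normalisation: express each loss relative to the average loss.
    normalised = [(l / avg_loss) if avg_loss > 0 else 0.0 for l in mean_losses]
    scores = [n * math.exp(-z) for n, z in zip(num_samples, normalised)]
    total = sum(scores)
    return [s / total for s in scores]


def aggregate(client_params: List[List[float]], weights: List[float]) -> List[float]:
    """Weighted average of flat parameter vectors, one vector per client."""
    dim = len(client_params[0])
    return [
        sum(w * params[i] for w, params in zip(weights, client_params))
        for i in range(dim)
    ]


# Equal losses: the weights reduce to sample-count proportions.
weights = aggregation_weights([100, 300], [1.0, 1.0])
global_params = aggregate([[1.0, 2.0], [3.0, 4.0]], weights)
print(weights, global_params)
```

Any monotonically decreasing function of the loss would serve the same illustrative purpose; the exponential is chosen only because it keeps every weight strictly positive.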
TOM: An open-source tongue segmentation method with multi-teacher distillation and task-specific data augmentation
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131499
Jiacheng Xie, Ziyang Zhang, Biplab Poudel, Congyu Guo, Yang Yu, Guanghui An, Xiaoting Tang, Lening Zhao, Chunhui Xu, Dong Xu
Tongue imaging serves as a valuable diagnostic modality, particularly in Traditional Chinese Medicine (TCM). The quality of tongue surface segmentation significantly affects the accuracy of tongue image classification and subsequent diagnosis in intelligent tongue diagnosis systems. However, existing research on tongue image segmentation exhibits significant limitations, including sensitivity to lighting and background noise, similarity in color with surrounding tissues, and a lack of robust and user-friendly segmentation tools. This paper proposes a tongue image segmentation method (TOM) based on multi-teacher knowledge distillation. By introducing a novel diffusion-based data augmentation method, we improve the generalization ability of the segmentation model while reducing its parameter size. After the parameter count is reduced by 96.6% compared to the largest teacher models, the student model still achieves a segmentation performance of 95.22% mIoU. Furthermore, we packaged and deployed the trained model as an online and offline segmentation tool (available at https://itongue.cn/), allowing TCM practitioners and researchers to use it without any programming experience. We also present a case study on TCM constitution classification using segmented tongue patches. Experimental results demonstrate that training with tongue patches yields higher classification performance and better interpretability than training with original tongue images. To the best of our knowledge, this is the first open-source and freely available tongue image segmentation tool.
Expert Systems with Applications, Volume 313, Article 131499.
Citations: 0
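Multi-teacher knowledge distillation, the training scheme named in the TOM abstract above, is commonly implemented by matching the student's softened predictions to a combination of the teachers' softened predictions. The sketch below shows that generic recipe for a single pixel: temperature-scaled softmax, an equal-weight average over teachers, and a KL divergence scaled by the squared temperature. The equal weighting, the temperature value, and the function names are assumptions made here; the abstract does not describe how TOM combines its teachers.

```python
import math
from typing import List


def softmax(logits: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(
    student_logits: List[float],
    teacher_logits: List[List[float]],
    temperature: float = 2.0,
) -> float:
    """KL(average teacher distribution || student distribution) for one pixel.

    ASSUMPTION: teachers are averaged with equal weights.
    """
    teacher_probs = [softmax(t, temperature) for t in teacher_logits]
    count = len(teacher_probs)
    target = [
        sum(p[c] for p in teacher_probs) / count
        for c in range(len(student_logits))
    ]
    student = softmax(student_logits, temperature)
    kl = sum(q * math.log(q / p) for q, p in zip(target, student) if q > 0)
    # The temperature-squared factor keeps gradient magnitudes comparable
    # across temperatures, as in standard knowledge distillation.
    return kl * temperature ** 2


# Two hypothetical teachers and one student, two classes (tongue, background).
loss = multi_teacher_kd_loss([1.0, 0.0], [[2.0, 0.0], [3.0, -1.0]])
print(loss)
```

The loss is zero when the student reproduces the averaged teacher distribution and grows as the two distributions diverge.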
Investigating spatial-temporal bias of LLMs
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131542
Zijun Li
Large Language Models (LLMs) are emerging as powerful knowledge and expert systems with notable capabilities in understanding and reasoning across various intelligent tasks. However, their spatiotemporal cognition biases remain largely underexplored, even though these biases are highly consequential for leveraging LLMs effectively in applications that understand, explain, and forecast such tasks. In light of this, this paper investigates the presence and patterns of spatiotemporal bias in LLMs. Specifically, it first constructs two datasets from the perspectives of economic and social forecasting, each paired with the corresponding model-predicted values for the same spatiotemporal scope across four different LLMs. Then, a novel autocorrelation measurement approach is introduced, alongside a set of quantification methods, to jointly evaluate the correlation of biases across both space and time. The results show notable variation in performance and bias across models and tasks: uncommon and more sensitive tasks exhibit worse performance, and certain LLMs produce regionally clustered errors while others exhibit near-random error distributions. Among the prompt modifications examined, incorporating temporal context improves predictive accuracy significantly, particularly for volatile or low-frequency events. Overall, these findings highlight the partial but inconsistent internalization of real-world spatiotemporal patterns in LLMs, and the proposed methods provide tools for quantifying and interpreting spatiotemporal bias, thereby offering guidance for designing fairer and more reliable LLM-based expert systems and applications.
Expert Systems with Applications, Volume 313, Article 131542.
Citations: 0
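The "regionally clustered errors" mentioned in the abstract above are the kind of pattern that spatial autocorrelation statistics detect. The paper introduces its own measurement approach, which the abstract does not specify; as a familiar point of reference, the sketch below computes the classic Moran's I, I = (N / W) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2, on per-region prediction errors. The function name `morans_i`, the four-region chain, and the error values are hypothetical.

```python
from typing import List


def morans_i(values: List[float], weights: List[List[float]]) -> float:
    """Classic Moran's I for one value per region and a spatial weight matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    # Cross-products of deviations for every pair of neighbouring regions.
    num = sum(
        weights[i][j] * dev[i] * dev[j] for i in range(n) for j in range(n)
    )
    w_sum = sum(sum(row) for row in weights)
    den = sum(d * d for d in dev)
    if w_sum == 0 or den == 0:
        raise ValueError("need at least one neighbour pair and non-constant values")
    return (n * num) / (w_sum * den)


# Four hypothetical regions in a row; each region neighbours the next one.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
clustered_errors = [1.0, 1.0, -1.0, -1.0]    # similar errors sit together
alternating_errors = [1.0, -1.0, 1.0, -1.0]  # neighbours always disagree
print(morans_i(clustered_errors, chain), morans_i(alternating_errors, chain))
```

Positive values indicate that neighbouring regions share similar errors, while negative values indicate that neighbours tend to differ.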
On ground track design of unmanned fixed-wing drone aided relaying in windy environments
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131539
Xuan Zhu, Xiaodong Ji, Ansheng Yin
This article studies unmanned fixed-wing drone (UFD) aided relaying in windy environments, where the UFD serves as a full-duplex amplify-and-forward relay that forwards a desired amount of data for two ground terminals. In light of aerodynamics and the wind triangle, the engine power the UFD requires to fly at a constant airspeed along a circular ground track in a three-dimensional uniform wind is analyzed, and a corresponding closed-form expression is given. It is shown that the UFD's engine power depends on its airspeed and bank angle in addition to the wind speed and the corresponding vertical angle. On this basis, an optimization problem corresponding to the UFD's ground track design is investigated. Using the block coordinate descent technique, the initial problem is decomposed into two sub-problems, which are addressed by four algorithms (Algorithms 1–4). This leads to an iterative algorithm (Algorithm 5) that optimizes the UFD's airspeed and adjusts its flight parameters (e.g., time, radius, and the angles of pitch, course, crab, heading, and bank) to follow the desired ground track. Computer simulation results verify that the proposed algorithm achieves the best energy-saving performance and generates a small bank angle with minimal variation during flight. This characteristic alleviates the demand for fast bank angle command following when adjusting the UFD's flight parameters in windy environments.
Expert Systems with Applications, Volume 313, Article 131539.
Citations: 0
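The wind triangle that the abstract above relies on states that ground velocity equals air velocity plus wind velocity. The paper works with a three-dimensional uniform wind; the sketch below is a simplified horizontal-plane version only, and the function name `wind_triangle`, its parameters, and the sign convention (crab angle = heading minus course, wind direction given as the direction the wind blows toward) are assumptions made for illustration. Cancelling the cross-track wind component gives sin(crab) = -W sin(wind direction - course) / V_a.

```python
import math
from typing import Tuple


def wind_triangle(
    airspeed: float, wind_speed: float, wind_to_deg: float, course_deg: float
) -> Tuple[float, float, float]:
    """Return (heading_deg, crab_deg, groundspeed) for a desired ground course.

    All angles use one shared convention (for example compass degrees), and
    wind_to_deg is the direction the wind blows TOWARD.
    """
    rel = math.radians(wind_to_deg - course_deg)
    cross = wind_speed * math.sin(rel)  # wind component across the course
    along = wind_speed * math.cos(rel)  # wind component along the course
    if abs(cross) > airspeed:
        raise ValueError("crosswind exceeds airspeed; course cannot be held")
    # Point the nose into the wind so the cross-track components cancel.
    crab = -math.asin(cross / airspeed)
    groundspeed = airspeed * math.cos(crab) + along
    if groundspeed <= 0:
        raise ValueError("headwind too strong to make progress along the course")
    heading = (course_deg + math.degrees(crab)) % 360.0
    return heading, math.degrees(crab), groundspeed


# 20 m/s airspeed, 10 m/s wind blowing toward 90 degrees, desired course 0 degrees.
print(wind_triangle(20.0, 10.0, 90.0, 0.0))
```

On a circular ground track the course changes continuously, so the crab angle and groundspeed vary around the circle, which is consistent with the abstract's list of flight parameters that must be adjusted in wind.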
The importance of morphology-aware subword tokenization for NLP tasks in Slovak language modeling
IF 7.5; Zone 1 (Computer Science); Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE; Pub Date: 2026-02-05; DOI: 10.1016/j.eswa.2026.131492
Dávid Držík, Jozef Kapusta
To effectively train large language models (LLMs) for morphologically rich and low-resource languages such as Slovak, high-quality tokenization is essential. Traditional approaches like Byte-Pair Encoding (BPE) overlook linguistic structure, often fragmenting root morphemes and causing semantic loss. This study examines whether morphology-aware tokenization can improve model performance across various NLP tasks. We introduce the SlovaK Morphological Tokenizer (SKMT), which incorporates root morpheme information into the tokenization process, and compare it against a standard BPE tokenizer. Both tokenizers were used to preprocess a Slovak corpus for pretraining two RoBERTa-based models (SK_Morph_BLM and SK_BPE_BLM), which were then fine-tuned on token classification, sequence classification, question answering, and semantic textual similarity tasks. Experimental results show that SK_Morph_BLM achieved slightly higher performance overall, with statistically significant gains in semantic similarity (up to +12.49%) and question answering (up to +3.23%). Complementary quantitative and qualitative analyses further revealed that morphology-aware tokenization is most effective for shorter, morphologically regular texts and improves grammatical and semantic consistency. These findings demonstrate that incorporating morphological information into tokenization can enhance model robustness and semantic understanding for morphologically rich languages.
Expert Systems with Applications, Volume 312, Article 131492.
Citations: 0
A design method for electric vehicle front face styling: based on engineering feasibility optimization of GenAI-generated images
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131522
Huining Pei, Mingzhe Yang, Zhonghang Bai, Man Ding, Wen Li, Yuxin Cao, Yanjun Zhang
To address the low engineering feasibility of electric vehicle (EV) front face styling images generated by generative artificial intelligence (GenAI) tools such as Midjourney, this study proposes an innovative design method that integrates curve optimization with a collaborative evaluation system combining simulated and human experts. The method aims to enhance the manufacturability of AI-generated design schemes while efficiently transferring the styling genes of conventional fuel vehicles to EV front face styling design. First, the large language model ChatGPT-5.0 is employed to construct a styling semantic database based on six categories of conventional fuel vehicle front face datasets. Second, Midjourney is used to generate an initial EV front face styling dataset, and a production-ready styling dataset is subsequently constructed to provide engineering feasibility references for EV front face styling design. Third, “AI-generated curves” and “engineering reference curves” are fused at different ratios, and an EV front face styling scheme is generated using a curve blending algorithm optimized for the figure–ground relationship. Finally, an LLM-based collaborative evaluation system integrating simulated experts (via ChatGPT-5.0) and human experts is established to conduct quantitative evaluation and optimization of the schemes in terms of engineering feasibility and styling design metrics. A case study demonstrates that the optimized scheme’s engineering feasibility score improves significantly from 2.3 to 7.1 (out of 10), while maintaining a high level of design creativity (7.5). The established LLM-based collaborative evaluation system achieved high inter-rater consistency in both engineering feasibility evaluation (ICC ≥ 0.9) and design creativity evaluation (ICC ≥ 0.85) for EV front face styling schemes, effectively balancing engineering feasibility and design creativity in GenAI-generated EV front face styling schemes. By constructing an AI-led, human-supervised hybrid design workflow, this method significantly enhances the engineering feasibility and design efficiency of generative AI in product styling design, providing a theoretical reference for achieving a balance between design innovation and engineering feasibility.
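The idea of fusing an AI-generated curve with an engineering reference curve at a chosen ratio can be sketched as a simple linear blend. This is only an illustration under stated assumptions: the abstract does not give the paper's figure-ground-optimized blending algorithm, and `resample_polyline` and `blend_curves` are hypothetical helpers that treat each curve as a 2D polyline.

```python
import numpy as np


def resample_polyline(points, n):
    """Resample a 2D polyline to `n` points spaced evenly along its arc length."""
    pts = np.asarray(points, dtype=float)
    seg = np.linalg.norm(np.diff(pts, axis=0), axis=1)   # segment lengths
    s = np.concatenate([[0.0], np.cumsum(seg)])          # cumulative arc length
    target = np.linspace(0.0, s[-1], n)
    x = np.interp(target, s, pts[:, 0])
    y = np.interp(target, s, pts[:, 1])
    return np.column_stack([x, y])


def blend_curves(ai_curve, ref_curve, ratio):
    """Pointwise linear blend: ratio=0 returns the AI curve, ratio=1 the reference curve."""
    ai = np.asarray(ai_curve, dtype=float)
    ref = np.asarray(ref_curve, dtype=float)
    if ai.shape != ref.shape:
        raise ValueError("curves must be resampled to the same number of points")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    return (1.0 - ratio) * ai + ratio * ref


# Usage: blend a sketched curve with a reference curve at several ratios.
ai = resample_polyline([(0, 0), (1, 2), (2, 0)], 50)
ref = resample_polyline([(0, 0), (1, 1), (2, 0)], 50)
candidates = {r: blend_curves(ai, ref, r) for r in (0.25, 0.5, 0.75)}
```

Sweeping the ratio yields a family of candidate curves that an evaluation step, such as the expert scoring described in the abstract, could then rank.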
Expert Systems with Applications, Volume 312, Article 131522.
Citations: 0
Surveillance camera authentication system based on PRNU
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131548
Jian Li, Lisheng Yan, Bin Ma, Xiaolong Li, Zhenxing Qian
In this paper, we propose a camera authentication scheme to enhance access security for front-end devices in surveillance networks. The scheme leverages the Photo-Response Non-Uniformity (PRNU) pattern noise of camera sensors and combines it with traditional encryption techniques to strengthen system security. During registration, the camera captures images to extract the PRNU and generates a compressed device fingerprint, which is stored on the server as a root key. For authentication, the server sends a challenge sequence randomly generated from the root key to the front-end device, which captures a new image to generate a root key approximation for its response. To prevent attackers from extracting device fingerprints from public images, the scheme incorporates anonymization through a proposed DWT-based PRNU anonymization algorithm. Compared with previous methods, this improves PSNR by 8.08 dB and SSIM by 0.08 on average. Security analysis and experimental results show high authentication accuracy and security, with effective resistance to replay and man-in-the-middle attacks, providing a robust solution for surveillance network devices.
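The PRNU idea behind the registration and matching steps can be sketched as follows. This is a rough illustration, not the paper's pipeline: it uses a 3x3 box blur as a stand-in denoiser (practical PRNU work typically uses a wavelet-based filter), omits the compression, challenge-response, and anonymization stages, and the names `noise_residual`, `estimate_fingerprint`, and `match_score` are assumptions.

```python
import numpy as np


def box_blur(img, k=3):
    """Simple k x k mean filter with reflect padding (stand-in for a real denoiser)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="reflect")
    out = np.zeros(img.shape, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)


def noise_residual(img):
    """Residual W = I - denoise(I), which retains the sensor pattern noise."""
    img = np.asarray(img, dtype=float)
    return img - box_blur(img)


def estimate_fingerprint(images):
    """Intensity-weighted average of residuals: K ~ sum(W_i * I_i) / sum(I_i^2)."""
    imgs = [np.asarray(i, dtype=float) for i in images]
    num = sum(noise_residual(i) * i for i in imgs)
    den = sum(i * i for i in imgs)
    return num / (den + 1e-8)


def match_score(img, fingerprint):
    """Normalized correlation between the image residual and the expected PRNU term K * I."""
    img = np.asarray(img, dtype=float)
    a = noise_residual(img)
    b = fingerprint * img
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```

In such a setup, a server would store the fingerprint estimated at registration and accept a device only when `match_score` on a fresh capture exceeds a threshold chosen on validation data.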
Expert Systems with Applications, Volume 313, Article 131548.
Citations: 0
LDM-DTI: A multimodal framework integrating pretrained language models and geometric graph networks for interpretable drug-target interaction prediction
IF 7.5 Zone 1 Computer Science Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-02-05 DOI: 10.1016/j.eswa.2026.131485
Yuanyuan Ji, Zhuo Chen, Zhihan Liu, Xiaofeng Man, Junwei Du, Bin Yu
Accurate prediction of drug-target interactions (DTIs) is essential for speeding up the discovery of new therapeutics. Although significant progress has been made with deep learning-based approaches, considerable challenges remain in learning informative molecular representations and modeling the intricate nature of drug-target associations. To overcome these limitations, an end-to-end predictive architecture, termed LDM-DTI, is proposed. In this framework, drug and protein sequences are encoded via pretrained large language models. Specifically, ChemBERTa is utilized to derive high-dimensional semantic and structural features from SMILES strings, while ProtBERT is employed to extract contextual representations from amino acid sequences. To further incorporate spatial molecular information, a three-layer Graph Convolutional Network (GCN) and an Equivariant Graph Neural Network (EGNN) are integrated to capture both 2D topological and 3D geometric characteristics of drug molecules. Protein-level features are refined through dynamic convolutional operations and multi-head self-attention mechanisms. These representations are then fused via a Dynamic Interactive Attention Module (DIAM) to model cross-modal dependencies between drugs and targets. The proposed framework demonstrates superior predictive performance and generalizability across four public benchmark datasets, consistently surpassing ten state-of-the-art baselines. Ablation experiments are conducted to quantify the contributions of individual components, and protein-level attention maps are visualized to enhance interpretability. Overall, LDM-DTI offers a robust and interpretable solution for DTI prediction, with strong potential for accelerating structure-informed drug discovery.
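As a rough illustration of the 2D topological branch, the sketch below runs a three-layer graph convolution over a molecular graph using the widely used symmetric-normalized propagation rule, H' = ReLU(D^{-1/2}(A + I)D^{-1/2} H W). The abstract does not state which GCN variant, layer widths, or pooling LDM-DTI uses, so these choices and the helper names are assumptions; the EGNN, language-model encoders, and DIAM fusion are not covered.

```python
import numpy as np


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2}: adjacency with self-loops, symmetrically normalized."""
    a_hat = np.asarray(adj, dtype=float) + np.eye(len(adj))
    deg = a_hat.sum(axis=1)              # every degree is >= 1 thanks to self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


def gcn_forward(adj, features, weights):
    """Apply one ReLU graph-convolution layer per weight matrix in `weights`."""
    a_norm = normalize_adjacency(adj)
    h = np.asarray(features, dtype=float)
    for w in weights:
        h = np.maximum(a_norm @ h @ w, 0.0)
    return h


def mean_pool(node_embeddings):
    """Average node embeddings into a single molecule-level vector."""
    return node_embeddings.mean(axis=0)


# Usage: a 3-atom chain with 4-dimensional one-hot-style atom features.
rng = np.random.default_rng(0)
adj = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
atoms = np.eye(3, 4)
weights = [rng.standard_normal(s) * 0.1 for s in [(4, 8), (8, 8), (8, 16)]]
molecule_vec = mean_pool(gcn_forward(adj, atoms, weights))
```

Three stacked layers let each atom's embedding aggregate information from neighbors up to three bonds away, which is one common motivation for that depth.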
Expert Systems with Applications, Volume 313, Article 131485.
Citations: 0