Applied Soft Computing: Latest Articles

Reliable reasoning: Learning and inference based on the ability of large language models
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114618
Changsen Yuan, Rui Lin, Cunhan Guo
Large language models (LLMs) demonstrate strong reasoning abilities in complex tasks, but they cannot access up-to-date information and are prone to generating plausible but incorrect content, known as hallucinations. Knowledge graphs (KGs), which are structured and dynamically updated collections of facts, offer a solution by providing LLMs with verified, current information. This enhances the models’ reasoning accuracy and reduces the risk of hallucinations. However, existing methods focus more on the inferability and structure of KGs, while overlooking the varying levels of difficulty that LLMs encounter when learning and understanding different reasoning paths between two nodes in the KGs. In this paper, we propose a novel method called Reliable Reasoning (R²) that selects inference paths from KGs that LLMs can learn and understand more easily. Specifically, we present a reliable reasoning path search framework in which R² first extracts appropriate candidate reasoning paths from KGs. The candidate reasoning paths are then filtered, and the ones preferred by the LLM are selected for further learning. Comprehensive experiments on two benchmark KGQA datasets indicate that R² attains good performance in KGQA tasks, producing accurate and interpretable reasoning outcomes.
Applied Soft Computing, Volume 190, Article 114618
Citations: 0
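A minimal Python sketch of the two-stage idea in the Reliable Reasoning abstract above: extract candidate paths between two KG nodes, then keep the preferred ones. The toy triples, the function names `candidate_paths` and `filter_paths`, and the `shorter_is_better` score are all illustrative assumptions. In particular, the score is only a stand-in for whatever LLM-based preference the paper uses, which the abstract does not specify.

```python
from collections import deque
from typing import Callable

# Toy knowledge graph as (head, relation, tail) triples; made up for illustration.
TRIPLES = [
    ("Alice", "born_in", "Paris"),
    ("Paris", "capital_of", "France"),
    ("Alice", "citizen_of", "France"),
    ("Alice", "works_for", "Acme"),
    ("Acme", "based_in", "France"),
]

Path = list[tuple[str, str, str]]


def candidate_paths(triples, source, target, max_hops=2) -> list[Path]:
    """Enumerate simple paths from source to target with at most max_hops edges."""
    adjacency = {}
    for head, rel, tail in triples:
        adjacency.setdefault(head, []).append((rel, tail))
    found = []
    queue = deque([(source, [])])
    while queue:
        node, path = queue.popleft()
        if node == target and path:
            found.append(path)
            continue
        if len(path) == max_hops:
            continue
        # Nodes already on this path must not be revisited (keeps paths simple).
        visited = {source} | {tail for _, _, tail in path}
        for rel, nxt in adjacency.get(node, []):
            if nxt not in visited:
                queue.append((nxt, path + [(node, rel, nxt)]))
    return found


def filter_paths(paths, score: Callable[[Path], float], keep=2) -> list[Path]:
    """Keep the `keep` highest-scoring candidate paths."""
    return sorted(paths, key=score, reverse=True)[:keep]


def shorter_is_better(path: Path) -> float:
    # Hypothetical stand-in for an LLM preference score (assumption, not from the paper).
    return -len(path)


paths = candidate_paths(TRIPLES, "Alice", "France")
selected = filter_paths(paths, shorter_is_better, keep=2)
for path in selected:
    print(" -> ".join(f"{h} [{r}] {t}" for h, r, t in path))
```

Swapping `shorter_is_better` for a model-derived score would change which paths survive the filter without touching the enumeration step.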
Stochastic subsampling with average pooling
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114611
Bum Jun Kim, Sang Woo Kim
Regularization is essential for improving the generalization of deep neural networks while mitigating overfitting. Although the popular method of Dropout provides a regularization effect, it causes inconsistent properties in the output, which may degrade the performance of deep neural networks. In this study, we propose a new module called stochastic average pooling, which incorporates Dropout-like stochasticity into pooling. We describe the properties of stochastic subsampling and average pooling and leverage them to design a module without any inconsistency problems. The stochastic average pooling achieves a regularization effect without any potential performance degradation due to the inconsistency issue and can be easily plugged into existing deep neural network architectures. Experiments demonstrate that replacing existing average pooling with stochastic average pooling yields consistent improvements across a variety of tasks, datasets, and models.
Applied Soft Computing, Volume 190, Article 114611
Citations: 0
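One way to picture Dropout-like stochasticity inside pooling, as described in the stochastic average pooling abstract above, is to average a random subset of each pooling window during training and the full window at inference. The sketch below is an assumption made for illustration, not the module defined in the paper; `stochastic_avg_pool_1d`, `window`, and `keep` are hypothetical names.

```python
import numpy as np


def stochastic_avg_pool_1d(x, window, keep, rng=None, training=True):
    """Average `keep` randomly chosen elements of each non-overlapping window.

    At inference (training=False) this reduces to ordinary average pooling.
    """
    x = np.asarray(x, dtype=float)
    assert x.size % window == 0 and 1 <= keep <= window
    windows = x.reshape(-1, window)
    if not training:
        return windows.mean(axis=1)
    rng = np.random.default_rng() if rng is None else rng
    # A random permutation per window; the first `keep` indices form the subsample.
    order = rng.random(windows.shape).argsort(axis=1)[:, :keep]
    kept = np.take_along_axis(windows, order, axis=1)
    return kept.mean(axis=1)


signal = np.arange(8.0)  # two windows: [0, 1, 2, 3] and [4, 5, 6, 7]
print(stochastic_avg_pool_1d(signal, window=4, keep=2, rng=np.random.default_rng(0)))
print(stochastic_avg_pool_1d(signal, window=4, keep=2, training=False))
```

Because each subset is drawn uniformly with a fixed size, the training-time output in this sketch is an unbiased estimate of the ordinary window mean, which is one simple way to keep training and inference outputs consistent in expectation.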
Dynamic ensemble point-interval wind speed prediction system for data drift: Adaptive real-time feature decoupling and multi-level information fusion quantile regression
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114615
Jujie Wang, Xiawei Wu, Minghong Chen
Wind speed forecasting is essential for integrating renewable energy systems and advancing carbon neutrality objectives. As a key technology for wind power grid integration, accurate wind speed prediction enhances the grid’s capacity to absorb renewable energy, thereby reducing reliance on fossil fuels and mitigating environmental impacts from greenhouse gas emissions. However, wind speed’s dynamic evolutionary characteristics and the inherent data drift problem pose significant challenges to precise forecasting. This research develops an integrated wind speed prediction model incorporating adaptive real-time feature decoupling and a multi-head attention ensemble algorithm to address data drift. The enhanced successive variational mode decomposition, combined with a sliding window, is first used for adaptive real-time feature decoupling of original wind speed data, which dynamically tracks drift-induced variations by decoupling data into adaptive subsequences. A multi-dimensional quantitative optimal model matching strategy is adopted to achieve precise matching between each subsequence and the model. Multi-head attention adjusts integration weights to mitigate the impact of drift. Quantile regression with multi-level information fusion is then used to assess the uncertainty in wind speed variations and derive the final forecasting outcomes. Experimental findings indicate that in one-step forecasting, the proposed model attained PICP values of 0.9645 and 0.9602, along with PINAW values of 0.2155 and 0.2296, at the two wind farms. While ensuring high coverage, it effectively controlled interval width, fully validating the system’s superior performance.
Applied Soft Computing, Volume 190, Article 114615
Citations: 0
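The interval metrics quoted in the wind speed abstract above, PICP (prediction interval coverage probability) and PINAW (prediction interval normalized average width), are commonly defined as in the sketch below. It follows the usual textbook definitions; the abstract does not state the exact normalization the authors use, and the sample numbers are made up.

```python
import numpy as np


def picp(y, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    return float(np.mean((y >= lower) & (y <= upper)))


def pinaw(y, lower, upper):
    """Mean interval width divided by the range of the observed targets."""
    y, lower, upper = map(np.asarray, (y, lower, upper))
    target_range = y.max() - y.min()
    return float(np.mean(upper - lower) / target_range)


# Made-up observations and interval bounds, for illustration only.
observed = np.array([5.0, 6.0, 7.0, 9.0])
lower_bound = np.array([4.0, 5.5, 7.5, 8.0])
upper_bound = np.array([6.0, 6.5, 8.5, 10.0])

print(picp(observed, lower_bound, upper_bound))   # 0.75: three of four points are covered
print(pinaw(observed, lower_bound, upper_bound))  # 0.375: mean width 1.5 over a range of 4
```

The two metrics pull in opposite directions: widening every interval raises PICP but also raises PINAW, which is why the abstract reports them together.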
One-class anomaly detection based on image reconstruction by Wavelet, Gaussian Fourier and variational Embedding for medical images
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114633
Lin-Chieh Huang, Hung-Hsu Tsai, Yu-Che Chuang
This paper proposes a new pixel-reconstruction-based method, hereafter called WGF-VAE, that combines high-frequency sub-bands of Wavelets, Gaussian Fourier features (GFF) and Variational Embedding (VE) for one-class anomaly detection on medical images. Traditional reconstruction-based methods reconstruct low-frequency information and therefore miss image details during reconstruction, especially for medical images. As a result, those methods often produce false positives, treating normal regions as anomalies. The WGF-VAE scheme overcomes these drawbacks by using high-frequency sub-bands of wavelets, GFF and VE in the design of the image reconstruction process. High-frequency sub-bands preserve high-frequency information, making it easier for the decoder of the WGF-VAE scheme to learn and handle these image details. Moreover, the decoder leverages GFF to cover a broader frequency spectrum by transforming the coordinates of an input image into a higher-dimensional space, which enhances the learning of high-frequency functions. The scheme can thus accurately capture and reconstruct high-frequency details of medical images by utilizing the localized frequency information from high-frequency sub-bands and the expanded frequency spectrum from GFF. Furthermore, a variational autoencoder (VAE) produces the VE, which is employed in the decoding phase as the latent feature of the high-frequency sub-bands. This stabilizes the decoder so that it yields normal images, allowing the difference between input and output images to be computed precisely and improving the recognition ability of the scheme. Hence, the WGF-VAE scheme shows remarkable ability in detecting and localizing anomalies because it takes a combination of three features as inputs to the decoder. Finally, extensive experimental results show that the WGF-VAE scheme surpasses state-of-the-art methods on anomaly detection for brain and liver images in two public benchmarks.
Applied Soft Computing, Volume 191, Article 114633
Citations: 0
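Gaussian Fourier features, which the WGF-VAE abstract above uses to lift image coordinates into a higher-dimensional space, are commonly computed by projecting the coordinates through a random Gaussian matrix and taking cosines and sines. The sketch below follows that common formulation; `num_frequencies`, `sigma`, and the grid size are illustrative assumptions, and the paper's actual settings are not given in the abstract.

```python
import numpy as np


def gaussian_fourier_features(coords, num_frequencies=16, sigma=10.0, seed=0):
    """Map (n, d) coordinates to (n, 2 * num_frequencies) cosine/sine features."""
    coords = np.asarray(coords, dtype=float)
    rng = np.random.default_rng(seed)
    # Random projection matrix with entries drawn from N(0, sigma^2).
    projection = rng.normal(0.0, sigma, size=(coords.shape[1], num_frequencies))
    angles = 2.0 * np.pi * coords @ projection
    return np.concatenate([np.cos(angles), np.sin(angles)], axis=1)


# Normalized pixel coordinates of a small 4 x 4 image grid.
height, width = 4, 4
rows, cols = np.meshgrid(
    np.linspace(0.0, 1.0, height), np.linspace(0.0, 1.0, width), indexing="ij"
)
grid = np.stack([rows.ravel(), cols.ravel()], axis=1)  # shape (16, 2)

features = gaussian_fourier_features(grid)
print(features.shape)  # (16, 32)
```

A larger `sigma` spreads the random frequencies more widely, which is the usual knob for how much high-frequency detail a downstream decoder can represent.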
HQML-NLP: A hybrid quantum machine learning framework for scholarly AI-text detection
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114634
Layth Rafea Hazim, Oguz Ata
The rapid growth of large language models has made it increasingly difficult to distinguish human-written from AI-generated text, especially in academia, which creates a need for trustworthy detection frameworks that balance accuracy, interpretability, and efficiency. This paper presents HQML-NLP, a detection system that uses a hybrid quantum-classical machine learning framework to detect AI-generated academic content. The system merges Sentence-BERT semantic embeddings with quantum feature encoding supported by a 6-qubit, two-layer parameterized quantum circuit, resulting in a 390-dimensional hybrid representation that is classified by a lightweight multilayer perceptron. The framework has been tested on three benchmark datasets (AI-GA, HWAI, and HAGT-1M) to check its scalability and generalization to different academic writing situations. The experimental results show that the framework has consistent and good discriminative capability, achieving AUROC scores above 0.96 on all datasets and excellent performance (AUROC = 1.000, ACC = 99.98 %) on the large-scale data. Moreover, probability calibration by temperature scaling raises the trustworthiness of predicted confidence scores, yielding a 60 % reduction in the Expected Calibration Error (ECE) without affecting discrimination performance. Compared with transformer-based and ensemble-learning detectors, HQML-NLP achieves competitive detection accuracy and calibration quality while requiring more than 2000× fewer trainable parameters. These findings imply that hybrid quantum-classical representations are an effective and compact alternative for detecting AI-generated text in scholarly journals.
Applied Soft Computing, Volume 191, Article 114634
Citations: 0
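The HQML-NLP abstract above reports that temperature scaling lowers the Expected Calibration Error without changing discrimination. The sketch below shows the generic technique on made-up logits: logits are divided by a scalar temperature chosen to minimize negative log-likelihood, and ECE is computed with equal-width confidence bins. The grid search, the bin count, and the data are assumptions for illustration, not the paper's setup.

```python
import numpy as np


def softmax(logits, temperature=1.0):
    """Row-wise softmax of logits divided by a scalar temperature."""
    z = np.asarray(logits, dtype=float) / temperature
    z = z - z.max(axis=1, keepdims=True)  # subtract the row max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def expected_calibration_error(probs, labels, n_bins=10):
    """Weighted average gap between accuracy and confidence over confidence bins."""
    confidences = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for low, high in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > low) & (confidences <= high)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap
    return float(ece)


def fit_temperature(logits, labels, grid=np.linspace(0.5, 5.0, 46)):
    """Pick the temperature on a grid that minimizes negative log-likelihood."""
    def nll(t):
        p = softmax(logits, t)
        return -np.mean(np.log(p[np.arange(len(labels)), labels] + 1e-12))
    return float(min(grid, key=nll))


# Made-up overconfident logits: every prediction is very confident, one is wrong.
logits = np.array([[4.0, 0.0], [4.0, 0.0], [0.0, 4.0], [4.0, 0.0]])
labels = np.array([0, 0, 1, 1])

temperature = fit_temperature(logits, labels)
ece_before = expected_calibration_error(softmax(logits), labels)
ece_after = expected_calibration_error(softmax(logits, temperature), labels)
print(temperature, ece_before, ece_after)
```

Dividing by a positive temperature never changes the arg max of a row, so accuracy and ranking-based metrics such as AUROC for the predicted class are untouched while the confidence values move.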
Open set recognition based on fuzzy rule classifier and probabilistic distribution analysis with application to spectral material classification
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114607
Peng Zhao, Zhen-Yu Li
Open set recognition aims to classify known classes and reject unknown classes simultaneously. It is usually approached with machine learning or deep learning. In machine learning based schemes, the classification boundary of known classes must be contracted and refined accurately, which is sometimes hard to achieve in practice. In deep learning based schemes, a large dataset is required to train deep neural networks. These networks can only process images of sufficiently large size, and they usually cannot process spectral curve datasets. In this article, a novel open set recognition scheme is proposed based on a revised fuzzy rule classifier, with application to spectral curve classification. Only a small spectral dataset is required to train this fuzzy rule classifier. After fuzzy rule training, the used fuzzy rules classify known classes, whereas the unused rules reject unknown classes. Therefore, contraction and refinement of the classification boundary for known classes are not required. When a sample is fed into the revised fuzzy classifier, we obtain some fuzzy rules and their corresponding nonzero scores. The resulting probabilistic distribution is then evaluated by the entropy or the Gini index. If this distribution is certain, the class corresponding to the maximal score is the final output class. Otherwise, the sample is rejected as an unknown class. Comparative experimental results on mango, melamine and wood datasets demonstrate that the proposed scheme achieves a mean improvement of approximately 42.30 % in overall recognition accuracy compared to baseline models.
Applied Soft Computing, Volume 190, Article 114607
Citations: 0
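The accept-or-reject step in the open set recognition abstract above can be sketched as follows: normalize the nonzero rule scores into a distribution, measure its uncertainty with Shannon entropy (the Gini index is computed alongside for comparison), and reject the sample when the distribution is too uncertain. The function name, the threshold `max_entropy`, and the scores are hypothetical; the abstract does not say how the authors decide that a distribution is "certain".

```python
import numpy as np


def classify_or_reject(scores, class_names, max_entropy=0.5):
    """Return (label, entropy, gini); label is "unknown" when the scores are too uncertain."""
    scores = np.asarray(scores, dtype=float)
    total = scores.sum()
    if total <= 0:
        return "unknown", float("nan"), float("nan")
    p = scores / total
    nonzero = p[p > 0]
    entropy = float(-(nonzero * np.log2(nonzero)).sum())  # Shannon entropy in bits
    gini = float(1.0 - (p ** 2).sum())                     # Gini impurity
    if entropy <= max_entropy:  # hypothetical certainty threshold (assumption)
        return class_names[int(p.argmax())], entropy, gini
    return "unknown", entropy, gini


# One score dominates: the distribution is certain, so the top class is returned.
print(classify_or_reject([0.95, 0.05], ["class_a", "class_b"]))
# Scores are spread out: the distribution is uncertain, so the sample is rejected.
print(classify_or_reject([0.40, 0.35, 0.25], ["class_a", "class_b", "class_c"]))
```

Either measure could drive the decision; entropy is used for the threshold here only to keep the sketch short.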
Local-to-global multi-modal domain adversarial learning framework for multi-source domain mechanical fault diagnosis
IF 6.6 · Zone 1, Computer Science · Q1 Computer Science, Artificial Intelligence · Pub Date: 2026-01-12 · DOI: 10.1016/j.asoc.2026.114626
Yuan Zhou, Xiaofeng Yue
Integrating rich information from diverse source domains significantly enhances cross-domain knowledge transfer capabilities in mechanical fault diagnosis, which is critical for addressing fault diagnosis demands under complex and varying operating conditions. However, existing methods typically perform domain alignment at the global feature level while neglecting the local domain shift accumulation effects in time-frequency features, resulting in inadequate suppression of multi-level domain discrepancies. To address these issues, a local-to-global multi-modal domain adversarial learning framework for multi-source domain mechanical fault diagnosis is proposed in this paper. First, a multi-scale time-frequency feature learning network is designed to achieve effective learning from multi-modal local features to unified global representations through parallel heterogeneous feature encoding and adaptive feature aggregation. To fundamentally eliminate multi-level domain bias, a hierarchical local-to-global domain adversarial learning strategy is further proposed, which constructs a multi-level progressive domain discrimination system to achieve cross-domain collaborative adversarial training. On this basis, a globally-guided local neighborhood consistency learning mechanism is constructed, which generates high-quality pseudo-labels through joint adaptive cross-domain semantic association modeling and multi-level entropy-weighted confidence evaluation, effectively achieving cross-domain knowledge transfer. Extensive experiments on three datasets demonstrate that the proposed method achieves an average diagnostic accuracy of 93.46 %, outperforming the best baseline by 5.34 % across all 10 cross-domain transfer scenarios.
Applied Soft Computing, Volume 190, Article 114626
Citations: 0
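The entropy-weighted confidence evaluation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names and the `min_weight` threshold are hypothetical, and the weighting shown (one minus normalized Shannon entropy) is only one common way to turn prediction entropy into a confidence score.

```python
import math

def entropy(probs):
    """Shannon entropy (natural log) of a probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def entropy_weight(probs):
    """Confidence weight in [0, 1]: 1 for a one-hot prediction, 0 for a uniform one."""
    k = len(probs)
    return 1.0 - entropy(probs) / math.log(k)

def pseudo_label(probs, min_weight=0.5):
    """Return (label, weight), or None when the prediction is too uncertain.

    min_weight is a hypothetical threshold, not a value taken from the paper.
    """
    w = entropy_weight(probs)
    if w < min_weight:
        return None
    label = max(range(len(probs)), key=lambda i: probs[i])
    return label, w
```

A target-domain sample whose predicted distribution is close to uniform is rejected, while a sharply peaked prediction is kept together with its weight, which could then scale that sample's contribution to the training loss.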
A novel multi-sensor fusion method for diagnosing insulation defects in gas-insulated substations guided by adaptive-attention and contrastive-based few-shot learning
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-12 DOI: 10.1016/j.asoc.2026.114600
Yanxin Wang, Jing Yan, Zhengrun Zhang, Jianhua Wang, Zhiyuan Liu, Yingsan Geng, Dipti Srinivasan
Deep learning-based multi-source fusion has shown significant potential in diagnosing insulation defects in gas-insulated switchgear (GIS). However, its applicability in real engineering scenarios remains limited. Existing fusion frameworks struggle to model the heterogeneous sensing characteristics of optical and electrical channels, often relying on rigid or shallow interaction schemes that fail to capture modality complementarity. In addition, field data are typically scarce and distribution-shifted, making it difficult for conventional models to learn discriminative and generalizable features under small-sample conditions. To address these challenges, we propose a novel multi-sensor fusion few-shot learning network (MSFFLN) for GIS insulation defect diagnosis. First, a deep fusion network is developed to construct comprehensive representations of insulation defects. Specifically, a feature weighting fusion module is employed to improve robustness, while an adaptive attention-based fusion block suppresses redundant and aliased information, emphasizing the most discriminative features. Second, a contrastive learning-based few-shot strategy is introduced. By computing global and local contrastive losses and using contrastive learning as an auxiliary task, the model learns more accurate and generalizable feature representations. In addition, salient region mixing across samples is applied to decouple class-level and instance-level feature correlations. Finally, field experiments validate the effectiveness of the MSFFLN. Results show that the MSFFLN achieves a diagnostic accuracy of 95.06% with only 10 support samples, significantly outperforming baseline and ablation models in small-sample GIS insulation defect diagnosis.
Citations: 0
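The abstract above mentions contrastive losses without giving their form. As a rough illustration only, the sketch below computes a generic InfoNCE-style loss for a single anchor; it is an assumption chosen for clarity, not the global or local loss defined in the paper, and the `temperature` default is a hypothetical value.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for one anchor: negative log-softmax of the positive's similarity."""
    logits = [cosine(anchor, positive) / temperature]
    logits += [cosine(anchor, n) / temperature for n in negatives]
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[0]
```

The loss is small when the anchor is closer to its positive than to every negative, and grows when a negative is more similar than the positive, which is the behavior an auxiliary contrastive task relies on.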
Minimizing predictive class confusion: A unified framework for cross-domain fault diagnosis under different label and domain configurations
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-12 DOI: 10.1016/j.asoc.2026.114640
Yuteng Zhang, Leijun Shi, Qinkai Han, Xueping Xu, Hui Liu, Fulei Chu, Yun Kong
Reliable fault diagnosis is essential for maintaining the safety and operational efficiency of advanced industrial equipment. Diagnostic methods based on transfer learning techniques such as unsupervised domain adaptation have demonstrated considerable potential for engineering applications. However, existing methods rely on predefined assumptions about inter-domain label relationships and domain configurations, which severely restricts their practical application. To address these issues, this study proposes a unified cross-domain fault diagnosis framework for transfer diagnostic tasks under different label and domain configurations, including closed-set, partial-set, open-set, multi-source domain, and multi-target domain transfer diagnostics. The unified framework leverages a predictive class confusion bias shared across multiple scenarios to guide cross-domain knowledge transfer, thus enabling effective domain adaptation to various transfer diagnostic scenarios. To measure the tendency of class confusion accurately, a prototype similarity-based fault discrimination method is developed, which enhances classification robustness and provides reliable prediction distributions for predictive class confusion estimation. Then, a label smoothing-based probability calibration mechanism is designed for probability regularization, mitigating erroneous class confusion estimation caused by prediction bias. Additionally, an open-set cross-domain diagnosis method with an adaptive threshold is provided to handle potential unseen faults; it has a straightforward design and is easy to implement within the unified cross-domain diagnosis framework. Extensive experiments on two transmission system datasets verify the general applicability of the proposed unified framework across five cross-domain diagnosis settings, and its performance is competitive with advanced scenario-specific transfer diagnosis methods, providing an effective tool for intelligent diagnosis in industrial scenarios.
Citations: 0
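Two ideas named in the abstract above, label smoothing for probability calibration and a measure of class confusion over predictions, can be sketched in a few lines. This is a hedged illustration under stated assumptions: the confusion measure here is the mean off-diagonal mass of a row-normalized class correlation matrix computed from a batch of predicted probabilities, which may differ from the paper's definition, and `eps` is a hypothetical smoothing strength.

```python
def smooth(probs, eps=0.1):
    """Label-smoothing style calibration: mix a distribution with the uniform one."""
    k = len(probs)
    return [(1.0 - eps) * p + eps / k for p in probs]

def class_confusion(batch):
    """Mean off-diagonal mass of the row-normalized class correlation matrix P^T P.

    batch is a list of probability vectors, one per sample.
    """
    k = len(batch[0])
    corr = [[sum(row[i] * row[j] for row in batch) for j in range(k)]
            for i in range(k)]
    total = 0.0
    for i in range(k):
        row_sum = sum(corr[i])
        if row_sum == 0:
            continue  # class i receives no probability mass in this batch
        total += sum(corr[i][j] for j in range(k) if j != i) / row_sum
    return total / k
```

Confident, mutually exclusive predictions give a confusion of zero, while predictions that spread mass over several classes raise it; minimizing such a quantity on target-domain data is one way a single objective could be shared across different label and domain configurations.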
Text-guided and Edge-guided fusion network for enhancing medical image segmentation
IF 6.6 Zone 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-11 DOI: 10.1016/j.asoc.2026.114629
Zhiyong Tan, Ruifen Cao, Pijing Wei, Chao Zhou, Yansen Su, Chunhou Zheng
Medical image segmentation is crucial for disease diagnosis and treatment, especially in oncology. However, current image-based unimodal methods are limited by data acquisition challenges, making it difficult to improve segmentation performance. Medical text annotations generated alongside images provide rich semantic information at low cost, and textual data offers a complementary way to strengthen the analytical capabilities of image-based unimodal methods. Even so, medical image segmentation still faces challenges, including complex background distributions, variable lesion shapes and sizes, and ambiguous boundaries. Accurate capture of edge information between the foreground (lesion or region of interest) and background (surrounding tissue) significantly influences segmentation outcomes. To address these challenges, we propose the Text-guided and Edge-guided Fusion Network (TEFNet), which integrates medical text knowledge and edge information to enhance segmentation performance. The text-guided strategy enriches the model's understanding of image content by leveraging semantic information from textual reports associated with medical images, enabling more accurate segmentation judgements. Edge-guided attention enhances the model's ability to identify anatomical structures and tissue boundaries by leveraging high-frequency edge information, enabling more reliable boundary delineation. Additionally, we introduce the Segment Anything Model (SAM), specifically tailored for the biomedical domain, to further enhance medical feature representation. Comprehensive evaluations across three established medical image segmentation benchmarks demonstrate that TEFNet, through a synergistic fusion of visual and textual features, achieves superior segmentation accuracy compared with current leading methods, validating the effectiveness of joint visual-text feature learning in medical image segmentation.
Citations: 0
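The abstract above refers to high-frequency edge information but does not say how it is extracted. As a minimal sketch, assuming a classical Sobel operator stands in for whatever edge extractor the authors use, the following computes a gradient-magnitude map that could serve as an edge cue for an attention module.

```python
def sobel_magnitude(img):
    """Gradient magnitude of a 2-D grayscale image given as a list of lists.

    Border pixels are left at 0 because the 3x3 kernel does not fit there.
    """
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal gradient: right column minus left column.
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y][x - 1] - img[y + 1][x - 1])
            # Vertical gradient: bottom row minus top row.
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]
                  - img[y - 1][x - 1] - 2 * img[y - 1][x] - img[y - 1][x + 1])
            out[y][x] = (gx * gx + gy * gy) ** 0.5
    return out
```

On an image with a vertical step between a dark and a bright region, the response is large along the step and zero in flat areas, which is the kind of boundary signal that edge-guided attention would emphasize.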