
Latest Articles from Pattern Recognition

Universal image restoration via task-adaptive diffusion degradation oriented model
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-29 DOI: 10.1016/j.patcog.2026.113193
Junxi Wu , Sicheng Pan , Naiqi Li , Bin Chen , Baoyi An , Zhi Wang , Yaowei Wang , Shu-Tao Xia
Image restoration covers many sub-tasks, including image super-resolution, inpainting, deblurring, compressed sensing, etc. However, existing methods often struggle to balance generality across tasks with specificity to degradation patterns. Multi-task methods rely on generative priors and neglect the diversity of degradation operators, leading to inferior performance and hallucinations, while task-specific methods cannot capture the generality shared across image restoration tasks. In this work, we introduce the Task-Adaptive Diffusion Degradation Oriented Model (DDOM), which bridges this gap by integrating a pre-trained diffusion model as a general generative prior with lightweight Degradation Oriented Adapters (DO-Adapters) that align task-specific knowledge. DO-Adapters extract task-specific priors and refine the diffusion process at each timestep, guiding the diffusion model toward accurate restoration while reducing hallucinations. This design decouples task adaptation from the pre-trained model, enabling plug-and-play deployment across tasks with low computational cost (0.36% of the pre-trained model). Experimental results demonstrate that DDOM outperforms leading multi-task methods while matching or surpassing task-specific methods in visual quality. Notably, DDOM exhibits strong generalization on out-of-distribution datasets and in extreme degradation scenarios, validating its effectiveness in unifying generality across restoration tasks.
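The abstract gives no implementation details, but the adapter idea lends itself to a short illustration. Below is a minimal sketch, not the authors' code, of how a small plug-in module might refine a frozen diffusion denoiser's prediction at each timestep while conditioning on the degraded observation; the module names, shapes, and the residual-refinement scheme are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class DOAdapter(nn.Module):
    """Hypothetical degradation-oriented adapter: a small residual block that
    refines the frozen denoiser's prediction using the degraded observation."""
    def __init__(self, channels=3, hidden=32):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(2 * channels, hidden, 3, padding=1), nn.SiLU(),
            nn.Conv2d(hidden, channels, 3, padding=1),
        )

    def forward(self, x_pred, y_degraded):
        # Predict a small correction conditioned on the degraded input.
        return x_pred + self.net(torch.cat([x_pred, y_degraded], dim=1))

class FrozenDenoiser(nn.Module):
    """Stand-in for a pre-trained diffusion denoiser (kept frozen)."""
    def __init__(self, channels=3):
        super().__init__()
        self.body = nn.Conv2d(channels, channels, 3, padding=1)

    def forward(self, x_t, t):
        return self.body(x_t)

denoiser = FrozenDenoiser().eval()
for p in denoiser.parameters():
    p.requires_grad_(False)          # the generative prior stays fixed
adapter = DOAdapter()                # only this tiny module is trained per task

x_t = torch.randn(1, 3, 64, 64)      # noisy sample at timestep t
y = torch.randn(1, 3, 64, 64)        # degraded observation (task-specific)
x_refined = adapter(denoiser(x_t, t=10), y)
print(x_refined.shape)               # torch.Size([1, 3, 64, 64])
```

Because only the adapter's parameters would be trained, swapping adapters per task approximates the plug-and-play deployment the abstract describes.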
Citations: 0
MonoTDF: Temporal deep feature learning for generalizable monocular 3D object detection
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113184
Xiu-Zhi Chen , Yi-Kai Chiu , Chih-Sheng Huang , Yen-Lin Chen
Monocular 3D object detection has gained significant attention due to its cost-effectiveness and practicality in real-world applications. However, existing monocular methods often struggle with depth estimation and spatial consistency, limiting their accuracy in complex environments. In this work, we introduce a Temporal Deep Feature Learning framework, which enhances monocular 3D object detection by integrating temporal features across sequential frames. Our approach leverages a novel deep feature auxiliary module based on convolutional recurrent structures, effectively capturing spatiotemporal information to improve depth perception and detection robustness. The proposed module is model-agnostic and can be seamlessly integrated into various existing monocular detection frameworks. Extensive experiments across multiple state-of-the-art monocular 3D object detection models demonstrate consistent performance improvements, particularly in detecting small or partially occluded objects. Our results highlight the effectiveness and generalizability of the proposed approach, making it a promising solution for real-world autonomous perception systems. The source code of this work is at: https://github.com/Shuray36/MonoTDF-Temporal-Deep-Feature-Learning-for-Generalizable-Monocular-3D-Object-Detection.
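As a rough illustration of the convolutional-recurrent idea described above, the sketch below aggregates per-frame feature maps with a minimal ConvGRU cell. It is an assumption-laden toy, not the released MonoTDF module (see the linked repository for the actual code), and all dimensions are made up.

```python
import torch
import torch.nn as nn

class ConvGRUCell(nn.Module):
    """Minimal convolutional GRU cell for aggregating per-frame feature maps."""
    def __init__(self, channels, hidden, k=3):
        super().__init__()
        p = k // 2
        self.zr = nn.Conv2d(channels + hidden, 2 * hidden, k, padding=p)
        self.h_cand = nn.Conv2d(channels + hidden, hidden, k, padding=p)
        self.hidden = hidden

    def forward(self, x, h):
        zr = torch.sigmoid(self.zr(torch.cat([x, h], dim=1)))
        z, r = zr.chunk(2, dim=1)                     # update and reset gates
        h_tilde = torch.tanh(self.h_cand(torch.cat([x, r * h], dim=1)))
        return (1 - z) * h + z * h_tilde

def temporal_aggregate(frame_feats, cell):
    """frame_feats: (T, B, C, H, W) per-frame backbone features."""
    T, B, C, H, W = frame_feats.shape
    h = frame_feats.new_zeros(B, cell.hidden, H, W)
    for t in range(T):
        h = cell(frame_feats[t], h)
    return h  # spatiotemporal feature that a 3D detection head could consume

cell = ConvGRUCell(channels=64, hidden=64)
feats = torch.randn(4, 2, 64, 32, 32)   # 4 frames, batch of 2
fused = temporal_aggregate(feats, cell)
print(fused.shape)                       # torch.Size([2, 64, 32, 32])
```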
Citations: 0
Enhancing sampling performance in XGBoost by ensemble feature engineering
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113169
Lingping Kong , Ponnuthurai Nagaratnam Suganthan , Václav Snášel , Varun Ojha , Jeng-Shyang Pan
Feature engineering is crucial for enhancing model performance, yet effectively combining multiple feature transformations to maximize their benefits remains a key challenge. In this study, we propose an innovative approach that integrates various feature engineering techniques within the boosting steps of the XGBoost algorithm and adapts gradient-based one-sided sampling, forming an enhanced classifier named Feat-XGBoost. Feat-XGBoost aims to improve data representation and separation in model learning by iteratively applying feature transformations. We evaluated this approach on 61 diverse datasets, comparing its performance with 12 baseline classifiers, including standard XGBoost. The results show that Feat-XGBoost achieved improved accuracy on 36 datasets, with a notable accuracy increase of 0.31 on the Balloon dataset and 13.5% on the hill-valley dataset. Across the 61 datasets, the method demonstrates an average accuracy increase of 0.9080%, highlighting its effectiveness in enhancing model performance. These findings indicate that integrating multiple feature engineering strategies within the boosting framework can yield significant gains in model accuracy and robustness. We also propose a simple ensemble, the Mix-XGBoost classifier, which selects the final classifier based on validation results from both Feat-XGBoost and the baseline model. The results indicate that Mix-XGBoost enhances performance by leveraging the strengths of both classifiers. The source code will be publicly accessible after acceptance at https://github.com/lingping-fuzzy.
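The paper integrates feature engineering inside the boosting loop, which the public XGBoost API does not expose directly; the sketch below therefore only approximates the idea by stacking a few standard transformations next to the raw features before fitting an XGBoost classifier, with row subsampling as a crude stand-in for gradient-based one-sided sampling. The dataset, transformations, and hyperparameters are arbitrary choices, and the code assumes scikit-learn and xgboost are installed.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.preprocessing import QuantileTransformer
from sklearn.model_selection import train_test_split
from xgboost import XGBClassifier

# Toy data standing in for one of the 61 benchmark datasets.
X, y = make_classification(n_samples=600, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

def engineer(X_fit, X_apply):
    """Stack several simple feature transformations next to the raw features."""
    pca = PCA(n_components=5, random_state=0).fit(X_fit)
    qt = QuantileTransformer(n_quantiles=50, random_state=0).fit(X_fit)
    return np.hstack([X_apply, pca.transform(X_apply), qt.transform(X_apply)])

Xe_tr = engineer(X_tr, X_tr)
Xe_te = engineer(X_tr, X_te)

# subsample < 1.0 is only a rough proxy for the paper's one-sided sampling.
clf = XGBClassifier(n_estimators=200, max_depth=4, learning_rate=0.1, subsample=0.7)
clf.fit(Xe_tr, y_tr)
print("accuracy:", clf.score(Xe_te, y_te))
```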
Citations: 0
Data-efficient generalization for zero-shot composed image retrieval
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113187
Zining Chen , Zhicheng Zhao , Fei Su , Shijian Lu
Zero-shot Composed Image Retrieval (ZS-CIR) aims to retrieve the target image based on a reference image and a text description without requiring in-distribution triplets for training. One prevalent approach follows the vision-language pretraining paradigm, employing a mapping network to transfer the image embedding to a pseudo-word token in the text embedding space. However, this approach tends to impede network generalization due to the modality discrepancy and the distribution shift between training and inference. To this end, we propose a Data-efficient Generalization (DeG) framework with two novel designs: a Textual Supplement (TS) module and a Semantic Sample Pool (SSP) module. The TS module exploits compositional textual semantics during training, enriching the pseudo-word token with more linguistic semantics and thus mitigating the modality discrepancy effectively. The SSP module exploits the zero-shot capability of pretrained Vision-Language Models (VLMs), alleviating the distribution shift and mitigating the overfitting caused by the redundancy of large-scale image-text data. Extensive experiments over four ZS-CIR benchmarks show that DeG outperforms state-of-the-art (SOTA) methods with much less training data, and saves substantial training and inference time in practical use.
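For readers unfamiliar with the pseudo-word-token paradigm the abstract builds on, here is a minimal sketch of the mapping-network idea: project a frozen VLM image embedding into the text token space and compose it with the modification text for retrieval. The MLP design, the dimensions, and the simple additive composition are illustrative assumptions; DeG's TS and SSP modules are not reproduced here.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Img2PseudoToken(nn.Module):
    """Maps a frozen VLM image embedding to a pseudo-word token embedding
    that could be spliced into a caption such as 'a photo of <S*> that ...'."""
    def __init__(self, img_dim=512, tok_dim=512, hidden=1024):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(img_dim, hidden), nn.GELU(),
                                 nn.Linear(hidden, tok_dim))

    def forward(self, img_emb):
        return self.mlp(img_emb)

mapper = Img2PseudoToken()
img_emb = F.normalize(torch.randn(4, 512), dim=-1)   # reference-image embeddings
pseudo_tok = mapper(img_emb)                          # pseudo-word tokens, (4, 512)

# Retrieval: compose the pseudo token with the modifier-text embedding and rank
# gallery images by cosine similarity (real CIR composes via the text encoder;
# the additive composition here is a simplification).
text_emb = F.normalize(torch.randn(4, 512), dim=-1)
query = F.normalize(pseudo_tok + text_emb, dim=-1)
gallery = F.normalize(torch.randn(100, 512), dim=-1)
scores = query @ gallery.T                            # (4, 100) similarity matrix
print(scores.argmax(dim=1))                           # top-1 retrieved indices
```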
Citations: 0
Frequency-aligned supervision for few-shot neural rendering
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113183
Su-Ji Jang, Ue-Hwan Kim
Neural rendering has shown significant potential in generating high-quality 3D scenes from sparse inputs. However, existing methods struggle to simultaneously capture both low-frequency global structures and high-frequency fine details, leading to suboptimal scene representations. To overcome this limitation, we propose a frequency-aligned supervision framework that explicitly separates the learning process into low-frequency and full-spectrum components. By introducing two sub-networks and aligning supervision signals at appropriate layers, our method enhances the formation of global structures while preserving fine details. Specifically, the low-frequency network (LFN) is supervised with low-pass targets (Gaussian-filtered images) to form global structures, while the full-spectrum network (FSN) is supervised with the original images to refine high-frequency details. The proposed approach is broadly applicable to MLP-based NeRF architectures without requiring major architectural modifications. Extensive experiments demonstrate that our method consistently improves PSNR, SSIM, and LPIPS across multiple NeRF variants and datasets, confirming its robustness in sparse input scenarios.
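The core supervision scheme, low-pass targets for the LFN and full images for the FSN, can be illustrated in a few lines of NumPy/SciPy. The Gaussian sigma, the loss weights, and the stand-in "renders" below are assumptions for illustration only.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def frequency_aligned_targets(img, sigma=2.0):
    """Split a ground-truth image into the two supervision targets:
    a low-pass (Gaussian-filtered) version for the LFN and the original for the FSN."""
    low = gaussian_filter(img, sigma=(sigma, sigma, 0))  # blur H and W only
    return low, img

def mse(a, b):
    return float(np.mean((a - b) ** 2))

rng = np.random.default_rng(0)
gt = rng.random((64, 64, 3)).astype(np.float32)          # ground-truth training view
low_target, full_target = frequency_aligned_targets(gt)

# Stand-ins for the two sub-networks' rendered outputs.
lfn_render = gaussian_filter(gt, sigma=(2.5, 2.5, 0))     # mostly low frequencies
fsn_render = gt + 0.01 * rng.standard_normal(gt.shape).astype(np.float32)

# Combined objective: align each branch with its own frequency band (weights assumed).
loss = 0.5 * mse(lfn_render, low_target) + 1.0 * mse(fsn_render, full_target)
print(f"frequency-aligned loss: {loss:.6f}")
```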
Citations: 0
Joint asymmetric discrete hashing for cross-modal retrieval
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113180
Jiaxing Li , Lin Jiang , Zuopeng Yang , Xiaozhao Fang , Shengli Xie , Yong Xu
Cross-modal hashing is one of the most promising practical techniques for information retrieval over multimedia data. However, several technical hurdles remain, e.g., how to further reduce the semantic gaps between heterogeneous cross-modal data, how to extract cross-modal knowledge by jointly training on data from different modalities, and how to better leverage label information to generate more discriminative hash codes. To overcome these challenges, this paper proposes a joint asymmetric discrete hashing (JADH for short) method for cross-modal retrieval. By leveraging a kernel mapping operation, JADH extracts non-linear features of cross-modal data to better preserve semantic information when learning the latent common space. Then, a joint asymmetric hash-code learning term is customized to learn hash codes for data from different modalities jointly. As such, more cross-modal information can be preserved, which effectively reduces the heterogeneous semantic gaps. Finally, a log-likelihood similarity-preserving term is proposed to boost hash-code learning from the similarity matrix, while a classifier learning term is proposed to further improve the quality of the learned hash codes. In addition, an alternative algorithm is derived to solve the optimization problem in JADH efficiently. Experimental results on four widely used datasets show that JADH outperforms several state-of-the-art baseline methods for hashing-based cross-modal retrieval in both accuracy and efficiency.
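As a loose illustration of the kernel-mapping step the abstract mentions, the sketch below maps each modality's features onto RBF kernel similarities against randomly chosen anchor points and then binarizes random projections into hash codes. The anchors, bandwidth, code length, and random projections are placeholders; JADH learns its hash functions jointly and asymmetrically rather than using random ones.

```python
import numpy as np

def rbf_kernel_map(X, anchors, gamma=None):
    """Map samples to non-linear kernel features phi(x) = [k(x, a_1), ..., k(x, a_m)]."""
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
img_feats = rng.standard_normal((200, 128))   # image-modality features
txt_feats = rng.standard_normal((200, 64))    # text-modality features

# Per-modality kernel mapping against randomly chosen anchor points.
phi_img = rbf_kernel_map(img_feats, img_feats[rng.choice(200, 32, replace=False)])
phi_txt = rbf_kernel_map(txt_feats, txt_feats[rng.choice(200, 32, replace=False)])

# Stand-in hash functions: random projections + sign, giving 16-bit codes that
# JADH would instead learn jointly (and asymmetrically) for both modalities.
W_img = rng.standard_normal((32, 16))
W_txt = rng.standard_normal((32, 16))
B_img = np.sign(phi_img @ W_img)
B_txt = np.sign(phi_txt @ W_txt)
print(B_img.shape, B_txt.shape)               # (200, 16) (200, 16)
```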
Citations: 0
PUA: Pseudo-features made useful again for robust graph node classification under distribution shift
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113185
Zihao Yin , Zhihai Wang , Haiyang Liu , Chuanlan Li , Muyun Yao , Shijiang Li , Fangjing Li , Jia Ren , Yanchao Yang
In graph learning tasks, distributional shifts between training and test data are widely observed, rendering the conventional assumption of independent and identically distributed (i.i.d.) data invalid. Such shifts pose substantial challenges to the generalization capability of graph neural networks. Existing approaches often focus on addressing a specific type of distributional bias, such as label selection bias or structural bias, yet in real-world scenarios the nature of such biases is typically unobservable in advance. As a result, models tailored to a single bias type lack general applicability and may fail under more complex conditions. Causal feature disentanglement has emerged as a promising strategy to mitigate the influence of spurious correlations by isolating features that are causally relevant to the classification task. However, under severe bias or when causal features are incompletely identified, relying solely on these features may be insufficient to capture all informative signals, thereby limiting the model's performance. To address this challenge, we propose a novel de-biased node classification framework named PUA (Pseudo-Features Made Useful Again), which integrates causal feature disentanglement with adaptive feature fusion. Specifically, PUA employs an attention mechanism to approximate the Markov boundary (MB), thereby disentangling causal and pseudo features at the feature level. It then performs adaptive selection on pseudo features to extract auxiliary information that may assist classification. Finally, causal and pseudo features are fused via a gating mechanism, resulting in robust node representations that are more resilient to various forms of distributional bias. Notably, PUA does not require prior knowledge of the bias type, making it broadly applicable to diverse scenarios. We conduct extensive experiments on six publicly available graph datasets under different types of distributional bias, including label selection bias, structural bias, mixture bias, and low-resource scenarios. The experimental results demonstrate that PUA consistently outperforms existing methods, achieving superior classification accuracy and robustness across all bias conditions.
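A toy rendering of the disentangle-then-fuse idea is sketched below: a per-dimension attention score splits node features into causal and pseudo parts, and a learned gate decides how much pseudo signal to add back. The layer choices and dimensions are assumptions; the actual PUA model approximates the Markov boundary and performs adaptive pseudo-feature selection in a more principled way.

```python
import torch
import torch.nn as nn

class CausalPseudoSplitFuse(nn.Module):
    """Toy version of attention-based disentanglement plus gated fusion.
    A per-dimension attention score splits node features into 'causal' and
    'pseudo' parts; a learned gate decides how much pseudo signal to keep."""
    def __init__(self, dim):
        super().__init__()
        self.attn = nn.Linear(dim, dim)   # per-dimension causal attention
        self.gate = nn.Linear(2 * dim, dim)

    def forward(self, h):
        a = torch.sigmoid(self.attn(h))   # soft mask standing in for the Markov boundary
        causal, pseudo = a * h, (1 - a) * h
        g = torch.sigmoid(self.gate(torch.cat([causal, pseudo], dim=-1)))
        return causal + g * pseudo        # fused, more robust node representation

nodes = torch.randn(32, 64)               # 32 node embeddings from a GNN backbone
fused = CausalPseudoSplitFuse(64)(nodes)
print(fused.shape)                         # torch.Size([32, 64])
```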
Citations: 0
Causal-guided strength differential independence sample weighting for out-of-distribution generalization
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113179
Haoran Yu , Weifeng Liu , Yingjie Wang , Baodi Liu , Dapeng Tao , Honglong Chen
Most machine learning methods perform unreliably in the real open world due to unknown distribution shifts between the training and testing distributions. Out-of-Distribution (OOD) generalization addresses this problem by exploring invariant patterns that support stable predictions under unknown distribution shifts. One representative approach is independence-based sample weighting: it learns a set of sample weights that removes dependencies between features, eliminating spurious correlations so that the model can capture the true relationship between features and labels for stable prediction. However, existing independence sample weighting methods indiscriminately eliminate correlations between all features, resulting in the loss of critical information and degrading the model's performance. To address this problem, we propose a causal-guided independence sample weighting (CIW) algorithm. CIW first evaluates the causal effect of features on labels by constructing a cross-domain-invariant directed acyclic graph (DAG). It then generates a strength-guiding mask based on the causal effects to differentially eliminate correlations between features, avoiding redundant elimination of correlations among causal features. We perform extensive experiments in different experimental settings, and the results demonstrate the effectiveness and superiority of our method.
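The following sketch illustrates strength-differential decorrelation by sample weighting: sample weights are optimized to shrink weighted feature covariances, with each feature pair penalized in proportion to a mask derived from (assumed) causal-effect scores, so pairs involving weakly causal features are decorrelated more aggressively. The causal-effect vector, the mask construction, and the optimizer settings are illustrative assumptions; in CIW these effects would come from the learned cross-domain DAG.

```python
import torch
import torch.nn.functional as F

def weighted_cov(X, w):
    """Weighted covariance matrix of features X (n, d) under sample weights w (n,)."""
    w = w / w.sum()
    mu = (w[:, None] * X).sum(0)
    Xc = X - mu
    return Xc.T @ (w[:, None] * Xc)

torch.manual_seed(0)
X = torch.randn(500, 6)
X[:, 5] = 0.8 * X[:, 0] + 0.2 * torch.randn(500)     # a spurious, correlated feature

# Assumed causal-effect scores (placeholders for the DAG-derived effects in CIW):
causal_effect = torch.tensor([0.9, 0.8, 0.7, 0.1, 0.1, 0.05])
# Pairwise strength mask: decorrelate weakly-causal (likely spurious) pairs harder.
strength = (1 - causal_effect)[:, None] * (1 - causal_effect)[None, :]

logit_w = torch.zeros(500, requires_grad=True)
opt = torch.optim.Adam([logit_w], lr=0.05)
for _ in range(200):
    w = F.softplus(logit_w) + 1e-6                    # keep weights positive
    cov = weighted_cov(X, w)
    off_diag = cov - torch.diag(torch.diag(cov))
    loss = (strength * off_diag ** 2).sum()           # differential decorrelation
    opt.zero_grad()
    loss.backward()
    opt.step()

print("residual masked covariance:", loss.item())
```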
Citations: 0
Unsupervised multimodal emotion-unified representation learning with dual-level language-driven cross-modal emotion alignment
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113160
Shaoze Feng , Qiyin Zhou , Yuanyuan Liu , Ke Wang , Kejun Liu , Chang Tang
Unsupervised Multimodal Emotion Recognition (UMER) aims to infer affective states by integrating unannotated multimodal data, such as text, speech, and images. A key challenge lies in the substantial semantic gaps between modalities, spanning both global cross-modal emotion cues and local fine-grained emotion changes within each modality. Without annotations, existing methods struggle to effectively align and fuse cross-modal emotional semantics, resulting in suboptimal UMER performance. To address this challenge, we propose DLCEA, a Dual-level Language-Driven Cross-Modal Emotion Alignment framework for robust unsupervised multimodal emotion representation learning. DLCEA leverages the intrinsic emotion semantics in text to guide cross-modal alignment and introduces a dual-level semantic alignment scheme: Text-guided Cross-modal Global Emotion Alignment (TGEA) and Text-guided Cross-modal Local Emotion Alignment (TLEA). Specifically, the TGEA module treats text as an alignment anchor and applies text-guided contrastive learning to align the global emotional features of the audio and visual modalities with those of the text, achieving global emotion-level consistency across all three modalities. In parallel, TLEA incorporates an emotion-aware text-masking strategy and text-guided audio/video reconstruction, enabling the model to capture subtle emotional cues and reinforce local-level cross-modal consistency, thereby further addressing fine-grained emotional alignment. By jointly modeling global and local emotional alignment, DLCEA learns unified and robust multimodal emotion representations in a fully unsupervised manner. Extensive experiments on multimodal datasets such as MAFW, MOSEI, and IEMOCAP demonstrate that DLCEA outperforms existing methods by a significant margin, achieving state-of-the-art performance. These results confirm the critical role of language-driven cross-modal emotional alignment in UMER. Code is available at https://github.com/Tank9971/DLCEA.
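The global alignment term (TGEA) is essentially text-anchored contrastive learning; a minimal InfoNCE-style sketch is given below. The batch pairing, temperature, and symmetric formulation are chosen for illustration rather than taken from the paper.

```python
import torch
import torch.nn.functional as F

def text_anchored_info_nce(text, other, temperature=0.07):
    """Symmetric InfoNCE between text anchors and another modality's features.
    Matched pairs sit on the diagonal of the similarity matrix."""
    text = F.normalize(text, dim=-1)
    other = F.normalize(other, dim=-1)
    logits = text @ other.T / temperature
    targets = torch.arange(text.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.T, targets))

torch.manual_seed(0)
t = torch.randn(16, 256)     # text global emotion features (the anchor)
a = torch.randn(16, 256)     # audio global emotion features
v = torch.randn(16, 256)     # visual global emotion features

# Global alignment term: pull audio and video toward the text anchor.
loss_global = text_anchored_info_nce(t, a) + text_anchored_info_nce(t, v)
print(loss_global.item())
```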
Citations: 0
Kernel entropy graph isomorphism network for graph classification
IF 7.6 CAS Tier 1 (Computer Science) Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE Pub Date: 2026-01-28 DOI: 10.1016/j.patcog.2026.113182
Lixiang Xu , Wei Ge , Feiping Nie , Enhong Chen , Bin Luo
Graph neural networks (GNNs) have been successfully applied to many graph classification tasks. However, most GNNs are based on message-passing neural network (MPNN) frameworks, making it difficult to utilize the structural information of a graph from multiple perspectives. To address the limitations of existing GNN methods, we incorporate structural information into the graph embedding representation in two ways. On the one hand, the subgraph information in the neighborhood of a node is incorporated into the message-passing process of the GNN through graph entropy. On the other hand, we encode the path information in the graph with the help of an improved shortest path kernel. These two kinds of structural information are then fused through an attention mechanism, which captures the structural information of the graph and thus enriches the structural expressiveness of the graph neural network. Finally, the model is experimentally evaluated on seven publicly available graph classification datasets. Extensive experiments show that, compared with existing graph representation models, our model obtains better graph representations and achieves more competitive performance.
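The final fusion step, combining the entropy-aware message-passing embedding with the shortest-path-kernel embedding via attention, can be sketched as a small two-branch attention module over precomputed per-graph embeddings. The scoring function and dimensions below are assumptions for illustration, not the paper's architecture.

```python
import torch
import torch.nn as nn

class TwoBranchAttentionFusion(nn.Module):
    """Fuses a message-passing (entropy-weighted) graph embedding with a
    shortest-path-kernel embedding via learned attention coefficients."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, h_mp, h_sp):
        stacked = torch.stack([h_mp, h_sp], dim=1)         # (B, 2, d)
        alpha = torch.softmax(self.score(stacked), dim=1)  # (B, 2, 1) attention weights
        return (alpha * stacked).sum(dim=1)                # (B, d) fused embedding

h_entropy = torch.randn(8, 128)   # per-graph embedding from an entropy-aware GIN branch
h_pathker = torch.randn(8, 128)   # per-graph embedding from a shortest-path-kernel branch
fused = TwoBranchAttentionFusion(128)(h_entropy, h_pathker)
print(fused.shape)                 # torch.Size([8, 128])
```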
Citations: 0