
Latest publications in Pattern Recognition

PRGS: Patch-to-Region Graph Search for Visual Place Recognition
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-24 | DOI: 10.1016/j.patcog.2025.111673
Weiliang Zuo, Liguo Liu, Yizhe Li, Yanqing Shen, Fuhua Xiang, Jingmin Xin, Nanning Zheng
Visual Place Recognition (VPR) is the task of estimating a target location from visual information in changing scenarios, and it usually adopts a two-stage strategy of global retrieval followed by reranking. Existing reranking methods in VPR establish only a single correspondence between the query image and each candidate image, largely overlooking the neighbor correspondences among retrieved candidate images that can help enhance reranking. In this paper, we propose a Patch-to-Region Graph Search (PRGS) method to enhance reranking using neighbor correspondences in candidate images. Firstly, considering that searching for neighbor correspondences relies on important features, we design a Patch-to-Region (PR) module, which aggregates patch-level features into region-level features to highlight important features. Secondly, to estimate the candidate image reranking score using the neighbor correspondences, we design a Graph Search (GS) module, which establishes the neighbor correspondences among all candidate and query images in graph space. Moreover, PRGS integrates well with both CNN and transformer backbones. We achieve competitive performance on several benchmarks, offering a 64% improvement in matching time and an approximately 59% reduction in FLOPs compared to state-of-the-art methods. The code is released at https://github.com/LKELN/PRGS.
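The abstract only sketches the two modules, so the snippet below is a minimal PyTorch illustration of the general idea under stated assumptions: patch features are pooled onto a few learned region queries, and the reranking score mixes direct query-candidate similarity with one step of score propagation over a kNN graph built among the candidates. The module names, shapes, and the single-step propagation rule are illustrative guesses, not the released PRGS code.

```python
# Hypothetical sketch of patch-to-region aggregation and graph-based reranking.
# Shapes, module names, and the propagation rule are illustrative assumptions.
import torch
import torch.nn.functional as F


def patch_to_region(patch_feats: torch.Tensor, region_queries: torch.Tensor) -> torch.Tensor:
    """Aggregate patch-level features (B, N, D) into region-level features (B, R, D)
    by attending patches to a small set of learned region queries (R, D)."""
    attn = torch.einsum("rd,bnd->brn", region_queries, patch_feats) / patch_feats.shape[-1] ** 0.5
    attn = attn.softmax(dim=-1)                               # each region attends over patches
    return torch.einsum("brn,bnd->brd", attn, patch_feats)


def graph_search_rerank(query_regions, cand_regions, k=3, alpha=0.5):
    """Rerank candidates by combining direct query-candidate similarity with
    similarity propagated over a kNN graph built among the candidates."""
    q = F.normalize(query_regions.flatten(1), dim=-1)         # (1, R*D)
    c = F.normalize(cand_regions.flatten(1), dim=-1)          # (M, R*D)
    direct = (c @ q.t()).squeeze(-1)                          # (M,) query->candidate scores
    cand_sim = c @ c.t()                                      # (M, M) candidate-candidate graph
    topk = cand_sim.topk(k + 1, dim=-1).indices[:, 1:]        # neighbours, excluding self
    adj = torch.zeros_like(cand_sim).scatter_(1, topk, 1.0)
    adj = adj / adj.sum(dim=-1, keepdim=True)                 # row-normalised adjacency
    propagated = adj @ direct                                 # one step of score diffusion
    return alpha * direct + (1 - alpha) * propagated


if __name__ == "__main__":
    region_queries = torch.randn(8, 256)                      # 8 learned region tokens
    query_patches = torch.randn(1, 196, 256)                  # e.g. 14x14 ViT patches
    cand_patches = torch.randn(10, 196, 256)                  # 10 retrieved candidates
    q_reg = patch_to_region(query_patches, region_queries)
    c_reg = patch_to_region(cand_patches, region_queries)
    print(graph_search_rerank(q_reg, c_reg).argsort(descending=True))
```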
{"title":"PRGS: Patch-to-Region Graph Search for Visual Place Recognition","authors":"Weiliang Zuo,&nbsp;Liguo Liu,&nbsp;Yizhe Li,&nbsp;Yanqing Shen,&nbsp;Fuhua Xiang,&nbsp;Jingmin Xin,&nbsp;Nanning Zheng","doi":"10.1016/j.patcog.2025.111673","DOIUrl":"10.1016/j.patcog.2025.111673","url":null,"abstract":"<div><div>Visual Place Recognition (VPR) is a task to estimate the target location based on visual information in changing scenarios, which usually uses a two-stage strategy of global retrieval and reranking. Existing reranking methods in VPR establish a single correspondence between the query image and the candidate images for reranking, which almost overlooks the neighbor correspondences in retrieved candidate images that can help to enhance reranking. In this paper, we propose a <strong>P</strong>atch-to-<strong>R</strong>egion <strong>G</strong>raph <strong>S</strong>earch (PRGS) method to enhance reranking using neighbor correspondences in candidate images. Firstly, considering that searching for neighbor correspondences relies on important features, we design a <strong>P</strong>atch-to-<strong>R</strong>egion (PR) module, which aggregates patch level features into region level features for highlighting important features. Secondly, to estimate the candidate image reranking score using the neighbor correspondences, we design a <strong>G</strong>raph <strong>S</strong>earch (GS) module, which establishes the neighbor correspondences among all candidates and query images in graph space. What is more, PRGS integrates well with both CNN and transformer backbone. We achieve competitive performance on several benchmarks, offering a 64% improvement in matching time and approximately 59% reduction in FLOPs compared to state-of-the-art methods. The code is released at <span><span>https://github.com/LKELN/PRGS</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111673"},"PeriodicalIF":7.5,"publicationDate":"2025-04-24","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143874965","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cross-Modality Interactive Attention Network for AI-generated image quality assessment
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-23 | DOI: 10.1016/j.patcog.2025.111693
Tianwei Zhou, Songbai Tan, Leida Li, Baoquan Zhao, Qiuping Jiang, Guanghui Yue
Recently, AI-generative techniques have revolutionized image creation, prompting the need for AI-generated image (AGI) quality assessment. This paper introduces CIA-Net, a Cross-modality Interactive Attention Network, for blind AGI quality evaluation. Using a multi-task framework, CIA-Net processes text and image inputs to output consistency, visual quality, and authenticity scores. Specifically, CIA-Net first encodes the two-modal data to obtain textual and visual embeddings. Next, for consistency score prediction, it computes the similarity between these two kinds of embeddings, since consistency reflects text-to-image alignment. For visual quality prediction, it fuses the textual and visual embeddings using a well-designed cross-modality interactive attention module. For authenticity score prediction, it constructs a textual template containing authenticity labels and computes the joint probability from the similarity between the textual embedding of each element and the visual embeddings. Experimental results show that CIA-Net is more competent for the AGI quality assessment task than 11 state-of-the-art competing methods.
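As a rough illustration of the three heads described above, the sketch below computes a consistency score as text-image cosine similarity, a quality score from a cross-attention fusion of the two modalities, and an authenticity probability from similarities to textual authenticity labels. The encoders, dimensions, and head designs are placeholder assumptions rather than the CIA-Net architecture.

```python
# Illustrative sketch of the three prediction heads described in the abstract.
# The encoders are stand-ins; dimensions and head designs are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ThreeHeadAGIQA(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        # Cross-modality interactive attention: the text prompt attends to image tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.quality_head = nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, 1))

    def forward(self, text_emb, image_tokens, auth_label_embs):
        """text_emb: (B, D) prompt embedding, image_tokens: (B, N, D) visual tokens,
        auth_label_embs: (K, D) embeddings of textual authenticity labels."""
        image_emb = image_tokens.mean(dim=1)                    # pooled visual embedding

        # 1) Consistency: text-to-image alignment as cosine similarity.
        consistency = F.cosine_similarity(text_emb, image_emb, dim=-1)

        # 2) Visual quality: fuse modalities with cross attention, then regress a score.
        fused, _ = self.cross_attn(text_emb.unsqueeze(1), image_tokens, image_tokens)
        quality = self.quality_head(fused.squeeze(1)).squeeze(-1)

        # 3) Authenticity: probability over authenticity labels, CLIP-style.
        sims = F.normalize(image_emb, dim=-1) @ F.normalize(auth_label_embs, dim=-1).t()
        authenticity = sims.softmax(dim=-1)                     # (B, K)
        return consistency, quality, authenticity


if __name__ == "__main__":
    model = ThreeHeadAGIQA()
    out = model(torch.randn(2, 512), torch.randn(2, 196, 512), torch.randn(2, 512))
    print([o.shape for o in out])
```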
{"title":"Cross-Modality Interactive Attention Network for AI-generated image quality assessment","authors":"Tianwei Zhou ,&nbsp;Songbai Tan ,&nbsp;Leida Li ,&nbsp;Baoquan Zhao ,&nbsp;Qiuping Jiang ,&nbsp;Guanghui Yue","doi":"10.1016/j.patcog.2025.111693","DOIUrl":"10.1016/j.patcog.2025.111693","url":null,"abstract":"<div><div>Recently, AI-generative techniques have revolutionized image creation, prompting the need for AI-generated image (AGI) quality assessment. This paper introduces CIA-Net, a Cross-modality Interactive Attention Network, for blind AGI quality evaluation. Using a multi-task framework, CIA-Net processes text and image inputs to output consistency, visual quality, and authenticity scores. Specifically, CIA-Net first encodes two-modal data to obtain textual and visual embeddings. Next, for consistency score prediction, it computes the similarity between these two kinds of embeddings in view of that text-to-image alignment. For visual quality prediction, it fuses textural and visual embeddings using a well-designed cross-modality interactive attention module. For authenticity score prediction, it constructs a textural template that contains authenticity labels and computes the joint probability from the similarity between the textural embeddings of each element and the visual embeddings. Experimental results show that CIA-Net is more competent for the AGI quality assessment task than 11 state-of-the-art competing methods.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"167 ","pages":"Article 111693"},"PeriodicalIF":7.5,"publicationDate":"2025-04-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143873287","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Multi-channel set polynomial based label regularized graph neural networks against extreme data scarcity
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-21 | DOI: 10.1016/j.patcog.2025.111754
Jingxiao Zhang, Shifei Ding, Jian Zhang, Lili Guo, Ling Ding
Graph Neural Networks (GNNs) are one of the commonly used methods for semi-supervised node classification. Their advantage lies in modeling the relational information in the data and propagating the feature information of labeled nodes to unlabeled nodes in the graph, thereby predicting their labels. However, current research results indicate that existing models perform poorly when labeled data are extremely limited. To address this problem, we introduce a label regularization method and propose a multi-channel set polynomial based label regularized graph neural network against extreme data scarcity (MSP-LR). It consists of two components: a basic learning module based on multi-channel set polynomials and a label regularization module. Specifically, we use the basic module to expand the model's receptive field and obtain pseudo-labels for all nodes. For labeled nodes, we replace the obtained pseudo-label information with their initial label information. In the label regularization module, we impose regularization constraints on unlabeled nodes based on the clustering assumption to improve the reliability of labels. Experimental results on two homogeneous graphs and four heterogeneous graphs with different labeling rates demonstrate the effectiveness of this model.
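A minimal sketch of the label-regularization component follows, under the assumption that it can be expressed as a neighbourhood-consistency penalty on unlabeled nodes (the clustering assumption) plus pseudo-labels that are overridden by ground truth on labeled nodes; the multi-channel set-polynomial backbone itself is not reproduced, and any GNN producing node logits would slot in.

```python
# Minimal sketch of label regularization: pseudo-labels for all nodes, ground
# truth overriding them on labeled nodes, and a smoothness penalty that pushes
# unlabeled nodes toward their neighbours' aggregated predictions.
import torch
import torch.nn.functional as F


def label_regularized_loss(logits, labels, labeled_mask, adj_norm, lam=0.5):
    """logits: (N, C) from a base GNN, labels: (N,) valid where labeled_mask is
    True, adj_norm: (N, N) row-normalised adjacency."""
    probs = logits.softmax(dim=-1)

    # Pseudo-labels for every node; labeled nodes keep their true labels.
    pseudo = probs.argmax(dim=-1)
    pseudo = torch.where(labeled_mask, labels, pseudo)

    # Supervised term on labeled nodes.
    sup = F.cross_entropy(logits[labeled_mask], labels[labeled_mask])

    # Regularization on unlabeled nodes: predictions should agree with the
    # aggregated predictions of their neighbours (local cluster consistency).
    neigh = adj_norm @ probs
    reg = F.kl_div(probs[~labeled_mask].log(), neigh[~labeled_mask], reduction="batchmean")
    return sup + lam * reg, pseudo


if __name__ == "__main__":
    N, C = 6, 3
    logits = torch.randn(N, C, requires_grad=True)
    labels = torch.tensor([0, 1, 2, 0, 0, 0])
    labeled_mask = torch.tensor([True, True, True, False, False, False])
    adj = torch.rand(N, N)
    adj_norm = adj / adj.sum(dim=-1, keepdim=True)
    loss, pseudo = label_regularized_loss(logits, labels, labeled_mask, adj_norm)
    loss.backward()
    print(loss.item(), pseudo.tolist())
```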
{"title":"Multi-channel set polynomial based label regularized graph neural networks against extreme data scarcity","authors":"Jingxiao Zhang ,&nbsp;Shifei Ding ,&nbsp;Jian Zhang ,&nbsp;Lili Guo ,&nbsp;Ling Ding","doi":"10.1016/j.patcog.2025.111754","DOIUrl":"10.1016/j.patcog.2025.111754","url":null,"abstract":"<div><div>Graph Neural Networks (GNNs) are one of the commonly used methods for semi-supervised node classification. Their advantage lies in modeling the relational information in the data and propagating the feature information of labeled nodes to unlabeled nodes in the graph, thereby predicting their labels. However, current research results indicate that existing models perform poorly when labeled data are extremely limited. To address this problem, we introduce a label regularization method and propose a <strong>m</strong>ulti-channel <strong>s</strong>et <strong>p</strong>olynomial based <strong>l</strong>abel <strong>r</strong>egularized graph neural network against extreme data scarcity <strong>(MSP-LR)</strong>. It consists of two components: a basic learning module based on multi-channel set polynomials and a label regularization module. Specifically, we use the basic module to expand the model's receptive field and obtain pseudo-labels for all nodes. For labeled nodes, we replace the obtained pseudo-label information with their initial label information. In the label regularization module, we impose regularization constraints on unlabeled nodes based on the clustering assumption to improve the reliability of labels. Experimental results on two homogeneous graphs and four heterogeneous graphs with different labeling rates demonstrate the effectiveness of this model.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111754"},"PeriodicalIF":7.5,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143875036","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Growing-before-pruning: A progressive neural architecture search strategy via group sparsity and deterministic annealing
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-21 | DOI: 10.1016/j.patcog.2025.111697
Xiaotong Lu, Weisheng Dong, Zhenxuan Fang, Jie Lin, Xin Li, Guangming Shi
Network pruning is a widely studied technique for obtaining compact representations from over-parameterized deep convolutional neural networks. Existing pruning methods are based on finding an optimal combination of pruned filters in a fixed search space. However, the optimality of those methods is often questionable due to the limited search space and pruning choices, e.g., the difficulty of removing an entire layer and the risk of unexpected performance degradation. Inspired by the exploration vs. exploitation trade-off in reinforcement learning, we propose to reconstruct the filter space without increasing the model capacity and to prune filters by exploiting group sparsity. Our approach challenges the conventional wisdom by advocating a strategy of Growing-before-Pruning (GbP), which allows us to explore more space before exploiting the power of architecture search. Meanwhile, to achieve more efficient pruning, we propose to measure the importance of filters by global group sparsity, which extends the existing Gaussian scale mixture model. Such a global characterization of sparsity in the filter space leads to a novel deterministic annealing strategy for progressively pruning the filters. We have evaluated our method on several popular datasets and network architectures. Our extensive experimental results show that the proposed method advances the current state of the art.
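The sketch below illustrates two of the ingredients named in the abstract in simplified form: per-filter group sparsity (an L2 norm over each filter's weights) as the importance measure, and a temperature schedule that progressively hardens a soft keep/prune mask in the spirit of deterministic annealing. The growing phase and the Gaussian scale mixture prior are omitted, and the formulas are illustrative assumptions, not the paper's.

```python
# Sketch of filter importance via group sparsity and an annealing-style
# schedule that progressively hardens the pruning mask.
import torch


def filter_group_norms(conv_weight: torch.Tensor) -> torch.Tensor:
    """conv_weight: (out_channels, in_channels, kH, kW) -> per-filter L2 norm."""
    return conv_weight.flatten(1).norm(p=2, dim=1)


def annealed_soft_mask(importance: torch.Tensor, keep_ratio: float, temperature: float) -> torch.Tensor:
    """Soft keep/prune mask; as the temperature decays toward 0 the sigmoid
    approaches a hard threshold at the keep_ratio quantile of the scores."""
    threshold = torch.quantile(importance, 1.0 - keep_ratio)
    return torch.sigmoid((importance - threshold) / max(temperature, 1e-6))


if __name__ == "__main__":
    weight = torch.randn(64, 32, 3, 3)                       # one conv layer's filters
    imp = filter_group_norms(weight)
    for step, temp in enumerate([1.0, 0.3, 0.1, 0.01]):      # annealing schedule
        mask = annealed_soft_mask(imp, keep_ratio=0.5, temperature=temp)
        kept = (mask > 0.5).sum().item()
        print(f"step {step}: temperature={temp}, filters kept={kept}")
    # The (nearly) binary mask at low temperature marks which filters to prune.
```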
{"title":"Growing-before-pruning: A progressive neural architecture search strategy via group sparsity and deterministic annealing","authors":"Xiaotong Lu ,&nbsp;Weisheng Dong ,&nbsp;Zhenxuan Fang ,&nbsp;Jie Lin ,&nbsp;Xin Li ,&nbsp;Guangming Shi","doi":"10.1016/j.patcog.2025.111697","DOIUrl":"10.1016/j.patcog.2025.111697","url":null,"abstract":"<div><div>Network pruning is a widely studied technique of obtaining compact representations from over-parameterized deep convolutional neural networks. Existing pruning methods are based on finding an optimal combination of pruned filters in the fixed search space. However, the optimality of those methods is often questionable due to limited search space and pruning choices - e.g., the difficulty with removing the entire layer and the risk of unexpected performance degradation. Inspired by the exploration vs. exploitation trade-off in reinforcement learning, we propose to reconstruct the filter space without increasing the model capacity and prune them by exploiting group sparsity. Our approach challenges the conventional wisdom by advocating the strategy of Growing-before-Pruning (GbP), which allows us to explore more space before exploiting the power of architecture search. Meanwhile, to achieve more efficient pruning, we propose to measure the importance of filters by global group sparsity, which extends the existing Gaussian scale mixture model. Such global characterization of sparsity in the filter space leads to a novel deterministic annealing strategy for progressively pruning the filters. We have evaluated our method on several popular datasets and network architectures. Our extensive experiment results have shown that the proposed method advances the current state-of-the-art.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111697"},"PeriodicalIF":7.5,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143875037","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
A framework for global role-based author name disambiguation
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-21 | DOI: 10.1016/j.patcog.2025.111703
Lan Huang, Jiayuan Zhang, Bo Wang, Zixu Li, Shu Wang, Rui Zhang
The academic community has long been confronted with the issue of Author Name Disambiguation (AND), where different authors share the same name. Most existing methods formalize AND as a task of clustering papers, based on the assumption that the more similar two papers are, the more likely they are to be the work of the same researcher. This paper introduces GRAND, a framework for global role-based author name disambiguation. It redefines the AND problem by distinguishing between a real-world researcher and the author roles he or she plays, formalizing it as a role-player matching problem. Furthermore, it proposes an embedding and clustering strategy based on meta-paths, combined with a global coauthor sampling algorithm to address ambiguity in coauthor pairs. Finally, a set of rule-based metrics is employed to match real-world researchers with their author roles. The innovation of GRAND lies in its combination of a global meta-path embedding method and rule-based author mapping. It effectively handles fuzzy coauthor relationships, combines local and global information, and improves disambiguation by distinguishing between researchers and the author roles they play. The experimental results show that GRAND outperforms several state-of-the-art approaches, with the F1-score improving by 0.49% to 5.45% across the three datasets.
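A toy sketch of the overall pipeline follows, under heavy simplification: papers carrying an ambiguous name are clustered into author roles, and each role is matched to a known researcher by a simple coauthor-overlap rule. The meta-path embedding and global coauthor sampling are replaced here by a plain coauthor bag-of-words, so this is only a structural illustration, not the GRAND method; all names and profiles are made up.

```python
# Toy sketch: (1) cluster papers with an ambiguous author name into "author
# roles", (2) match each role to a known researcher by coauthor overlap.
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import CountVectorizer

papers = [                                   # papers whose author list contains "Wei Zhang"
    {"id": 0, "coauthors": ["li ming", "chen hua"]},
    {"id": 1, "coauthors": ["li ming", "zhao lei"]},
    {"id": 2, "coauthors": ["john smith", "anna lee"]},
    {"id": 3, "coauthors": ["anna lee"]},
]
researchers = {                              # known (hypothetical) researcher profiles
    "Wei Zhang (Univ. A)": {"li ming", "chen hua", "zhao lei"},
    "Wei Zhang (Univ. B)": {"john smith", "anna lee"},
}

# Stage 1: embed each paper by its coauthor set and cluster into author roles.
vec = CountVectorizer(tokenizer=lambda s: s.split(";"), token_pattern=None)
X = vec.fit_transform([";".join(p["coauthors"]) for p in papers]).toarray()
roles = AgglomerativeClustering(n_clusters=2).fit_predict(X)

# Stage 2: rule-based matching of each role to the researcher with the largest
# coauthor overlap.
for role in sorted(set(roles)):
    role_coauthors = {c for p, r in zip(papers, roles) if r == role for c in p["coauthors"]}
    best = max(researchers, key=lambda name: len(researchers[name] & role_coauthors))
    ids = [p["id"] for p, r in zip(papers, roles) if r == role]
    print(f"role {role} -> {best}, papers {ids}")
```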
{"title":"A framework for global role-based author name disambiguation","authors":"Lan Huang,&nbsp;Jiayuan Zhang,&nbsp;Bo Wang,&nbsp;Zixu Li,&nbsp;Shu Wang,&nbsp;Rui Zhang","doi":"10.1016/j.patcog.2025.111703","DOIUrl":"10.1016/j.patcog.2025.111703","url":null,"abstract":"<div><div>The academic community has long been confronted with the issue of Author Name Disambiguation (AND), where different authors share the same name. Most existing methods formalize AND as a task of clustering papers, based on the assumption that the more similar the papers are, the more likely they are to be the work of the same researcher. This paper introduces a framework for global role-based author name disambiguation, GRAND. It redefines the problem of AND by distinguishing between a real-world researcher and the author roles he/she plays, formalizing it as a role player matching problem. Furthermore, it proposes an embedding and clustering strategy based on meta-path, combined with a global coauthor sampling algorithm to address ambiguity in coauthor pairs. Finally, a set of rule-based metrics are employed to match real-world researchers with their author roles. The innovation of GRAND lies in its combination of global meta-path embedding method and rule-based author mapping. It effectively handles fuzzy coauthor relationships. In addition, it combines local and global information, and it improves disambiguation by distinguishing between researchers and the author roles they plays. The experimental results show GRAND out-performs several state-of-the-art approaches, with the F1-score improving by 0.49% to 5.45% across the three datasets.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111703"},"PeriodicalIF":7.5,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143863413","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Unsupervised anomaly detection with a temporal continuation, confidence-aware VAE-GAN
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-21 | DOI: 10.1016/j.patcog.2025.111699
Zeyu Xing, Owais Mehmood, William A.P. Smith
We propose an unsupervised approach to anomaly detection in data with a temporal dimension. We adapt the VAE-GAN architecture to learn the proxy task of temporal sequence continuation. Rather than reconstructing the input, our variational decoder decodes to a forecast of the future sequence. In order to separate structural uncertainty (which our model can reconstruct by fitting to observed data) from stochastic uncertainty (which it cannot), we introduce an additional decoder that outputs the pointwise confidence of the prediction, after the optimal latent variable has been found. We can use this for zero-shot anomaly detection, separating anomalies from stochastic variation that cannot be modelled, without any examples. This is important for domains in which anomalies are so rare that it is not possible or meaningful to train a supervised model. As an example of such a domain, we introduce a new dataset comprising linescan imagery of railway lines, which we use to illustrate our methods. We also achieve state-of-the-art performance on the ECG5000 and MIT-BIH time series anomaly detection datasets. We make an implementation of our method available at https://github.com/YorkXingZeyu/ECG-VAEGAN-Project.
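The scoring idea can be pictured with a small forecaster that outputs a per-timestep mean and a per-timestep confidence, and flags points whose error is large relative to that confidence. The network below and the Gaussian-NLL-style score are assumptions for illustration; the VAE-GAN training, latent-variable search, and discriminator are omitted (see the linked repository for the actual implementation).

```python
# Sketch of confidence-aware forecasting and scoring: large errors are
# forgiven where the predicted variance (confidence decoder output) is high.
import torch
import torch.nn as nn


class ConfidenceAwareForecaster(nn.Module):
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.encoder = nn.GRU(1, hidden, batch_first=True)
        self.mean_head = nn.Linear(hidden, 1)
        self.logvar_head = nn.Linear(hidden, 1)      # pointwise confidence decoder

    def forward(self, past: torch.Tensor, horizon: int):
        """past: (B, T, 1). Returns mean and log-variance forecasts of shape (B, horizon)."""
        _, h = self.encoder(past)
        h = h[-1].unsqueeze(1).expand(-1, horizon, -1)
        return self.mean_head(h).squeeze(-1), self.logvar_head(h).squeeze(-1)


def anomaly_score(mu, logvar, future):
    """Gaussian NLL per timestep: error normalised by the predicted variance."""
    return 0.5 * (logvar + (future - mu) ** 2 / logvar.exp())


if __name__ == "__main__":
    model = ConfidenceAwareForecaster()
    past = torch.sin(torch.linspace(0, 6.28, 50)).view(1, 50, 1)
    future = torch.sin(torch.linspace(6.28, 7.5, 10)).view(1, 10)
    future[0, 5] += 3.0                              # inject an anomalous spike
    mu, logvar = model(past, horizon=10)
    print(anomaly_score(mu, logvar, future).detach().round(decimals=2))
```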
Citations: 0
Knowledge-enhanced and structure-enhanced representation learning for protein–ligand binding affinity prediction
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-21 | DOI: 10.1016/j.patcog.2025.111701
Mei Li, Ye Cao, Xiaoguang Liu, Hua Ji
Protein–ligand binding affinity (PLA) prediction is a fundamental preliminary stage in drug discovery and development. Existing methods mainly focus on structure-free prediction of binding affinities, and structural PLA prediction has not yet been fully explored. Spatial structures of protein–ligand complexes are critical in determining binding affinities. A few graph neural network (GNN) based methods model the spatial structures of complexes with pairwise atomic distances within a cutoff, which provides insufficient spatial descriptions and limits their ability to distinguish between certain molecules. In this paper, we propose a knowledge-enhanced and structure-enhanced representation learning method (KSM) for structural PLA prediction. The proposed KSM has a specially designed structure-based GNN (KSGNN) that learns complete representations for PLA prediction by combining sequence and structure information of complexes. Notably, KSGNN learns structure-aware representations by incorporating relative spatial information of distances and angles among atoms into the message passing. Additionally, we adopt an attentive pooling layer (APL) to further refine structural patterns in complexes. We compare KSM against 18 state-of-the-art baselines on two benchmarks. KSM outperforms its competitors with RMSE improvements of 0.0536 and 0.19 on the PDBbind core set and the CSAR-HiQ dataset, respectively, demonstrating its superiority in binding affinity prediction.
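As a simplified illustration of distance-aware message passing, the sketch below expands pairwise atomic distances with radial basis functions and feeds them into the edge messages; angle features and the attentive pooling layer are left out, and all sizes and cutoffs are assumptions rather than the KSGNN design.

```python
# Minimal distance-aware message passing: pairwise distances are expanded with
# Gaussian radial basis functions and concatenated into the edge messages.
import torch
import torch.nn as nn


def rbf_expand(dist, centers, gamma=10.0):
    """dist: (E,), centers: (K,) -> (E, K) Gaussian radial basis features."""
    return torch.exp(-gamma * (dist.unsqueeze(-1) - centers) ** 2)


class DistanceAwareMP(nn.Module):
    def __init__(self, dim=32, num_rbf=16, cutoff=5.0):
        super().__init__()
        self.centers = nn.Parameter(torch.linspace(0.0, cutoff, num_rbf), requires_grad=False)
        self.msg = nn.Sequential(nn.Linear(2 * dim + num_rbf, dim), nn.SiLU(), nn.Linear(dim, dim))

    def forward(self, h, pos, edge_index):
        """h: (N, dim) atom features, pos: (N, 3) coordinates, edge_index: (2, E)."""
        src, dst = edge_index
        dist = (pos[src] - pos[dst]).norm(dim=-1)
        edge_feat = rbf_expand(dist, self.centers)
        messages = self.msg(torch.cat([h[src], h[dst], edge_feat], dim=-1))
        out = torch.zeros_like(h)
        out.index_add_(0, dst, messages)             # aggregate messages at receivers
        return h + out


if __name__ == "__main__":
    N = 5
    h, pos = torch.randn(N, 32), torch.randn(N, 3)
    # Fully connected graph over the atoms (no self loops) for this toy example.
    src, dst = torch.meshgrid(torch.arange(N), torch.arange(N), indexing="ij")
    mask = src != dst
    edge_index = torch.stack([src[mask], dst[mask]])
    print(DistanceAwareMP()(h, pos, edge_index).shape)
```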
{"title":"Knowledge-enhanced and structure-enhanced representation learning for protein–ligand binding affinity prediction","authors":"Mei Li ,&nbsp;Ye Cao ,&nbsp;Xiaoguang Liu ,&nbsp;Hua Ji","doi":"10.1016/j.patcog.2025.111701","DOIUrl":"10.1016/j.patcog.2025.111701","url":null,"abstract":"<div><div>Protein–ligand binding affinity (PLA) prediction is a fundamental preliminary stage in drug discovery and development. Existing methods mainly focus on structure-free prediction of binding affinities and the investigation of structural PLA prediction is not fully explored yet. Spatial structures of protein–ligand complexes are critical in determining binding affinities. A few graph neural network (GNN) based methods model spatial structures of complexes with pairwise atomic distances within a cutoff, which provides insufficient spatial descriptions and limits their capabilities in distinguishing between certain molecules. In this paper, we propose a knowledge-enhanced and structure-enhanced representation learning method (KSM) for structural PLA prediction. The proposed KSM has a specially designed structure-based GNN (KSGNN) to learn complete representations for PLA prediction by combining sequence and structure information of complexes. Notably, KSGNN is capable of learning structure-aware representations via incorporating relative spatial information of distances and angles among atoms into the message passing. Additionally, we adopt an attentive pooling layer (APL) to further refine structural patterns in complexes. We compare KSM against 18 state-of-the-art baselines on two benchmarks. KSM outperforms its competitors with improvements of 0.0536 and 0.19 on the PDBbind core set and the CSAR-HiQ dataset, respectively, in terms of the metric of RMSE, demonstrating its superiority in binding affinity prediction.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111701"},"PeriodicalIF":7.5,"publicationDate":"2025-04-21","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143863414","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
OIL-AD: An anomaly detection framework for decision-making sequences
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-19 | DOI: 10.1016/j.patcog.2025.111656
Chen Wang, Sarah Erfani, Tansu Alpcan, Christopher Leckie
Anomaly detection in decision-making sequences is a challenging problem due to the complexity of normality representation learning and the sequential nature of the task. Most existing methods based on Reinforcement Learning (RL) are difficult to implement in the real world due to unrealistic assumptions, such as having access to environment dynamics, reward signals, and online interactions with the environment. To address these limitations, we propose an unsupervised method named Offline Imitation Learning based Anomaly Detection (OIL-AD), which detects anomalies in decision-making sequences using two extracted behaviour features: action optimality and sequential association. Our offline learning model is an adaptation of behavioural cloning with a transformer policy network, where we modify the training process to learn a Q function and a state value function from normal trajectories. We propose that the Q function and the state value function can provide sufficient information about agents’ behavioural data, from which we derive two features for anomaly detection. The intuition behind our method is that the action optimality feature derived from the Q function can differentiate the optimal action from others at each local state, and the sequential association feature derived from the state value function has the potential to maintain the temporal correlations between decisions (state–action pairs). Our experiments show that OIL-AD can achieve outstanding online anomaly detection performance with up to 34.8% improvement in F1 score over comparable baselines. The source code is available on https://github.com/chenwang4/OILAD.
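The two behaviour features can be pictured with the toy code below: the action-optimality feature is taken as the gap between the best Q value and the Q value of the action actually taken, and the sequential-association feature is approximated here by the change in state value between consecutive steps. The second definition in particular is only an assumption about its form, and the networks are untrained stand-ins for the ones learned by offline behavioural cloning; see the linked repository for the actual method.

```python
# Toy sketch of the two behaviour features named in the abstract, computed
# from a Q network and a state value network (untrained stand-ins here).
import torch
import torch.nn as nn

STATE_DIM, NUM_ACTIONS = 4, 3
q_net = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(), nn.Linear(32, NUM_ACTIONS))
v_net = nn.Sequential(nn.Linear(STATE_DIM, 32), nn.ReLU(), nn.Linear(32, 1))


def behaviour_features(states: torch.Tensor, actions: torch.Tensor):
    """states: (T, state_dim), actions: (T,) discrete actions taken."""
    q = q_net(states)                                           # (T, num_actions)
    q_taken = q.gather(1, actions.unsqueeze(-1)).squeeze(-1)    # Q(s_t, a_t)
    action_optimality = q.max(dim=-1).values - q_taken          # 0 when the taken action is optimal
    v = v_net(states).squeeze(-1)                               # V(s_t)
    sequential_association = (v[1:] - v[:-1]).abs()             # assumed temporal-consistency proxy
    return action_optimality[:-1], sequential_association


if __name__ == "__main__":
    T = 8
    states = torch.randn(T, STATE_DIM)
    actions = torch.randint(0, NUM_ACTIONS, (T,))
    opt, seq = behaviour_features(states, actions)
    # A simple combined anomaly score per transition (equal weights, arbitrary choice).
    score = opt + seq
    print(score.detach())
```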
{"title":"OIL-AD: An anomaly detection framework for decision-making sequences","authors":"Chen Wang ,&nbsp;Sarah Erfani ,&nbsp;Tansu Alpcan ,&nbsp;Christopher Leckie","doi":"10.1016/j.patcog.2025.111656","DOIUrl":"10.1016/j.patcog.2025.111656","url":null,"abstract":"<div><div>Anomaly detection in decision-making sequences is a challenging problem due to the complexity of normality representation learning and the sequential nature of the task. Most existing methods based on Reinforcement Learning (RL) are difficult to implement in the real world due to unrealistic assumptions, such as having access to environment dynamics, reward signals, and online interactions with the environment. To address these limitations, we propose an unsupervised method named Offline Imitation Learning based Anomaly Detection (OIL-AD), which detects anomalies in decision-making sequences using two extracted behaviour features: <em>action optimality</em> and <em>sequential association</em>. Our offline learning model is an adaptation of behavioural cloning with a transformer policy network, where we modify the training process to learn a Q function and a state value function from normal trajectories. We propose that the Q function and the state value function can provide sufficient information about agents’ behavioural data, from which we derive two features for anomaly detection. The intuition behind our method is that the <em>action optimality</em> feature derived from the Q function can differentiate the optimal action from others at each local state, and the <em>sequential association</em> feature derived from the state value function has the potential to maintain the temporal correlations between decisions (state–action pairs). Our experiments show that OIL-AD can achieve outstanding online anomaly detection performance with up to 34.8% improvement in <span><math><msub><mrow><mi>F</mi></mrow><mrow><mn>1</mn></mrow></msub></math></span> score over comparable baselines. The source code is available on <span><span>https://github.com/chenwang4/OILAD</span><svg><path></path></svg></span>.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111656"},"PeriodicalIF":7.5,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143867751","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Sample selection for noisy partial label learning with interactive contrastive learning
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-19 | DOI: 10.1016/j.patcog.2025.111681
Xiaotong Yu, Shiding Sun, Yingjie Tian
In the context of weakly supervised learning, partial label learning (PLL) addresses situations where each training instance is associated with a set of candidate labels, only one of which is accurate. However, in complex real-world tasks this restrictive assumption may be invalid, which means the ground truth may lie outside the candidate label set. In this work, we loosen the constraint and address the noisy label problem for PLL. First, we introduce a selection strategy that enables deep models to select clean samples via the loss values of flipped and original images. Besides, we progressively identify the true labels of the selected samples and ensemble two models to acquire knowledge of the unselected samples. To extract better feature representations, we introduce pseudo-labeled interactive contrastive learning to aggregate cross-network information of all samples. Experimental results verify that our approach surpasses baseline methods on the noisy PLL task under different levels of label noise.
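The selection rule suggested by the abstract can be sketched as follows: score each sample by its loss on the original image and on a horizontally flipped copy, and keep the samples whose losses are small on both views. The particular PLL loss used below (negative log of the probability mass on the candidate set) and the percentile threshold are illustrative assumptions, not the paper's exact criterion.

```python
# Sketch of clean-sample selection from losses on original and flipped views.
import torch
import torchvision.transforms.functional as TF


def pll_loss(logits: torch.Tensor, candidate_mask: torch.Tensor) -> torch.Tensor:
    """logits: (B, C); candidate_mask: (B, C) with 1 for candidate labels.
    Loss is -log of the total probability assigned to the candidate set."""
    probs = logits.softmax(dim=-1)
    return -(probs * candidate_mask).sum(dim=-1).clamp_min(1e-8).log()


def select_clean(model, images, candidate_mask, keep_ratio=0.5):
    """Keep samples whose loss is below the keep_ratio quantile on BOTH views."""
    with torch.no_grad():
        loss_orig = pll_loss(model(images), candidate_mask)
        loss_flip = pll_loss(model(TF.hflip(images)), candidate_mask)
    thr_o = torch.quantile(loss_orig, keep_ratio)
    thr_f = torch.quantile(loss_flip, keep_ratio)
    return (loss_orig <= thr_o) & (loss_flip <= thr_f)


if __name__ == "__main__":
    model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
    images = torch.randn(16, 3, 32, 32)
    candidate_mask = (torch.rand(16, 10) < 0.3).float()
    candidate_mask[torch.arange(16), torch.randint(0, 10, (16,))] = 1.0  # ensure non-empty sets
    print(select_clean(model, images, candidate_mask).tolist())
```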
{"title":"Sample selection for noisy partial label learning with interactive contrastive learning","authors":"Xiaotong Yu ,&nbsp;Shiding Sun ,&nbsp;Yingjie Tian","doi":"10.1016/j.patcog.2025.111681","DOIUrl":"10.1016/j.patcog.2025.111681","url":null,"abstract":"<div><div>In the context of weakly supervised learning, partial label learning (PLL) addresses situations where each training instance is associated with a set of partial labels, with only one being accurate. However, in complex realworld tasks, the restrictive assumption may be invalid which means the ground-truth may be outside the candidate label set. In this work, we loose the constraints and address the noisy label problem for PLL. First, we introduce a selection strategy, which enables deep models to select clean samples via the loss values of flipped and original images. Besides, we progressively identify the true labels of the selected samples and ensemble two models to acquire the knowledge of unselected samples. To extract better feature representations, we introduce pseudo-labeled interactive contrastive learning to aggregate cross-network information of all samples. Experimental results verify that our approach surpasses baseline methods on noisy PLL task with different levels of label noise.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111681"},"PeriodicalIF":7.5,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143859332","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Cross-domain person re-identification via learning Heterogeneous Pseudo Labels
IF 7.5 | CAS Tier 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2025-04-19 | DOI: 10.1016/j.patcog.2025.111702
Zhong Zhang, Di He, Shuang Liu
Assigning pseudo labels is vital for cross-domain person re-identification (ReID), and most existing methods assign only one kind of pseudo label to unlabeled target domain samples, which cannot describe these unlabeled samples accurately due to the large intra-class and small inter-class variances caused by diverse environmental factors such as occlusion, illumination, viewpoint, and pose. In this paper, we propose a novel label learning method named Heterogeneous Pseudo Labels (HPL) for cross-domain person ReID, which can overcome the large intra-class and small inter-class variances between pedestrian images in the target domain. For each unlabeled target domain sample, HPL simultaneously learns three different kinds of pseudo labels, i.e., fine-grained labels, coarse-grained labels, and instance labels. With the three kinds of labels, we can make full use of their respective advantages to describe target domain samples from different perspectives. Meanwhile, we propose the Pseudo Labels Constraint (PLC) to improve the quality of the heterogeneous labels by using their consistency. Furthermore, in order to relieve the influence of noisy labels from the perspective of contrastive learning, we propose the Confidence Contrastive Loss (CCL) to take sample confidence into account during learning. Extensive experiments on four cross-domain tasks demonstrate that the proposed method achieves new state-of-the-art performance; for example, it achieves 87.2% mAP and 95.0% Rank-1 accuracy on MSMT17→Market.
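A possible form of a confidence-weighted contrastive objective in the spirit of the CCL is sketched below: a standard InfoNCE-style term against pseudo-class centroids, scaled per sample by a confidence weight so that unreliable pseudo labels contribute less. The abstract does not give the exact formula, so this weighting, and the centroid memory it assumes, are illustrative assumptions.

```python
# Sketch of a confidence-weighted contrastive loss over pseudo-class centroids.
import torch
import torch.nn.functional as F


def confidence_contrastive_loss(features, cluster_centroids, pseudo_labels, confidence, tau=0.05):
    """features: (B, D) embeddings, cluster_centroids: (K, D) memory of pseudo-class
    centroids, pseudo_labels: (B,) centroid index per sample, confidence: (B,) in [0, 1]."""
    f = F.normalize(features, dim=-1)
    c = F.normalize(cluster_centroids, dim=-1)
    logits = f @ c.t() / tau                         # similarity to every centroid
    per_sample = F.cross_entropy(logits, pseudo_labels, reduction="none")
    return (confidence * per_sample).sum() / confidence.sum().clamp_min(1e-8)


if __name__ == "__main__":
    B, K, D = 8, 5, 128
    features = torch.randn(B, D, requires_grad=True)
    centroids = torch.randn(K, D)
    pseudo_labels = torch.randint(0, K, (B,))
    confidence = torch.rand(B)                       # e.g. derived from pseudo-label consistency
    loss = confidence_contrastive_loss(features, centroids, pseudo_labels, confidence)
    loss.backward()
    print(loss.item())
```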
{"title":"Cross-domain person re-identification via learning Heterogeneous Pseudo Labels","authors":"Zhong Zhang,&nbsp;Di He,&nbsp;Shuang Liu","doi":"10.1016/j.patcog.2025.111702","DOIUrl":"10.1016/j.patcog.2025.111702","url":null,"abstract":"<div><div>Assigning pseudo labels is vital for cross-domain person re-identification (ReID), and most existing methods only assign one kind of pseudo labels to unlabeled target domain samples, which cannot describe these unlabeled samples accurately due to large intra-class and small inter-class variances caused by diverse environmental factors, such as occlusions, illuminations, viewpoints, and poses, etc. In this paper, we propose a novel label learning method named Heterogeneous Pseudo Labels (HPL) for cross-domain person ReID, which could overcome large intra-class and small inter-class variances between pedestrian images in the target domain. For each unlabeled target domain sample, HPL simultaneously learns three different kinds of pseudo labels, i.e., fine-grained labels, coarse-grained labels, and instance labels. With the three kinds of labels, we could make full use of their own advantages to describe target domain samples from different perspectives. Meanwhile, we propose the Pseudo Labels Constraint (PLC) to improve the quality of the heterogeneous labels by using their consistency. Furthermore, in order to relieve the influence of noisy labels from the aspect of contrastive learning, we propose the Confidence Contrastive Loss (CCL) to consider the sample confidence in the learning process. Extensive experiments on four cross-domain tasks demonstrate that the proposed method achieves a new state-of-the-art performance, for example, the proposed method achieves 87.2% mAP and 95.0% Rank-1 accuracy on MSMT17<span><math><mo>→</mo></math></span>Market.</div></div>","PeriodicalId":49713,"journal":{"name":"Pattern Recognition","volume":"166 ","pages":"Article 111702"},"PeriodicalIF":7.5,"publicationDate":"2025-04-19","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"143868774","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":1,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0