
Pattern Recognition: Latest Publications

Learning accurate and enriched features for stereo image super-resolution
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-13 | DOI: 10.1016/j.patcog.2024.111170
Hu Gao, Depeng Dang
Stereo image super-resolution (stereoSR) aims to enhance the quality of super-resolution results by incorporating complementary information from an alternative view. Although current methods have shown significant advancements, they typically operate on representations at full resolution to preserve spatial details, facing challenges in accurately capturing contextual information. Simultaneously, they utilize all feature similarities to cross-fuse information from the two views, potentially disregarding the impact of irrelevant information. To overcome these problems, we propose a mixed-scale selective fusion network (MSSFNet) that preserves precise spatial details, incorporates abundant contextual information, and adaptively selects and fuses the most accurate features from the two views to promote high-quality stereoSR. Specifically, we develop a mixed-scale block (MSB) that obtains contextually enriched feature representations across multiple spatial scales while preserving precise spatial details. Furthermore, to dynamically retain the most essential cross-view information, we design a selective fusion attention module (SFAM) that searches and transfers the most accurate features from the other view. To learn an enriched set of local and non-local features, we introduce a fast Fourier convolution block (FFCB) to explicitly integrate frequency domain knowledge. Extensive experiments show that MSSFNet achieves significant improvements over state-of-the-art approaches in both quantitative and qualitative evaluations. The code and the pre-trained models will be released at https://github.com/Tombs98/MSSFNet.
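
For illustration only (the authors' code is to be released at the GitHub link above), a minimal PyTorch sketch of a fast-Fourier-convolution style block in the spirit of the FFCB described in the abstract; the class name FFCBlock, the layer sizes, and the residual wiring are assumptions rather than the paper's implementation.

```python
import torch
import torch.nn as nn

class FFCBlock(nn.Module):
    """Toy fast-Fourier-convolution block: a local spatial branch plus a
    global branch that applies a pointwise convolution in the frequency domain."""
    def __init__(self, channels: int):
        super().__init__()
        self.spatial = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.freq = nn.Conv2d(channels * 2, channels * 2, kernel_size=1)
        self.act = nn.GELU()

    def forward(self, x):
        local = self.act(self.spatial(x))                  # local spatial details
        f = torch.fft.rfft2(x, norm="ortho")               # global frequency view
        f = torch.cat([f.real, f.imag], dim=1)             # (B, 2C, H, W//2+1)
        f = self.act(self.freq(f))
        real, imag = f.chunk(2, dim=1)
        glob = torch.fft.irfft2(torch.complex(real.contiguous(), imag.contiguous()),
                                s=x.shape[-2:], norm="ortho")
        return x + local + glob                            # residual fusion of both branches

# Example: feats = FFCBlock(64)(torch.randn(1, 64, 32, 96))
```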
Citations: 0
CAST: An innovative framework for Cross-dimensional Attention Structure in Transformers
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-12 | DOI: 10.1016/j.patcog.2024.111153
Dezheng Wang, Xiaoyi Wei, Congyan Chen
Dominant Transformer-based approaches rely solely on attention mechanisms and their variations, primarily emphasizing capturing crucial information within the temporal dimension. For enhanced performance, we introduce a novel architecture for Cross-dimensional Attention Structure in Transformers (CAST), which presents an innovative approach in Transformer-based models, emphasizing attention mechanisms across both temporal and spatial dimensions. The core component of CAST, the cross-dimensional attention structure (CAS), captures dependencies among multivariable time series in both temporal and spatial dimensions. The Static Attention Mechanism (SAM) is incorporated to simplify and enhance multivariate time series forecasting performance. This integration effectively reduces complexity, leading to a more efficient model. CAST demonstrates robust and efficient capabilities in predicting multivariate time series, with the simplicity of SAM broadening its applicability to various tasks. Beyond time series forecasting, CAST also shows promise in CV classification tasks. By integrating CAS into pre-trained image models, CAST facilitates spatiotemporal reasoning. Experimental results highlight the superior performance of CAST in time series forecasting and its competitive edge in CV classification tasks.
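
To illustrate the idea of attending over both dimensions, a minimal PyTorch sketch that runs multi-head attention along the temporal axis and along the variable (spatial) axis of a multivariate series; the class name, tensor layout, and fusion by summation are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CrossDimensionalAttention(nn.Module):
    """Toy cross-dimensional attention: one attention pass over time for each
    variable, and one over the variables at each time step."""
    def __init__(self, d_model: int, n_heads: int = 4):
        super().__init__()
        self.time_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.var_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm = nn.LayerNorm(d_model)

    def forward(self, x):                                   # x: (batch, n_vars, seq_len, d_model)
        b, v, t, d = x.shape
        xt = x.reshape(b * v, t, d)                         # each variable attends over its time steps
        xt, _ = self.time_attn(xt, xt, xt)
        xv = x.permute(0, 2, 1, 3).reshape(b * t, v, d)     # each time step attends over the variables
        xv, _ = self.var_attn(xv, xv, xv)
        xv = xv.reshape(b, t, v, d).permute(0, 2, 1, 3)
        return self.norm(x + xt.reshape(b, v, t, d) + xv)   # fuse both dimensions

# Example: out = CrossDimensionalAttention(32)(torch.randn(8, 7, 96, 32))
```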
Citations: 0
Semi-supervised multi-view feature selection with adaptive similarity fusion and learning
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-12 | DOI: 10.1016/j.patcog.2024.111159
Bingbing Jiang, Jun Liu, Zidong Wang, Chenglong Zhang, Jie Yang, Yadi Wang, Weiguo Sheng, Weiping Ding
Existing multi-view semi-supervised feature selection methods typically need to compute the inverse of high-order dense matrices, rendering them impractical for large-scale applications. Meanwhile, traditional works construct similarity graphs on different views and directly fuse these graphs at the view level, ignoring the differences among samples in various views and the interplay between graph learning and feature selection. Consequently, both the reliability of graphs and the discrimination of selected features are compromised. To address these issues, we propose a novel multi-view semi-supervised feature selection with Adaptive Similarity Fusion and Learning (ASFL) for large-scale tasks. Specifically, ASFL constructs bipartite graphs for each view and then leverages the relationships between samples and anchors to align anchors and graphs across different views, preserving the complementarity and consistency among views. Moreover, an effective view-to-sample fusion manner is designed to coalesce the aligned graphs while simultaneously exploiting the neighbor structures in projection subspaces to construct a joint graph compatible across views, reducing the adverse effects of noisy features and outliers. By incorporating bipartite graph fusion and learning, label propagation, and ℓ2,0-norm multi-view feature selection into a unified framework, ASFL not only avoids the expensive computation in the solution procedures but also enhances the quality of selected features. An effective optimization strategy with fast convergence is developed to solve the objective function, and experimental results validate its efficiency and effectiveness over state-of-the-art methods.
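
A minimal NumPy sketch of one ingredient mentioned above, the sample-to-anchor bipartite graph; the function name and the k-nearest, Gaussian-weighted, row-normalized construction are assumptions rather than the paper's exact formulation.

```python
import numpy as np

def anchor_bipartite_graph(X, anchors, k=5, sigma=1.0):
    """Toy bipartite affinity: connect each sample to its k nearest anchors
    with row-normalized Gaussian weights."""
    d2 = ((X[:, None, :] - anchors[None, :, :]) ** 2).sum(axis=-1)   # (n_samples, n_anchors)
    Z = np.zeros_like(d2)
    for i, row in enumerate(d2):
        idx = np.argsort(row)[:k]                       # k closest anchors
        w = np.exp(-row[idx] / (2.0 * sigma ** 2))      # Gaussian affinities
        Z[i, idx] = w / w.sum()                         # row-normalize
    return Z

# Example: Z = anchor_bipartite_graph(np.random.rand(1000, 20), np.random.rand(50, 20))
```

Working with an n-by-m sample-to-anchor matrix instead of an n-by-n graph is the usual reason anchor graphs scale to large datasets.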
Citations: 0
DyConfidMatch: Dynamic thresholding and re-sampling for 3D semi-supervised learning
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-12 | DOI: 10.1016/j.patcog.2024.111154
Zhimin Chen, Bing Li
Semi-supervised learning (SSL) leverages limited labeled and abundant unlabeled data but often faces challenges with data imbalance, especially in 3D contexts. This study investigates class-level confidence as an indicator of learning status in 3D SSL, proposing a novel method that utilizes dynamic thresholding to better use unlabeled data, particularly from underrepresented classes. A re-sampling strategy is also introduced to mitigate bias towards well-represented classes, ensuring equitable class representation. Through extensive experiments in 3D SSL, our method surpasses state-of-the-art counterparts in classification and detection tasks, highlighting its effectiveness in tackling data imbalance. This approach presents a significant advancement in SSL for 3D datasets, providing a robust solution for data imbalance issues.
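
To make the dynamic-thresholding idea concrete, a minimal PyTorch sketch of class-level confidence thresholds for pseudo-label selection; the function name, the base threshold, and the scaling rule are assumptions, not the paper's exact formulation.

```python
import torch

def dynamic_pseudo_label_mask(probs, base_tau=0.95, eps=1e-8):
    """Toy dynamic thresholding: classes that are predicted with lower average
    confidence (i.e. learned less well) get a lower threshold, so more of their
    unlabeled samples are admitted as pseudo-labels."""
    conf, pred = probs.max(dim=1)                          # per-sample confidence and class
    num_classes = probs.shape[1]
    status = torch.zeros(num_classes, device=probs.device)
    for c in range(num_classes):
        hit = pred == c
        status[c] = conf[hit].mean() if hit.any() else 0.0  # class-level learning status
    tau = base_tau * status / status.max().clamp(min=eps)   # per-class thresholds
    mask = conf >= tau[pred]                                 # keep confident pseudo-labels
    return pred, mask

# Example: pred, mask = dynamic_pseudo_label_mask(torch.softmax(torch.randn(64, 10), dim=1))
```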
Citations: 0
Embedded feature selection for robust probability learning machines
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-12 | DOI: 10.1016/j.patcog.2024.111157
Miguel Carrasco, Benjamin Ivorra, Julio López, Angel M. Ramos

Methods:

Feature selection is essential for building effective machine learning models in binary classification. Eliminating unnecessary features can reduce the risk of overfitting and improve classification performance. Moreover, the data we handle typically contains a stochastic component, making it important to develop robust models that are insensitive to data perturbations. Although there are numerous methods and tools for feature selection, relatively few studies address embedded feature selection within robust classification models using penalization techniques.

Objective:

In this work, we introduce robust classifiers with integrated feature selection capabilities, utilizing probability machines based on different penalization techniques, such as the ℓ1-norm or the elastic-net, combined with a novel Direct Feature Elimination process to improve model resilience and efficiency.

Findings:

Numerical experiments on standard datasets demonstrate the effectiveness and robustness of the proposed models in classification tasks even when using a reduced number of features. These experiments were evaluated using original performance indicators, highlighting the models’ ability to maintain high performance with fewer features.

Novelty:

The study discusses the trade-offs involved in combining different penalties to select the most relevant features while minimizing empirical risk. In particular, the integration of elastic-net and ℓ1-norm penalties within a unified framework, combined with the original Direct Feature Elimination approach, presents a novel method for improving both model accuracy and robustness.
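
As a rough illustration of embedded selection via an elastic-net penalty (not the paper's probability machines or its Direct Feature Elimination procedure), a scikit-learn sketch in which features whose coefficients are driven to zero by the penalty are discarded; the function name and the tolerance are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def elastic_net_selected_features(X, y, l1_ratio=0.5, C=1.0, tol=1e-6):
    """Toy embedded selection: fit an elastic-net-penalized logistic model and
    keep only the features whose coefficients survive the sparsity-inducing penalty."""
    clf = LogisticRegression(penalty="elasticnet", solver="saga",
                             l1_ratio=l1_ratio, C=C, max_iter=5000)
    clf.fit(X, y)
    keep = np.abs(clf.coef_).ravel() > tol      # non-zero weights = selected features
    return keep, clf

# Example:
# keep, model = elastic_net_selected_features(X_train, y_train)
# X_reduced = X_train[:, keep]
```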
Citations: 0
Topology reorganized graph contrastive learning with mitigating semantic drift
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-10 | DOI: 10.1016/j.patcog.2024.111160
Jiaqiang Zhang, Songcan Chen
Graph contrastive learning (GCL) is an effective paradigm for node representation learning in graphs. The key components hidden behind GCL are data augmentation and positive–negative pair selection. Typical data augmentations in GCL, such as uniform deletion of edges, are generally blind and resort to local perturbation, which is prone to producing views with insufficient diversity. Additionally, there is a risk of pushing the augmented data into other classes. Moreover, most methods always treat all other samples as negatives. Such negative pairing naturally results in sampling bias and likewise may make the learned representation suffer from semantic drift. Therefore, to increase the diversity of the contrastive views, we propose two simple and effective global topological augmentations to complement current GCL. One is to mine the semantic correlation between nodes in the feature space. The other is to utilize the algebraic properties of the adjacency matrix to characterize the topology by eigen-decomposition. With the help of both, we can retain important edges to build a better view. To reduce the risk of semantic drift, a prototype-based negative pair selection is further designed to filter false negative samples. Extensive experiments on various tasks demonstrate the advantages of the model compared to the state-of-the-art methods.
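
A speculative NumPy sketch of how an eigen-decomposition of the adjacency matrix could be used to score and retain important edges for an augmented view; the function name, the low-rank reconstruction, and the quantile rule are assumptions, not the paper's actual augmentation.

```python
import numpy as np

def spectral_edge_importance(A, k=16, keep_ratio=0.9):
    """Toy spectral augmentation: score existing edges with a low-rank reconstruction
    of the adjacency matrix and keep only the most important ones for a new view."""
    A = (A + A.T) / 2.0                                   # symmetrize
    vals, vecs = np.linalg.eigh(A)
    top = np.argsort(np.abs(vals))[::-1][:k]              # k dominant eigen-pairs
    A_low = (vecs[:, top] * vals[top]) @ vecs[:, top].T   # rank-k reconstruction
    scores = np.where(A > 0, A_low, -np.inf)              # only score original edges
    thr = np.quantile(A_low[A > 0], 1.0 - keep_ratio)     # keep the top keep_ratio of edges
    return (scores >= thr).astype(A.dtype)                # mask of retained edges

# Example: view = spectral_edge_importance((np.random.rand(200, 200) > 0.95).astype(float))
```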
Citations: 0
Adaptive representation learning and sample weighting for low-quality 3D face recognition
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-09 | DOI: 10.1016/j.patcog.2024.111161
Cuican Yu, Fengxun Sun, Zihui Zhang, Huibin Li, Liming Chen, Jian Sun, Zongben Xu
3D face recognition (3DFR) algorithms have advanced significantly in the past two decades by leveraging facial geometric information, but they mostly focus on high-quality 3D face scans, limiting their practicality in real-world scenarios. Recently, with the development of affordable consumer-level depth cameras, the focus has shifted towards low-quality 3D face scans. In this paper, we propose a method for low-quality 3DFR. On one hand, our approach employs a normalizing flow to model an adaptive-form distribution for any given 3D face scan. This adaptive distributional representation learning strategy allows for more robust representations of low-quality 3D face scans (whose degradations may be caused by scan noise, pose or occlusion variations, etc.). On the other hand, we introduce an adaptive sample weighting strategy to adjust the importance of each training sample by measuring both the difficulty of being recognized and the data quality. This adaptive sample weighting strategy can further enhance the robustness of the deep model and meanwhile improve its performance on low-quality 3DFR. Through comprehensive experiments, we demonstrate that our method can significantly improve the performance of low-quality 3DFR. For example, our method achieves competitive results on both the IIIT-D database and the Lock3DFace dataset, underscoring its effectiveness in addressing the challenges associated with low-quality 3D faces.
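
A minimal PyTorch sketch of one plausible reading of the adaptive sample weighting described above, combining recognition difficulty with a scan-quality score; the function name, the focal-style exponent, and the normalization are assumptions rather than the paper's formulation.

```python
import torch

def adaptive_sample_weights(logits, labels, quality, gamma=2.0, eps=1e-8):
    """Toy adaptive weighting: a sample's weight grows with how hard it is to
    recognize and shrinks with how noisy (low-quality) its scan is."""
    probs = torch.softmax(logits, dim=1)
    p_true = probs.gather(1, labels.unsqueeze(1)).squeeze(1)   # prob. of the true class
    difficulty = (1.0 - p_true) ** gamma                       # harder samples weigh more
    weights = difficulty * quality                             # quality score in [0, 1]
    return weights / weights.mean().clamp(min=eps)             # normalize to mean 1

# Example:
# w = adaptive_sample_weights(logits, labels, quality_scores)
# loss = (w * torch.nn.functional.cross_entropy(logits, labels, reduction="none")).mean()
```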
Citations: 0
Corruption-based anomaly detection and interpretation in tabular data
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-09 | DOI: 10.1016/j.patcog.2024.111149
Chunghyup Mok, Seoung Bum Kim
Recent advances in self-supervised learning (SSL) have proven crucial in effectively learning representations of unstructured data, encompassing text, images, and audio. Although the applications of these advances in anomaly detection have been explored extensively, applying SSL to tabular data presents challenges because of the absence of prior information on data structure. In response, we propose a framework for anomaly detection in tabular datasets using variable corruption. Through selective variable corruption and assignment of new labels based on the degree of corruption, our framework can effectively distinguish between normal and abnormal data. Furthermore, analyzing the impact of corruption on anomaly scores aids in the identification of important variables. Experimental results obtained from various tabular datasets validate the precision and applicability of the proposed method. The source code can be accessed at https://github.com/mokch/CAIT.
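
A minimal NumPy sketch of the variable-corruption idea, replacing a fraction of each column's entries with values resampled from that column's own empirical marginal; the function name and the corruption scheme details are assumptions, not the released CAIT code.

```python
import numpy as np

def corrupt_columns(X, rate, rng):
    """Toy corruption: swap a random fraction of entries in each column for
    values drawn from that column's empirical marginal distribution."""
    Xc = X.copy()
    mask = rng.random(X.shape) < rate                 # which cells get corrupted
    for j in range(X.shape[1]):
        rows = np.where(mask[:, j])[0]
        donors = rng.integers(0, X.shape[0], size=rows.size)
        Xc[rows, j] = X[donors, j]                    # replacements keep the marginal realistic
    return Xc, mask

# Example: Xc, mask = corrupt_columns(X, rate=0.3, rng=np.random.default_rng(0))
```

A model trained to predict the degree of corruption from such views can then score unseen rows, and per-variable corruption impact can serve as an interpretation signal.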
Citations: 0
Online indoor visual odometry with semantic assistance under implicit epipolar constraints
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-08 | DOI: 10.1016/j.patcog.2024.111150
Yang Chen, Lin Zhang, Shengjie Zhao, Yicong Zhou
Among solutions to the tasks of indoor localization and reconstruction, learning-based VO (Visual Odometry) has gained increasing popularity over traditional SLAM (Simultaneous Localization And Mapping) due to its robustness and low cost. However, the performance of existing indoor deep VOs is still limited in comparison with their outdoor counterparts, mainly owing to large textureless regions and complex indoor motions that contain far more rotation. In this paper, the above two challenges are carefully tackled with the proposed SEOVO (Semantic Epipolar-constrained Online VO). On the one hand, as far as we know, SEOVO is the first semantic-aided VO under an online adaptive framework, which adaptively reconstructs low-texture planes without any supervision. On the other hand, we introduce the epipolar geometric constraint in an implicit way, improving the accuracy of pose estimation without destroying global scale consistency. The efficiency and efficacy of SEOVO have been corroborated by extensive experiments conducted on both public datasets and our collected video sequences.
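
A speculative PyTorch sketch of one way an epipolar constraint can be expressed as a differentiable residual (a Sampson-style error under an essential matrix built from a predicted relative pose); the function names and the exact form of the residual are assumptions, not SEOVO's actual loss.

```python
import torch

def to_homogeneous(p):
    """Append a 1 to each 2-D point: (N, 2) -> (N, 3)."""
    return torch.cat([p, torch.ones(p.shape[0], 1, dtype=p.dtype, device=p.device)], dim=1)

def sampson_epipolar_residual(pts1, pts2, R, t, K):
    """Toy implicit epipolar term: Sampson residuals of matched points under the
    essential matrix E = [t]_x R derived from a predicted pose."""
    Kinv = torch.inverse(K)
    x1 = (Kinv @ to_homogeneous(pts1).T).T              # normalized coordinates, frame 1
    x2 = (Kinv @ to_homogeneous(pts2).T).T              # normalized coordinates, frame 2
    tx = torch.zeros(3, 3, dtype=R.dtype, device=R.device)   # skew-symmetric [t]_x
    tx[0, 1], tx[0, 2] = -t[2], t[1]
    tx[1, 0], tx[1, 2] = t[2], -t[0]
    tx[2, 0], tx[2, 1] = -t[1], t[0]
    E = tx @ R
    Ex1, Etx2 = x1 @ E.T, x2 @ E                        # E x1 and E^T x2, row-wise
    num = (x2 * Ex1).sum(dim=1) ** 2
    den = Ex1[:, 0]**2 + Ex1[:, 1]**2 + Etx2[:, 0]**2 + Etx2[:, 1]**2
    return num / den.clamp(min=1e-12)                   # one residual per correspondence
```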
Citations: 0
DSCIMABNet: A novel multi-head attention depthwise separable CNN model for skin cancer detection
IF 7.5 | CAS Zone 1, Computer Science | Q1 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE | Pub Date: 2024-11-07 | DOI: 10.1016/j.patcog.2024.111182
Hatice Catal Reis, Veysel Turk
Skin cancer is a common type of cancer worldwide. Early diagnosis of skin cancer can reduce the risk of death by increasing treatment success. However, early diagnosis is challenging for dermatologists and specialists because early-stage symptoms are vague and cannot be noticed by the naked eye. This study examines digital diagnostic techniques supported by artificial intelligence, focusing on early skin cancer detection, and proposes two methods. In the first method, the DSCIMABNet deep learning architecture was developed by combining multi-head attention and depthwise separable convolution techniques. This model provides flexibility in learning the dataset's local features, abstract concepts, and long-term relationships. In the second method, the DSCIMABNet model and modern deep learning models trained on ImageNet are combined through ensemble learning. This approach provides a comprehensive feature extraction process that increases classification performance. The proposed approaches are trained and evaluated on the ISIC 2018 dataset, with image enhancement applied during preprocessing. In the experimental results, DSCIMABNet achieved 84.28% accuracy, while the proposed hybrid method achieved 99.40% accuracy. Moreover, on the Mendeley dataset (CNN for Melanoma Detection Data), DSCIMABNet achieved 92.58% accuracy, while the hybrid method achieved 99.37% accuracy. This study may significantly contribute to developing new and effective methods for the early diagnosis and treatment of skin cancer.
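
A minimal PyTorch sketch of a block that combines a depthwise separable convolution with multi-head self-attention, in the spirit of the architecture named in the title; the class name, layer sizes, and residual wiring are assumptions, not the authors' DSCIMABNet.

```python
import torch
import torch.nn as nn

class DepthwiseSeparableAttentionBlock(nn.Module):
    """Toy block: a depthwise separable convolution for cheap local features,
    followed by multi-head self-attention over the resulting spatial tokens."""
    def __init__(self, channels: int, heads: int = 4):
        super().__init__()
        self.depthwise = nn.Conv2d(channels, channels, 3, padding=1, groups=channels)
        self.pointwise = nn.Conv2d(channels, channels, 1)
        self.attn = nn.MultiheadAttention(channels, heads, batch_first=True)
        self.norm = nn.LayerNorm(channels)

    def forward(self, x):                                  # x: (B, C, H, W)
        y = self.pointwise(self.depthwise(x))              # depthwise separable convolution
        b, c, h, w = y.shape
        tokens = y.flatten(2).transpose(1, 2)              # (B, H*W, C) spatial tokens
        attended, _ = self.attn(tokens, tokens, tokens)    # global context via attention
        tokens = self.norm(tokens + attended)              # residual + layer norm
        return tokens.transpose(1, 2).reshape(b, c, h, w)

# Example: out = DepthwiseSeparableAttentionBlock(64)(torch.randn(2, 64, 28, 28))
```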
Citations: 0