
Latest Articles in Pattern Recognition Letters

Enhancing zero-shot object detection with external knowledge-guided robust contrast learning
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-05 · DOI: 10.1016/j.patrec.2024.08.003
Lijuan Duan, Guangyuan Liu, Qing En, Zhaoying Liu, Zhi Gong, Bian Ma

Zero-shot object detection aims to identify objects from unseen categories not present during training. Existing methods rely on category labels to create pseudo-features for unseen categories, but they face limitations in exploring semantic information and lack robustness. To address these issues, we introduce a novel framework, EKZSD, enhancing zero-shot object detection by incorporating external knowledge and contrastive paradigms. This framework enriches semantic diversity, enhancing discriminative ability and robustness. Specifically, we introduce a novel external knowledge extraction module that leverages attribute and relationship prompts to enrich semantic information. Moreover, a novel external knowledge contrastive learning module is proposed to enhance the model's discriminative and robust capabilities by exploring pseudo-visual features. Additionally, we use cycle consistency learning to align generated visual features with original semantic features and adversarial learning to align visual features with semantic features. Collaboratively trained with contrastive learning loss, cycle consistency loss, adversarial learning loss, and classification loss, our framework achieves superior performance on the MSCOCO and Ship-43 datasets, as demonstrated by the experimental results.
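As an illustration of the kind of objective the external knowledge contrastive learning module describes, the following minimal PyTorch sketch contrasts pseudo-visual features against knowledge-enriched class embeddings with an InfoNCE-style loss; the function name, tensor shapes, and loss form are assumptions for illustration, not the EKZSD formulation.

```python
import torch
import torch.nn.functional as F

def knowledge_contrastive_loss(pseudo_visual, class_semantics, labels, temperature=0.1):
    """InfoNCE-style loss: pull each pseudo-visual feature toward the semantic
    embedding of its own class and push it away from all other class embeddings.
    A generic stand-in for the external-knowledge contrastive term, not the
    paper's exact formulation."""
    v = F.normalize(pseudo_visual, dim=-1)      # (N, D) pseudo-visual features
    s = F.normalize(class_semantics, dim=-1)    # (C, D) knowledge-enriched class embeddings
    logits = v @ s.t() / temperature            # (N, C) scaled cosine similarities
    return F.cross_entropy(logits, labels)

# Toy usage: 8 pseudo-features, 5 classes, 64-dimensional joint space.
feats = torch.randn(8, 64)
protos = torch.randn(5, 64)
labels = torch.randint(0, 5, (8,))
print(knowledge_contrastive_loss(feats, protos, labels).item())
```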

Citations: 0
Exploring percolation features with polynomial algorithms for classifying Covid-19 in chest X-ray images
IF 5.1 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-02 · DOI: 10.1016/j.patrec.2024.07.022
Guilherme F. Roberto, Danilo C. Pereira, Alessandro S. Martins, Thaína A.A. Tosta, Carlos Soares, Alessandra Lumini, Guilherme B. Rozendo, Leandro A. Neves, Marcelo Z. Nascimento
Covid-19 is a severe illness caused by the Sars-CoV-2 virus, initially identified in China in late 2019 and swiftly spreading globally. Since the virus primarily impacts the lungs, analyzing chest X-rays stands as a reliable and widely accessible means of diagnosing the infection. In computer vision, deep learning models such as CNNs have been the main approach adopted for detection of Covid-19 in chest X-ray images. However, we believe that handcrafted features can also provide relevant results, as shown previously in similar image classification challenges. In this study, we propose a method for identifying Covid-19 in chest X-ray images by extracting and classifying local and global percolation-based features. This technique was tested on three datasets: one comprising 2,002 segmented samples categorized into two groups (Covid-19 and Healthy); another with 1,125 non-segmented samples categorized into three groups (Covid-19, Healthy, and Pneumonia); and a third one composed of 4,809 non-segmented images representing three classes (Covid-19, Healthy, and Pneumonia). Then, 48 percolation features were extracted and given as input to six distinct classifiers. Subsequently, the AUC and accuracy metrics were assessed. We used the 10-fold cross-validation approach and evaluated lesion sub-types via binary and multiclass classification using the Hermite polynomial classifier, a novel approach in this domain. The Hermite polynomial classifier exhibited the most promising outcomes compared to five other machine learning algorithms, with the best obtained values for accuracy and AUC being 98.72% and 0.9917, respectively. We also evaluated the influence of noise on the features and on the classification accuracy. These results, based on the integration of percolation features with the Hermite polynomial, hold the potential for enhancing lesion detection and supporting clinicians in their diagnostic endeavors.
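To make the percolation-feature idea concrete, the sketch below binarizes an image at a sweep of thresholds, records cluster statistics with scipy, and runs 10-fold cross-validation; the specific descriptors are illustrative rather than the paper's 48 features, and an SVM stands in for the Hermite polynomial classifier, for which no off-the-shelf implementation is assumed here.

```python
import numpy as np
from scipy import ndimage
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def percolation_features(image, thresholds=np.linspace(0.1, 0.9, 16)):
    """Percolation-style descriptors: for each binarisation threshold, record
    the number of clusters, the mean cluster size and the fraction of pixels
    covered by the largest cluster. Illustrative only; the paper's 48
    local/global features are defined differently."""
    img = (image - image.min()) / (image.max() - image.min() + 1e-8)
    feats = []
    for t in thresholds:
        labeled, n = ndimage.label(img <= t)          # dark pixels form clusters
        sizes = np.bincount(labeled.ravel())[1:]
        feats += [float(n),
                  float(sizes.mean()) if n else 0.0,
                  float(sizes.max()) / img.size if n else 0.0]
    return np.asarray(feats)

# Toy usage on random "X-rays" with a 10-fold cross-validated SVM.
X = np.stack([percolation_features(np.random.rand(64, 64)) for _ in range(40)])
y = np.repeat([0, 1], 20)
print(cross_val_score(SVC(), X, y, cv=10).mean())
```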
Citations: 0
Feature decomposition-based gaze estimation with auxiliary head pose regression
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-02 · DOI: 10.1016/j.patrec.2024.07.021
Ke Ni, Jing Chen, Jian Wang, Bo Liu, Ting Lei, Yongtian Wang

Recognition and understanding of facial images or eye images are critical for eye tracking. Recent studies have shown that the simultaneous use of facial and eye images can effectively lower gaze errors. However, these methods typically consider facial and eye images as two unrelated inputs, without taking into account their distinct representational abilities at the feature level. Additionally, head pose learned implicitly from highly coupled facial features would make the trained model less interpretable and prone to the gaze-head overfitting problem. To address these issues, we propose a method that aims to enhance task-relevant features while suppressing other noise by leveraging feature decomposition. We disentangle eye-related features from the facial image via a projection module and further make them distinctive with an attention-based head pose regression task, which enhances the representation of gaze-related features and makes the model less susceptible to task-irrelevant features. After that, the mutually separated eye features and head pose are recombined to achieve more accurate gaze estimation. Experimental results demonstrate that our method achieves state-of-the-art performance, with estimation errors of 3.90° on the MPIIGaze dataset and 5.15° on the EyeDiap dataset.
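A minimal sketch of the decomposition idea, assuming a generic backbone feature vector and illustrative layer sizes (not the paper's architecture): one projection isolates eye-related features, another feeds an auxiliary head-pose regressor, and the two parts are recombined for the final gaze output.

```python
import torch
import torch.nn as nn

class FeatureDecompositionHead(nn.Module):
    """Illustrative decomposition head: a projection splits a backbone face
    feature into an eye/gaze part and a head-pose part; the pose part feeds an
    auxiliary pose regressor, and both parts are recombined for the gaze output."""
    def __init__(self, feat_dim=512, latent_dim=128):
        super().__init__()
        self.eye_proj = nn.Linear(feat_dim, latent_dim)
        self.pose_proj = nn.Linear(feat_dim, latent_dim)
        self.pose_head = nn.Linear(latent_dim, 2)        # head pitch, yaw (auxiliary)
        self.gaze_head = nn.Linear(2 * latent_dim, 2)    # gaze pitch, yaw

    def forward(self, face_feat):
        eye = torch.relu(self.eye_proj(face_feat))
        pose_lat = torch.relu(self.pose_proj(face_feat))
        head_pose = self.pose_head(pose_lat)             # auxiliary regression target
        gaze = self.gaze_head(torch.cat([eye, pose_lat], dim=-1))
        return gaze, head_pose

# Toy usage: joint loss = gaze error + auxiliary head-pose error.
model = FeatureDecompositionHead()
gaze, pose = model(torch.randn(4, 512))
loss = nn.functional.l1_loss(gaze, torch.zeros_like(gaze)) + \
       nn.functional.l1_loss(pose, torch.zeros_like(pose))
loss.backward()
```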

Citations: 0
Adversarial self-training for robustness and generalization
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-02 · DOI: 10.1016/j.patrec.2024.07.020
Zhuorong Li, Minghui Wu, Canghong Jin, Daiwei Yu, Hongchuan Yu

Adversarial training is currently one of the most promising ways to achieve adversarial robustness of deep models. However, even the most sophisticated training methods are far from satisfactory, as improvement in robustness requires either heuristic strategies or more annotated data, which might be problematic in real-world applications. To alleviate these issues, we propose an effective training scheme that avoids the prohibitively high cost of additional labeled data by adapting the self-training scheme to adversarial training. In particular, we first use the confident prediction for a randomly-augmented image as the pseudo-label for self-training. Then we enforce consistency regularization by targeting the adversarially-perturbed version of the same image at the pseudo-label, which implicitly suppresses the distortion of representation in latent space. Despite its simplicity, extensive experiments show that our regularization can bring significant advancement in the adversarial robustness of a wide range of adversarial training methods and helps the model to generalize its robustness to larger perturbations or even against unseen adversaries.
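The training signal described above can be sketched as follows, under assumptions made for illustration (an FGSM-style perturbation, hard pseudo-labels, and no confidence thresholding) rather than as the paper's exact recipe:

```python
import torch
import torch.nn.functional as F

def adversarial_self_training_loss(model, x_aug, x, eps=8 / 255):
    """Sketch: the prediction on a randomly augmented view serves as a
    pseudo-label, and an FGSM-style adversarial view of the same image is pushed
    toward that pseudo-label (consistency regularization)."""
    with torch.no_grad():
        pseudo = model(x_aug).argmax(dim=1)            # pseudo-labels from the augmented view

    x_adv = x.clone().detach().requires_grad_(True)    # craft the adversarial view
    loss = F.cross_entropy(model(x_adv), pseudo)
    grad, = torch.autograd.grad(loss, x_adv)
    x_adv = (x_adv + eps * grad.sign()).clamp(0, 1).detach()

    return F.cross_entropy(model(x_adv), pseudo)       # consistency term

# Toy usage with a tiny classifier on 3x32x32 inputs and a noise "augmentation".
model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, 10))
x = torch.rand(4, 3, 32, 32)
x_aug = (x + 0.05 * torch.randn_like(x)).clamp(0, 1)
adversarial_self_training_loss(model, x_aug, x).backward()
```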

Citations: 0
Editorial: Special session on IbPRIA 2023
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-01 · DOI: 10.1016/j.patrec.2024.06.023
{"title":"Editorial: Special session on IbPRIA 2023","authors":"","doi":"10.1016/j.patrec.2024.06.023","DOIUrl":"10.1016/j.patrec.2024.06.023","url":null,"abstract":"","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"184 ","pages":"Page 238"},"PeriodicalIF":3.9,"publicationDate":"2024-08-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141511007","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Decoding class dynamics in learning with noisy labels
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-08-01 · DOI: 10.1016/j.patrec.2024.04.012

The creation of large-scale datasets annotated by humans inevitably introduces noisy labels, leading to reduced generalization in deep-learning models. Sample selection-based learning with noisy labels is a recent approach that exhibits promising performance improvements. The selection of clean samples amongst the noisy samples is an important criterion in the learning process of these models. In this work, we delve deeper into the clean-noise split decision and highlight the aspect that effective demarcation of samples would lead to better performance. We identify the Global Noise Conundrum in the existing models, where the distribution of samples is treated globally. We propose a per-class-based local distribution of samples and demonstrate the effectiveness of this approach in achieving a better clean-noise split. We validate our proposal on several benchmarks, both real and synthetic, and show substantial improvements over different state-of-the-art algorithms. We further propose a new metric, classiness, to extend our analysis and highlight the effectiveness of the proposed method. Source code and instructions to reproduce this paper are available at https://github.com/aldakata/CCLM/
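A minimal sketch of a per-class clean-noise split, assuming per-sample training losses as the selection signal and a two-component Gaussian mixture fitted within each class (a common selection criterion; the paper's exact procedure may differ):

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def per_class_clean_split(losses, noisy_labels, num_classes):
    """Per-class clean/noise split (illustrative): instead of fitting one global
    mixture over all sample losses, fit a two-component Gaussian mixture within
    each class and treat the low-mean component as clean."""
    clean_prob = np.zeros_like(losses)
    for c in range(num_classes):
        idx = np.where(noisy_labels == c)[0]
        if len(idx) < 2:
            clean_prob[idx] = 1.0
            continue
        gmm = GaussianMixture(n_components=2, reg_covar=1e-4, random_state=0)
        post = gmm.fit(losses[idx, None]).predict_proba(losses[idx, None])
        clean_prob[idx] = post[:, gmm.means_.argmin()]   # probability of the low-loss mode
    return clean_prob

# Toy usage: class 0 has mostly low losses (clean), class 1 is noisier.
losses = np.concatenate([np.random.rand(50) * 0.5, np.random.rand(50) * 2.0])
labels = np.repeat([0, 1], 50)
print(per_class_clean_split(losses, labels, 2)[:5])
```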

Citations: 0
DECA-Net: Dual encoder and cross-attention fusion network for surgical instrument segmentation
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-31 · DOI: 10.1016/j.patrec.2024.07.019
Sixin Liang, Jianzhou Zhang, Ang Bian, Jiaying You

Minimally invasive surgery is now widely used to reduce surgical risks, and automatic and accurate instrument segmentation from endoscope videos is crucial for computer-assisted surgical guidance. However, despite the rapid development of CNN-based surgical instrument segmentation methods, challenges like motion blur and illumination issues can still cause erroneous segmentation. In this work, we propose a novel dual encoder and cross-attention network (DECA-Net) to overcome these limitations with enhanced context representation and irrelevant feature fusion. Our approach introduces a CNN- and Transformer-based dual encoder unit for local feature and global context extraction, which strengthens the model's robustness against various illumination conditions. An attention fusion module is then utilized to combine local features with global context information and to select instrument-related boundary features. To bridge the semantic gap between encoder and decoder, we propose a parallel dual cross-attention (DCA) block that captures the channel and spatial dependencies across the multi-scale encoder to build enhanced skip connections. Experimental results show that the proposed method achieves state-of-the-art performance on the Endovis2017 and Kvasir-instrument datasets.
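The attention-based fusion of the two encoder branches can be sketched roughly as follows; the dimensions, the single-layer design, and the query/key assignment are illustrative assumptions, not DECA-Net's actual module:

```python
import torch
import torch.nn as nn

class CrossAttentionFusion(nn.Module):
    """Minimal sketch of attention-based fusion between a CNN branch (local
    features) and a Transformer branch (global context): the CNN tokens query
    the global tokens and the attended result is added back residually."""
    def __init__(self, dim=256, heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, cnn_feat, global_tokens):
        # cnn_feat: (B, C, H, W) -> (B, H*W, C) token sequence
        b, c, h, w = cnn_feat.shape
        q = cnn_feat.flatten(2).transpose(1, 2)
        fused, _ = self.attn(query=q, key=global_tokens, value=global_tokens)
        fused = self.norm(q + fused)
        return fused.transpose(1, 2).reshape(b, c, h, w)

# Toy usage: a 16x16 CNN feature map fused with 196 ViT-style global tokens.
fusion = CrossAttentionFusion()
out = fusion(torch.randn(2, 256, 16, 16), torch.randn(2, 196, 256))
print(out.shape)   # torch.Size([2, 256, 16, 16])
```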

Citations: 0
Fingerprint membership and identity inference against generative adversarial networks
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-31 · DOI: 10.1016/j.patrec.2024.07.018
Saverio Cavasin, Daniele Mari, Simone Milani, Mauro Conti

Generative models are gaining significant attention as potential catalysts for a novel industrial revolution. Since automated sample generation can be useful for solving the privacy and data scarcity issues that usually affect learned biometric models, such technologies have become widespread in this field.

In this paper, we assess the vulnerabilities of generative machine learning models concerning identity protection by designing and testing an identity inference attack on fingerprint datasets created by means of a generative adversarial network. Experimental results show that the proposed solution proves effective under different configurations and is easily extendable to other biometric measurements.
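As a rough illustration of membership inference against a generative model, the sketch below scores candidates by their distance to the nearest generated sample; this is a generic baseline under assumed access to synthetic samples only, not the attack proposed in the paper.

```python
import numpy as np

def nn_distance_membership_scores(candidates, generated):
    """Distance-to-nearest-generated-sample membership score (a generic
    baseline): candidates whose nearest synthetic sample is unusually close are
    flagged as likely training members."""
    c = candidates.reshape(len(candidates), -1)
    g = generated.reshape(len(generated), -1)
    d2 = ((c[:, None, :] - g[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    return -np.sqrt(d2.min(axis=1))    # higher score = more likely a member

# Toy usage: "members" lie near generated samples, "non-members" do not.
gen = np.random.randn(200, 32, 32)
members = gen[:10] + 0.05 * np.random.randn(10, 32, 32)
non_members = np.random.randn(10, 32, 32)
scores = nn_distance_membership_scores(np.concatenate([members, non_members]), gen)
print(scores[:10].mean() > scores[10:].mean())   # expected: True
```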

Citations: 0
Recent advances in behavioral and hidden biometrics for personal identification
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-30 · DOI: 10.1016/j.patrec.2024.07.016
Giulia Orrù, Ajita Rattani, Imad Rida, Sébastien Marcel
{"title":"Recent advances in behavioral and hidden biometrics for personal identification","authors":"Giulia Orrù ,&nbsp;Ajita Rattani ,&nbsp;Imad Rida ,&nbsp;Sébastien Marcel","doi":"10.1016/j.patrec.2024.07.016","DOIUrl":"10.1016/j.patrec.2024.07.016","url":null,"abstract":"","PeriodicalId":54638,"journal":{"name":"Pattern Recognition Letters","volume":"185 ","pages":"Pages 108-109"},"PeriodicalIF":3.9,"publicationDate":"2024-07-30","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141887162","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":3,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Enhancing low-light images via dehazing principles: Essence and method
IF 3.9 · CAS Tier 3, Computer Science · Q2 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-30 · DOI: 10.1016/j.patrec.2024.07.017
Fei Li, Caiju Wang, Xiaomao Li

Given the visual resemblance between inverted low-light and hazy images, dehazing principles are borrowed to enhance low-light images. However, the essence of such methods remains unclear, and they are susceptible to over-enhancement. In this letter, we present solutions to both issues. Specifically, we point out that the Haze Formation Model (HFM) used for image dehazing exhibits a Bidirectional Mapping Property (BMP), enabling adjustment of image brightness and contrast. Building upon this property, we give a comprehensive and in-depth theoretical explanation of why dehazing an inverted low-light image is a solution to the image brightness enhancement problem. An Adaptive Full Dynamic Range Mapping (AFDRM) method is then proposed to guide HFM in restoring the visibility of low-light images without inversion, while overcoming the issue of over-enhancement. Extensive experiments validate our proof and demonstrate the efficacy of our method.
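The baseline pipeline the letter builds on, dehazing an inverted low-light image under the haze formation model I = J*t + A*(1 - t), can be sketched as follows; the dark-channel transmission estimate and the parameter values are illustrative assumptions, and this is not the proposed AFDRM method:

```python
import numpy as np
from scipy.ndimage import minimum_filter

def enhance_low_light(img, omega=0.8, t0=0.1, patch=15):
    """Inversion + dark-channel dehazing: invert the low-light image, estimate
    transmission t with the dark channel prior under I = J*t + A*(1-t), recover
    J, and invert back. img is float RGB in [0, 1]."""
    inv = 1.0 - img
    dark = minimum_filter(inv.min(axis=2), size=patch)        # dark channel of the inverted image
    flat = dark.ravel()
    idx = flat.argsort()[-max(1, flat.size // 1000):]         # brightest 0.1% of the dark channel
    A = np.maximum(inv.reshape(-1, 3)[idx].mean(axis=0), 1e-3)  # atmospheric light estimate
    t = 1.0 - omega * minimum_filter((inv / A).min(axis=2), size=patch)
    t = np.clip(t, t0, 1.0)[..., None]
    J = (inv - A) / t + A                                     # dehazed inverted image
    return np.clip(1.0 - J, 0.0, 1.0)

# Toy usage on a synthetic dark image: the output should be brighter on average.
low = np.clip(0.15 * np.random.rand(64, 64, 3), 0, 1)
print(enhance_low_light(low).mean() > low.mean())             # expected: True
```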

Citations: 0