
Machine Vision and Applications: Latest Publications

An adversarial sample detection method based on heterogeneous denoising
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-09 · DOI: 10.1007/s00138-024-01579-3
Lifang Zhu, Chao Liu, Zhiqiang Zhang, Yifan Cheng, Biao Jie, Xintao Ding

Deep learning has been used in many computer-vision-based applications. However, deep neural networks are vulnerable to adversarial examples crafted specifically to fool a system while remaining imperceptible to humans. In this paper, we propose a detection defense method based on heterogeneous denoising of foreground and background (HDFB). Since the image region that dominates the output classification is usually sensitive to adversarial perturbations, HDFB focuses its defense on the foreground region rather than the whole image. First, HDFB uses a class activation map to segment examples into foreground and background regions. Second, the foreground and background are encoded into square patches. Third, the encoded foreground is zoomed in and out and denoised at two scales. Subsequently, the encoded background is denoised once using bilateral filtering. After that, the denoised foreground and background patches are decoded. Finally, the decoded foreground and background are stitched together as a denoised sample for classification. If the classifications of the denoised and input images differ, the input image is detected as an adversarial example. Comparison experiments are conducted on CIFAR-10 and MiniImageNet. The average detection rate (DR) against white-box attacks on the test sets of the two datasets is 86.4%, and the average DR against black-box attacks on MiniImageNet is 88.4%. The experimental results suggest that HDFB performs well on adversarial examples and is robust against both white-box and black-box adversarial attacks. However, HDFB is insecure if its defense parameters are exposed to attackers.
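As a rough illustration of the detection idea, the sketch below splits an image by a CAM-derived foreground mask, denoises the two regions heterogeneously, and flags the input when the classifier disagrees on the denoised result. The `classify` and `cam_mask` callables are hypothetical stand-ins for the paper's classifier and class-activation-map segmentation, and the paper's patch encoding/decoding step is omitted.

```python
import cv2
import numpy as np

def two_scale_denoise(patch):
    """Zoom the foreground in and out, denoising at both scales."""
    h, w = patch.shape[:2]
    up = cv2.resize(patch, (w * 2, h * 2), interpolation=cv2.INTER_LINEAR)
    up = cv2.bilateralFilter(up, d=5, sigmaColor=50, sigmaSpace=50)
    down = cv2.resize(up, (w, h), interpolation=cv2.INTER_AREA)
    return cv2.bilateralFilter(down, d=5, sigmaColor=50, sigmaSpace=50)

def detect_adversarial(img, classify, cam_mask):
    """img: uint8 HxWx3; classify(img) -> label; cam_mask(img) -> HxW bool array."""
    mask = cam_mask(img)[..., None]
    fg = np.where(mask, img, 0).astype(np.uint8)   # foreground region only
    bg = np.where(mask, 0, img).astype(np.uint8)   # background region only
    fg_dn = two_scale_denoise(fg)                  # heterogeneous treatment
    bg_dn = cv2.bilateralFilter(bg, d=5, sigmaColor=50, sigmaSpace=50)  # single pass
    stitched = np.where(mask, fg_dn, bg_dn)        # recombine the denoised regions
    return classify(img) != classify(stitched)     # disagreement => adversarial
```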

Citations: 0
IAFPN: interlayer enhancement and multilayer fusion network for object detection
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-08 · DOI: 10.1007/s00138-024-01577-5
Zhicheng Li, Chao Yang, Longyu Jiang

Feature pyramid network (FPN) improves object detection performance by means of top-down multilevel feature fusion. However, current FPN-based methods do not effectively exploit interlayer features to suppress the aliasing effects that arise in the top-down fusion process. We propose an interlayer attention feature pyramid network that integrates attention gates into the FPN through interlayer enhancement, establishing the correlation between context and model, thereby highlighting the salient region of each layer and suppressing aliasing effects. Moreover, to avoid feature dilution during top-down fusion and the inability of multilayer features to exploit one another, a simplified non-local algorithm is used in the multilayer fusion module to fuse and enhance the multiscale features. A comprehensive analysis on the MS COCO and PASCAL VOC benchmarks demonstrates that our network achieves precise object localization and outperforms current FPN-based object detection algorithms.
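A minimal PyTorch sketch of the interlayer-enhancement idea: an attention gate computed from a lateral feature and the upsampled top-down feature modulates the lateral map before fusion. The gate design and channel sizes here are illustrative assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionGate(nn.Module):
    """Gate a lateral FPN feature with context from the top-down path."""
    def __init__(self, channels):
        super().__init__()
        self.theta = nn.Conv2d(channels, channels // 2, kernel_size=1)  # lateral branch
        self.phi = nn.Conv2d(channels, channels // 2, kernel_size=1)    # top-down branch
        self.psi = nn.Conv2d(channels // 2, 1, kernel_size=1)           # attention logits

    def forward(self, lateral, top_down):
        top_down = F.interpolate(top_down, size=lateral.shape[-2:], mode="nearest")
        attn = torch.sigmoid(self.psi(F.relu(self.theta(lateral) + self.phi(top_down))))
        return lateral * attn + top_down  # highlight salient regions, then fuse

# e.g., p4 = AttentionGate(256)(c4_lateral, p5) inside the top-down pass
```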

Citations: 0
GOA-net: generic occlusion aware networks for visual tracking
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-07 · DOI: 10.1007/s00138-024-01580-w
Mohana Murali Dasari, Rama Krishna Gorthi

Occlusion is a frequent phenomenon that hinders visual object tracking. Since occlusion can come from any object and take any shape, data augmentation techniques do little to help identify or mitigate the resulting tracking loss. Some existing works deal with occlusion, but only in an unsupervised manner. This paper proposes a generic deep learning framework that identifies occlusion in a given frame by formulating it, for the first time, as a supervised classification task. The proposed architecture introduces an “occlusion classification” branch into supervised trackers. This branch aids the effective learning of features and also provides an occlusion status for each frame. A metric is proposed to measure tracker performance under occlusion at the frame level. The efficacy of the framework is demonstrated on two supervised tracking paradigms: one from the widely used Siamese region-proposal family of trackers, and another from the emerging transformer-based trackers. The framework is tested on six diverse datasets (GOT-10k, LaSOT, OTB2015, TrackingNet, UAV123, and VOT2018), achieving significant improvements over the corresponding baselines while performing on par with state-of-the-art trackers. The contributions of this work are generic, as any supervised tracker can easily adopt them.
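A minimal PyTorch sketch of what such an occlusion branch could look like when attached to a tracker's search-region features; the feature dimensions and the joint loss shown are assumptions, not the paper's exact design.

```python
import torch.nn as nn

class OcclusionHead(nn.Module):
    """Binary occlusion classifier on top of tracker features."""
    def __init__(self, in_channels=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(in_channels, 64), nn.ReLU(),
            nn.Linear(64, 1),            # one logit: occluded vs. not occluded
        )

    def forward(self, feat):             # feat: (B, C, H, W) search-region features
        return self.net(feat)

# Joint training, schematically (occ_label in {0., 1.} per frame):
#   loss = tracking_loss + nn.BCEWithLogitsLoss()(occ_head(feat), occ_label)
```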

Citations: 0
Online camera auto-calibration appliable to road surveillance
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-05 · DOI: 10.1007/s00138-024-01576-6
Shusen Guo, Xianwen Yu, Yuejin Sha, Yifan Ju, Mingchen Zhu, Jiafu Wang

Camera calibration is an essential prerequisite for road surveillance applications, as it determines the accuracy of the three-dimensional spatial information obtained from surveillance video. The common calibration practice is to collect correspondences between object points and their projections in the surveillance image, which usually requires operating a calibration target manually. However, complex traffic and the calibration-target requirement limit the applicability of existing methods to road scenes. This paper proposes an online camera auto-calibration method for road surveillance that overcomes this problem. It constructs a large-scale virtual checkerboard from the road information in the surveillance video; because road design is standardized, the structural dimensions of this checkerboard are easily obtained in advance. The position coordinates of the checkerboard corners are used to calibrate the camera parameters in a “coarse-to-fine” two-step procedure that recovers the intrinsic and extrinsic parameters efficiently. Experimental results on real datasets demonstrate that the proposed approach accurately estimates camera parameters without manual involvement or additional information input. It achieves competitive results on road surveillance auto-calibration while imposing lower requirements and computational costs than the automatic state of the art.
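The calibration step itself can be sketched with standard OpenCV routines once the virtual-checkerboard correspondences are available: road-plane corner coordinates (known from standardized road dimensions) and their detected pixel positions feed a coarse initial estimate that is then refined, loosely mirroring the “coarse-to-fine” spirit. Corner detection, the hard part, is assumed here, and a single planar view generally under-determines the intrinsics without extra constraints.

```python
import cv2
import numpy as np

def calibrate_from_road(object_corners, image_corners, image_size):
    """object_corners: (N,3) float32 road-plane coordinates in meters (Z=0);
    image_corners: (N,2) float32 pixel positions; image_size: (width, height)."""
    obj = [np.asarray(object_corners, np.float32)]
    img = [np.asarray(image_corners, np.float32)]
    K0 = cv2.initCameraMatrix2D(obj, img, image_size)          # coarse estimate
    rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
        obj, img, image_size, K0, None,
        flags=cv2.CALIB_USE_INTRINSIC_GUESS)                   # refinement step
    return K, dist, rvecs[0], tvecs[0]  # intrinsics, distortion, extrinsics
```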

Citations: 0
Tree-managed network ensembles for video prediction
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-04 · DOI: 10.1007/s00138-024-01575-7
Everett Fall, Kai-Wei Chang, Liang-Gee Chen

This paper presents an approach that leverages a tree structure to effectively manage a large ensemble of neural networks for complex video prediction tasks. Our method introduces a novel technique for partitioning the function domain into simpler subsets, enabling piecewise learning by the ensemble. Seamlessly accessed through an accompanying tree structure with O(log(N)) time complexity, the ensemble-tree framework progressively expands as training examples become more complex. The tree construction process incorporates a specialized algorithm that uses localized comparison functions learned at each decision node. To evaluate the method, we conducted experiments in two challenging scenarios: action-conditional video prediction in a 3D video game environment and error detection in real-world 3D printing. Our approach consistently outperformed existing methods by a significant margin across experiments. Additionally, we introduce a new evaluation methodology for long-term video prediction tasks that aligns better with qualitative observations. The results highlight the efficacy and superiority of our ensemble-tree approach in addressing complex video prediction challenges.
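A minimal sketch of the O(log(N)) routing: each decision node holds a learned comparison function that sends an input left or right until a leaf network is reached. The `compare` callables and leaf models are placeholders for the learned components described above.

```python
class Node:
    """Binary tree whose leaves hold specialized networks from the ensemble."""
    def __init__(self, compare=None, left=None, right=None, model=None):
        self.compare, self.left, self.right, self.model = compare, left, right, model

    def predict(self, x):
        if self.model is not None:                  # leaf: one ensemble member
            return self.model(x)
        branch = self.left if self.compare(x) else self.right
        return branch.predict(x)                    # O(log(N)) if the tree is balanced

# e.g., root = Node(compare=is_simple_example, left=leaf_a, right=Node(...))
```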

Citations: 0
Poly-cam: high resolution class activation map for convolutional neural networks
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-03 · DOI: 10.1007/s00138-024-01567-7
Alexandre Englebert, Olivier Cornu, Christophe De Vleeschouwer

The demand for explainable AI continues to rise alongside advancements in deep learning technology. Existing explanation methods for convolutional neural networks often struggle to accurately pinpoint the image features justifying a network’s prediction, owing to low-resolution saliency maps (e.g., CAM), the smooth visualizations of perturbation-based techniques, or the numerous isolated peaky spots of gradient-based approaches. In response, our work merges information from earlier and later layers within the network to create high-resolution class activation maps that remain competitive with prior art on insertion-deletion faithfulness metrics while significantly surpassing it in the precision with which class-specific features are localized.
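A simplified sketch of the layer-merging idea: class activation maps from shallow (high-resolution) and deep (semantic) layers are upsampled to a common size, normalized, and combined multiplicatively so that early layers sharpen what late layers localize. This illustrates the principle only, not the exact Poly-CAM refinement scheme.

```python
import torch
import torch.nn.functional as F

def fuse_cams(cams, out_size):
    """cams: list of 2D saliency tensors, shallow to deep; out_size: (H, W)."""
    fused = torch.ones(out_size)
    for cam in cams:
        cam = F.interpolate(cam[None, None], size=out_size,
                            mode="bilinear", align_corners=False)[0, 0]
        cam = (cam - cam.min()) / (cam.max() - cam.min() + 1e-8)  # scale to [0, 1]
        fused = fused * cam   # early layers sharpen what late layers localize
    return fused
```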

Citations: 0
Correction: Adversarial defence by learning differentiated feature representation in deep ensemble
IF 2.4 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-01 · DOI: 10.1007/s00138-024-01583-7
Xi Chen, Wei Huang, Wei Guo, Fan Zhang, Jiayu Du, Zhizhong Zhou
{"title":"Correction: Adversarial defence by learning differentiated feature representation in deep ensemble","authors":"Xi Chen, Wei Huang, Wei Guo, Fan Zhang, Jiayu Du, Zhizhong Zhou","doi":"10.1007/s00138-024-01583-7","DOIUrl":"https://doi.org/10.1007/s00138-024-01583-7","url":null,"abstract":"","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":null,"pages":null},"PeriodicalIF":2.4,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141851964","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0
Adversarial defence by learning differentiated feature representation in deep ensemble
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-01 · DOI: 10.1007/s00138-024-01571-x
Xi Chen, Huang Wei, Wei Guo, Fan Zhang, Jiayu Du, Zhizhong Zhou

Deep learning models have been shown to be vulnerable to critical attacks under adversarial conditions. Attackers can generate powerful adversarial examples by searching for adversarial perturbations, without interfering with model training or directly modifying the model. This phenomenon indicates an endogenous problem in existing deep learning frameworks: optimizing individual models for defense offers limited protection and can always be defeated by new attack methods. Ensemble defense, which combines diverse models, has been shown to be effective against adversarial attacks; however, the differentiation among existing models remains insufficient. In cyberspace security, active defense has successfully guarded against unknown vulnerabilities by integrating subsystems with multiple different implementations that achieve a unified mission objective. Inspired by this, we explore the feasibility of achieving model differentiation by changing the data features used to train individual models, since these features are the core factor of functional implementation. We use several feature extraction methods to preprocess the data and train differentiated models on these features. By generating adversarial perturbations to attack the different models, we demonstrate that the feature representation of the data is highly resistant to adversarial perturbations, and the entire ensemble operates normally in an error-bearing environment.
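A minimal sketch of the diversity-by-feature idea: each ensemble member is trained on a different representation of the same input, and predictions are combined by majority vote. The three extractors below are illustrative choices, not necessarily those used in the paper.

```python
from collections import Counter
import cv2

def edge_features(img):
    return cv2.Canny(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), 100, 200)

def gray_features(img):
    return cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

def texture_features(img):
    return cv2.Laplacian(cv2.cvtColor(img, cv2.COLOR_BGR2GRAY), cv2.CV_32F)

def ensemble_predict(img, models):
    """models: list of (feature_fn, classifier) pairs, one per representation."""
    votes = [clf(fn(img)) for fn, clf in models]
    return Counter(votes).most_common(1)[0][0]  # majority vote masks single-model errors
```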

Citations: 0
Towards scanning electron microscopy image denoising: a state-of-the-art overview, benchmark, taxonomies, and future direction
IF 3.3 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-01 · DOI: 10.1007/s00138-024-01573-9
Sheikh Shah Mohammad Motiur Rahman, Michel Salomon, Sounkalo Dembélé

The scanning electron microscope (SEM) enables imaging of micro- and nano-scale objects and is an analytical tool widely used in the material, earth, and life sciences. However, SEM images often suffer from high noise levels, influenced by factors such as dwell time, the time the electron beam spends on each pixel during acquisition. Longer dwell times reduce noise but risk damaging the sample, while shorter ones introduce uncertainty. To this end, the latest state-of-the-art denoising techniques must be explored. Experimentation is crucial to identify the most effective methods, balancing noise reduction against sample preservation and ensuring high-quality SEM images with enhanced clarity and accuracy. We conducted a thorough analysis tracing the evolution of image denoising techniques, from classical methods to deep learning approaches, and established a comprehensive taxonomy of solutions to this inverse problem, detailing the developmental flow of these methods. The latest state-of-the-art techniques were then identified and reviewed based on their reproducibility and the public availability of their source code, and the selected techniques were tested and investigated on scanning electron microscope images. After in-depth analysis and benchmarking, it is clear that existing deep learning-based denoising techniques fall short of balancing noise reduction with the preservation of information crucial to SEM images; issues such as information removal and over-smoothing were identified. To address these constraints, there is a critical need for SEM image denoising techniques that prioritize both noise reduction and information preservation. Additionally, combining several networks improves denoising performance, as in BoostNet (a generative adversarial network combined with a convolutional neural network, CNN) and SCUNet (a vision transformer combined with a CNN). We recommend using blind techniques to denoise real noise while preserving detail and avoiding excessive smoothing, particularly in the SEM context. In the future, explainable AI will facilitate debugging and identifying these problems.
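A benchmark of this kind typically scores each denoiser against clean references with full-reference metrics; the sketch below does so with PSNR and SSIM from scikit-image. The `denoisers` mapping and the paired data are assumptions about how such an evaluation might be wired up, not the paper's actual harness.

```python
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def benchmark(denoisers, pairs):
    """denoisers: {name: callable}; pairs: list of (noisy, clean) images in [0, 1]."""
    scores = {}
    for name, denoise in denoisers.items():
        psnrs, ssims = [], []
        for noisy, clean in pairs:
            out = denoise(noisy)
            psnrs.append(peak_signal_noise_ratio(clean, out, data_range=1.0))
            ssims.append(structural_similarity(clean, out, data_range=1.0))
        scores[name] = (sum(psnrs) / len(psnrs), sum(ssims) / len(ssims))
    return scores  # high scores alone do not rule out over-smoothing; inspect visually
```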

Citations: 0
Rocnet: 3D robust registration of points clouds using deep learning
IF 2.4 · CAS Tier 4 (Computer Science) · Q3 COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE · Pub Date: 2024-07-01 · DOI: 10.1007/s00138-024-01584-6
Karim Slimani, Brahim Tamadazte, Catherine Achard
{"title":"Rocnet: 3D robust registration of points clouds using deep learning","authors":"Karim Slimani, Brahim Tamadazte, Catherine Achard","doi":"10.1007/s00138-024-01584-6","DOIUrl":"https://doi.org/10.1007/s00138-024-01584-6","url":null,"abstract":"","PeriodicalId":51116,"journal":{"name":"Machine Vision and Applications","volume":null,"pages":null},"PeriodicalIF":2.4,"publicationDate":"2024-07-01","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":null,"resultStr":null,"platform":"Semanticscholar","paperid":"141689602","PeriodicalName":null,"FirstCategoryId":null,"ListUrlMain":null,"RegionNum":4,"RegionCategory":"计算机科学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":"","EPubDate":null,"PubModel":null,"JCR":null,"JCRName":null,"Score":null,"Total":0}
Citations: 0