Self-supervised learning-based re-parameterization instance segmentation for hazardous object in terahertz image

Optics and Lasers in Engineering (IF 3.5; Zone 2, Engineering and Technology; JCR Q2, Optics) Pub Date: 2024-10-01 DOI: 10.1016/j.optlaseng.2024.108620

Article page: https://www.sciencedirect.com/science/article/pii/S0143816624005980

Citations: 0

Abstract

Terahertz imaging has shown broad potential for public security screening. Earlier techniques for detecting hazardous objects in terahertz images relied mainly on manual recognition and conventional image processing. Some solutions that incorporate deep learning depend on large quantities of high-quality data, which makes low-cost, high-performance detection difficult to achieve. To reduce this dependency, we propose a diversified dynamic structure You Only Look Once (DDS-YOLO) model, based on masked image modeling and structural reparameterization, for instance segmentation of hazardous objects in terahertz security inspection images. To address the scarcity of terahertz security inspection samples and the difficulty of annotating them, we combine automatic data strategies with masked image modeling for self-supervised learning. We propose a multilevel feature refinement fusion mechanism to improve the quality of the learned feature representations. Backbone parameter transfer and fine-tuning are then used to achieve hazardous object instance segmentation on the terahertz dataset. To address the low detection accuracy caused by poor terahertz image quality, we develop a reparameterizable hierarchical structure for the backbone, improve the multi-scale feature-integrating neck, and design a dynamically decoupled head with lower computational requirements, all to enhance the performance of the instance segmentation model. Experimental results demonstrate that the proposed model accurately outputs detection boxes, categories, and segmentation masks for hazardous objects with minimal training samples. Comparative experiments indicate that the proposed model outperforms existing state-of-the-art methods in detection performance. The proposed DDS-YOLO model achieves 59.3% mask mean Average Precision (mAP) and 61.3% box mAP, and its parameter count and computational requirements meet the needs of practical application scenarios.
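The abstract does not describe the internals of the reparameterizable backbone, so the following is only a minimal sketch of the general idea behind structural reparameterization, not the DDS-YOLO block itself. It assumes a single-channel 1D signal and a hypothetical training-time block with three parallel branches (a 3-tap convolution, a 1-tap convolution, and an identity shortcut), and shows that these branches can be folded into one 3-tap kernel for inference because convolution is linear. Batch-normalization folding, which practical designs usually include, is omitted. All function names and weights are illustrative assumptions.

```python
# Illustrative 1D, single-channel sketch of branch merging.
# This is NOT the DDS-YOLO block; names and weights are hypothetical.

def conv1d_same(signal, kernel):
    """Cross-correlate `signal` with an odd-length `kernel`, zero-padded to keep the length."""
    half = len(kernel) // 2
    padded = [0.0] * half + list(signal) + [0.0] * half
    return [
        sum(kernel[j] * padded[i + j] for j in range(len(kernel)))
        for i in range(len(signal))
    ]

def multi_branch(signal, k3, k1):
    """Training-time form: 3-tap branch + 1-tap branch + identity shortcut."""
    out3 = conv1d_same(signal, k3)
    out1 = conv1d_same(signal, k1)
    return [a + b + x for a, b, x in zip(out3, out1, signal)]

def merge_branches(k3, k1):
    """Fold the 1-tap kernel and the identity into the centre tap of the 3-tap kernel."""
    merged = list(k3)
    merged[1] += k1[0] + 1.0  # 1-tap weight plus the identity (weight 1)
    return merged

signal = [1.0, 2.0, -1.0, 0.5, 3.0]
k3 = [0.2, -0.5, 0.1]  # hypothetical 3-tap weights
k1 = [0.7]             # hypothetical 1-tap weight

train_out = multi_branch(signal, k3, k1)                 # three branches
infer_out = conv1d_same(signal, merge_branches(k3, k1))  # one merged kernel
max_diff = max(abs(a - b) for a, b in zip(train_out, infer_out))
print(max_diff < 1e-9)
```

The merged kernel reproduces the multi-branch output up to floating-point rounding, which is why such a block can be trained with several branches and then deployed as a single convolution with lower inference cost.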




Source journal

Optics and Lasers in Engineering (Engineering and Technology: Optics)

CiteScore: 8.90

Self-citation rate: 8.70%

Articles published: 384

Review time: 42 days

Journal description: Optics and Lasers in Engineering aims at providing an international forum for the interchange of information on the development of optical techniques and laser technology in engineering. Emphasis is placed on contributions targeted at the practical use of methods and devices, the development and enhancement of solutions, and new theoretical concepts for experimental methods. Optics and Lasers in Engineering reflects the main areas in which optical methods are being used and developed for an engineering environment. Manuscripts should offer clear evidence of novelty and significance. Papers focusing on parameter optimization or computational issues are not suitable. Similarly, papers focused on an application rather than the optical method fall outside the journal's scope. The scope of the journal is defined to include the following:
- Optical Metrology
- Optical Methods for 3D visualization and virtual engineering
- Optical Techniques for Microsystems
- Imaging, Microscopy and Adaptive Optics
- Computational Imaging
- Laser methods in manufacturing
- Integrated optical and photonic sensors
- Optics and Photonics in Life Science
- Hyperspectral and spectroscopic methods
- Infrared and Terahertz techniques