Self-supervised learning-based re-parameterization instance segmentation for hazardous object in terahertz image
Journal: Optics and Lasers in Engineering (Impact Factor 3.5, JCR Q2, Optics)
DOI: 10.1016/j.optlaseng.2024.108620
Publication date: 2024-10-01
URL: https://www.sciencedirect.com/science/article/pii/S0143816624005980
Citations: 0
Abstract
Terahertz imaging has recently shown broad potential for public security screening. Earlier techniques for detecting hazardous objects in terahertz images relied mainly on manual recognition and conventional image processing. Some deep learning solutions depend on large quantities of high-quality data, which makes low-cost, high-performance detection difficult. To reduce this dependency, we propose a diversified dynamic structure You Only Look Once (DDS-YOLO) model, based on masked image modeling and structural reparameterization, for instance segmentation of hazardous objects in terahertz security inspection images. To address the scarcity of terahertz security inspection images and the difficulty of annotating them, we combine automatic data strategies with masked image modeling for self-supervised learning. We propose a multilevel feature refinement fusion mechanism to improve the quality of the learned feature representations. Backbone parameter transfer and fine-tuning are then used to achieve hazardous object instance segmentation on the terahertz dataset. To address the low detection accuracy caused by poor terahertz image quality, we develop a reparameterizable hierarchical structure for the backbone, improve a multi-scale feature-integrating neck, and design a dynamically decoupled head with lower computational requirements, which together enhance the performance of the instance segmentation model. Experimental results demonstrate that the proposed model accurately outputs detection boxes, categories, and segmentation masks for hazardous objects with minimal training samples. Comparative experiments indicate that the proposed model outperforms existing state-of-the-art methods in detection performance. DDS-YOLO achieves 59.3% mask mean Average Precision (mAP) and 61.3% box mAP, and its parameter count and computational requirements are suitable for practical application scenarios.
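The abstract does not describe the internals of the reparameterizable backbone, so the following is only a minimal sketch of the general idea behind structural reparameterization, not the DDS-YOLO block itself. It assumes a single-channel toy layer with a 3x3 branch, a 1x1 branch, and an identity shortcut (a common multi-branch layout), and omits batch normalization. All function names (`conv2d_same`, `multi_branch`, `fuse`) are hypothetical. The point it illustrates is that the multi-branch form used during training can be folded into one 3x3 kernel for inference without changing the output.

```python
import random

def conv2d_same(image, kernel):
    """Single-channel 3x3 cross-correlation with zero padding (output size = input size)."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for di in range(-1, 2):
                for dj in range(-1, 2):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        acc += kernel[di + 1][dj + 1] * image[y][x]
            out[i][j] = acc
    return out

def multi_branch(image, k3, k1):
    """Training-time form (assumed layout): 3x3 branch + 1x1 branch + identity shortcut."""
    y3 = conv2d_same(image, k3)
    h, w = len(image), len(image[0])
    return [[y3[i][j] + k1 * image[i][j] + image[i][j] for j in range(w)]
            for i in range(h)]

def fuse(k3, k1):
    """Inference-time form: fold the 1x1 weight and the identity into the 3x3 kernel centre."""
    fused = [row[:] for row in k3]
    fused[1][1] += k1 + 1.0  # a 1x1 conv and an identity are both centre-only 3x3 kernels
    return fused

random.seed(0)
image = [[random.random() for _ in range(6)] for _ in range(5)]
k3 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
k1 = random.uniform(-1, 1)

y_train = multi_branch(image, k3, k1)          # three branches
y_infer = conv2d_same(image, fuse(k3, k1))     # one fused convolution
max_diff = max(abs(a - b)
               for row_a, row_b in zip(y_train, y_infer)
               for a, b in zip(row_a, row_b))
print("max difference between forms:", max_diff)
```

The two forms agree up to floating-point rounding because convolution is linear in its kernel. This is what lets a reparameterized model keep the richer multi-branch structure during training while paying only for a single convolution at inference time.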
About the journal:
Optics and Lasers in Engineering aims at providing an international forum for the interchange of information on the development of optical techniques and laser technology in engineering. Emphasis is placed on contributions targeted at the practical use of methods and devices, the development and enhancement of solutions and new theoretical concepts for experimental methods.
Optics and Lasers in Engineering reflects the main areas in which optical methods are being used and developed for an engineering environment. Manuscripts should offer clear evidence of novelty and significance. Papers focusing on parameter optimization or computational issues are not suitable. Similarly, papers focussed on an application rather than the optical method fall outside the journal's scope. The scope of the journal is defined to include the following:
- Optical Metrology
- Optical Methods for 3D visualization and virtual engineering
- Optical Techniques for Microsystems
- Imaging, Microscopy and Adaptive Optics
- Computational Imaging
- Laser methods in manufacturing
- Integrated optical and photonic sensors
- Optics and Photonics in Life Science
- Hyperspectral and spectroscopic methods
- Infrared and Terahertz techniques