Semi-supervised learning approach for construction object detection by integrating super-resolution and mean teacher network

Journal of Infrastructure Intelligence and Resilience, Pub Date: 2024-03-08, DOI: 10.1016/j.iintel.2024.100095

Wen-Jie Zhang, Hua-Ping Wan, Peng-Hua Hu, Hui-Bin Ge, Yaozhi Luo, Michael D. Todd


Citations: 0

Abstract




Deep learning-based object detection is used for safety management at construction sites, but training such detectors requires large-scale, high-quality, well-labeled datasets. Existing construction datasets are relatively small because annotation is labor-intensive and expensive, and the varying quality of construction images also degrades detection performance. To address these dataset limitations, this study proposes a new method for construction object detection that integrates super-resolution and semi-supervised learning. The proposed method improves the quality of construction images and achieves excellent detection performance with limited labeled data. First, the Real-ESRGAN model is introduced to improve image quality and make construction objects clearly visible; by enhancing the texture details of low-resolution images, super-resolution improves the performance of object detection models. Second, a mean-teacher network is adopted to expand the training set, thus avoiding labor-intensive annotation work. To verify its effectiveness, the method is applied to the state-of-the-art YOLOv5 object detection model, with construction images from the Site Object Detection Dataset (SODA) used as the training set at different labeled-data proportions (10% to 50% in 10% steps, plus an extreme case of 5%). Comparison with the existing supervised learning method shows that the proposed method achieves better detection performance. In particular, the gain in detection performance is larger when the proportion of labeled data is smaller, which is of great practical value in real-world engineering. The experimental results show the potential of the proposed method for improving image quality and reducing the expense of developing construction datasets.
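The abstract gives no implementation details, so the following is only a minimal sketch of two ideas it mentions: splitting a dataset into labeled and unlabeled subsets at a chosen proportion, and the weight update commonly used in mean-teacher training, where the teacher's weights are an exponential moving average (EMA) of the student's. The function names `split_labeled` and `ema_update`, the `decay` value, the random seed, and the use of plain dictionaries of floats in place of real network weights are all assumptions made for illustration; they are not taken from the paper.

```python
import random


def split_labeled(image_ids, labeled_fraction, seed=0):
    """Shuffle image ids and split them into (labeled, unlabeled) lists.

    Hypothetical helper: the paper's actual sampling procedure is not
    described in the abstract.
    """
    if not 0.0 < labeled_fraction <= 1.0:
        raise ValueError("labeled_fraction must be in (0, 1]")
    ids = list(image_ids)
    random.Random(seed).shuffle(ids)  # seeded for reproducibility
    n_labeled = max(1, round(len(ids) * labeled_fraction))
    return ids[:n_labeled], ids[n_labeled:]


def ema_update(teacher, student, decay=0.99):
    """Mean-teacher style update, applied in place to `teacher`.

    teacher <- decay * teacher + (1 - decay) * student

    Weights are represented as {name: float} dicts for illustration only;
    the decay value is an assumed default, not the paper's setting.
    """
    for name, student_value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * student_value
    return teacher


if __name__ == "__main__":
    # Labeled proportions listed in the abstract: 5%, then 10% to 50%.
    for fraction in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
        labeled, unlabeled = split_labeled(range(1000), fraction)
        print(fraction, len(labeled), len(unlabeled))

    # The teacher drifts slowly toward the student's weights.
    teacher = {"w": 1.0}
    student = {"w": 0.0}
    for _ in range(3):
        ema_update(teacher, student, decay=0.9)
    print(teacher)
```

In a real training loop the student would be updated by gradient descent on labeled and unlabeled batches, and `ema_update` would be called after each student step so that the teacher provides more stable targets for the unlabeled images.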


Source journal

Journal of Infrastructure Intelligence and Resilience

CiteScore

2.10

Self-citation rate

0.00%

Articles published