RTCNet: A novel real-time triple branch network for pavement crack semantic segmentation
Bin Liu, Jian Kang, Haiyan Guan, Xiaodong Zhi, Yongtao Yu, Lingfei Ma, Daifeng Peng, Linlin Xu, Dongchuan Wang
International Journal of Applied Earth Observation and Geoinformation (Impact Factor 7.5, JCR Q1, Earth and Planetary Sciences)
Publication date: 2024-12-30
DOI: 10.1016/j.jag.2024.104347 (https://doi.org/10.1016/j.jag.2024.104347)
Citations: 0
Abstract
Real-time semantic segmentation of pavement cracks is crucial for road evaluation and maintenance decision-making, but it remains challenging because existing methods suffer from low operational efficiency and over-segmentation. To address these challenges, we propose RTCNet, a real-time triple-branch crack semantic segmentation network for digital camera images that combines Transformers and CNNs. The three branches are a detail branch for capturing local detail features, a context branch for extracting global contextual information, and a boundary branch for obtaining crack boundary information. First, to further enhance crack features, we design a Detail Enhance Transformer (DET) module for enlarging global receptive fields and a Multiscale Aggregation (MSA) module for multiscale learning in the context branch. Second, a Boundary Refinement (BR) module with embedded Sobel operators is designed in the boundary branch to refine the crack boundaries. Last, a Detail-Context Fusion (DCF) module is designed to efficiently aggregate the intermediate features extracted from the different branches. Comprehensive quantitative and visual comparisons on four datasets show that the proposed RTCNet outperforms the comparative models in both efficiency and effectiveness, achieving the highest F1-score, mIoU, and Frames Per Second (FPS) of 90.56%, 90.25%, and 87.34, respectively, on the DeepCrack537 dataset. We also contribute an extensive dataset of pavement cracks, consisting of 464 manually annotated digital images, which is publicly accessible at https://github.com/NJSkate/BeijingHighway-dataset.
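To make the reported accuracy metrics concrete, the following is a minimal sketch of how an F1-score and an mIoU can be computed for a binary crack mask. It is not taken from the paper: the function names (`confusion_counts`, `f1_score`, `mean_iou`), the choice to average IoU over the two classes "crack" and "background", and the handling of empty masks are all assumptions made for illustration, and the authors' exact evaluation protocol may differ.

```python
from typing import Sequence, Tuple

# A 2-D binary mask: 1 marks a crack pixel, 0 marks background (assumed encoding).
Mask = Sequence[Sequence[int]]


def confusion_counts(pred: Mask, truth: Mask) -> Tuple[int, int, int, int]:
    """Count (tp, fp, fn, tn) for the crack class over all pixels."""
    tp = fp = fn = tn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1
            elif p == 1 and t == 0:
                fp += 1
            elif p == 0 and t == 1:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def f1_score(pred: Mask, truth: Mask) -> float:
    """F1 of the crack class: 2*TP / (2*TP + FP + FN)."""
    tp, fp, fn, _ = confusion_counts(pred, truth)
    denom = 2 * tp + fp + fn
    # Assumed convention: no crack pixels in either mask counts as a perfect match.
    return 2 * tp / denom if denom else 1.0


def mean_iou(pred: Mask, truth: Mask) -> float:
    """Mean IoU over the crack and background classes (assumed two-class average)."""
    tp, fp, fn, tn = confusion_counts(pred, truth)
    ious = []
    # Crack IoU uses TP as the intersection; background IoU uses TN.
    for intersection, union in ((tp, tp + fp + fn), (tn, tn + fp + fn)):
        if union:  # skip a class that is absent from both masks
            ious.append(intersection / union)
    return sum(ious) / len(ious) if ious else 1.0


# Hypothetical 3x4 example: one false positive and one false negative.
truth = [[0, 1, 0, 0],
         [0, 1, 0, 0],
         [0, 1, 1, 0]]
pred = [[0, 1, 0, 0],
        [0, 1, 1, 0],
        [0, 0, 1, 0]]

print(f"F1 = {f1_score(pred, truth):.4f}, mIoU = {mean_iou(pred, truth):.4f}")
```

In this toy example the counts are TP = 3, FP = 1, FN = 1, and TN = 7, so the F1-score is 0.75 and the mIoU is the average of 3/5 (crack) and 7/9 (background). Because crack pixels are typically a small fraction of an image, the background IoU tends to be high, which is one reason both metrics are usually reported together.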
About the journal:
The International Journal of Applied Earth Observation and Geoinformation publishes original papers that utilize earth observation data for natural resource and environmental inventory and management. These data primarily originate from remote sensing platforms, including satellites and aircraft, supplemented by surface and subsurface measurements. Addressing natural resources such as forests, agricultural land, soils, and water, as well as environmental concerns like biodiversity, land degradation, and hazards, the journal explores conceptual and data-driven approaches. It covers geoinformation themes like capturing, databasing, visualization, interpretation, data quality, and spatial uncertainty.