Weakly-aligned cross-modal learning framework for subsurface defect segmentation on building façades using UAVs
Sudao He, Gang Zhao, Jun Chen, Shenghan Zhang, Dhanada Mishra, Matthew Ming-Fai Yuen
Automation in Construction (IF 9.6, JCR Q1, Construction & Building Technology). Published 2025-01-08. DOI: 10.1016/j.autcon.2024.105946 (https://doi.org/10.1016/j.autcon.2024.105946)
Citations: 0
Abstract
Infrared (IR) thermography combined with Unmanned Aerial Vehicles (UAVs) offers an innovative approach to automated inspection of building façades. However, extracting quantitative defect information from a single image poses a significant challenge. To address this, this paper introduces a Weakly-aligned Cross-modal Learning framework for subsurface defect segmentation using UAVs. The framework consists of two main components: the Multimodal Feature Description Network (MFDN) and the Prompt-aided Cross-modal Graph Learning (PCGL) algorithm. First, RGB–IR image pairs are processed by MFDN to extract feature descriptors for multi-modal alignment. The PCGL algorithm then identifies visually critical areas through graph partitioning on a Wasserstein graph. These critical areas are transferred to the aligned IR image, and a Wasserstein Adjacency Graph (WAG) is constructed based on masked superpixel segmentation. Finally, defect contours are pinpointed by detecting abnormal vertices of the WAG. The framework's effectiveness is validated through controlled laboratory experiments and field applications on tiled façades.
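The abstract gives no implementation details, so the following is only a minimal sketch of the general idea behind a Wasserstein adjacency graph with abnormal-vertex detection. It is not the authors' PCGL algorithm. Several pieces are assumptions made for illustration: regular square tiles stand in for masked superpixels, edge weights are 1-D Wasserstein-1 distances between the temperature samples of neighbouring tiles, and a vertex counts as "abnormal" when its median incident edge weight is a robust outlier. The function names and the parameters `block`, `k`, and `min_gap` are hypothetical, and the RGB–IR alignment step is omitted entirely.

```python
from statistics import median


def wasserstein_1d(a, b):
    """Wasserstein-1 distance between two equal-size 1-D samples.

    For equal-size empirical samples this equals the mean absolute
    difference between the sorted values.
    """
    if len(a) != len(b):
        raise ValueError("this sketch assumes equal-size samples")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)


def block_samples(image, block):
    """Split a 2-D list into block x block tiles (assumed stand-in for superpixels)."""
    rows, cols = len(image) // block, len(image[0]) // block
    tiles = {}
    for r in range(rows):
        for c in range(cols):
            tiles[(r, c)] = [
                image[r * block + i][c * block + j]
                for i in range(block)
                for j in range(block)
            ]
    return tiles


def build_graph(tiles):
    """Connect 4-neighbouring tiles; each edge weight is a Wasserstein distance."""
    edges = {}
    for (r, c) in tiles:
        for nb in ((r + 1, c), (r, c + 1)):
            if nb in tiles:
                edges[((r, c), nb)] = wasserstein_1d(tiles[(r, c)], tiles[nb])
    return edges


def abnormal_vertices(tiles, edges, k=3.0, min_gap=0.5):
    """Flag tiles whose median incident edge weight is a robust outlier.

    The threshold (median + max(k * MAD, min_gap)) is an assumed rule,
    chosen only to keep the example short.
    """
    incident = {v: [] for v in tiles}
    for (u, v), w in edges.items():
        incident[u].append(w)
        incident[v].append(w)
    score = {v: median(ws) for v, ws in incident.items() if ws}
    med = median(score.values())
    mad = median(abs(s - med) for s in score.values())
    threshold = med + max(k * mad, min_gap)
    return {v for v, s in score.items() if s > threshold}


# Synthetic 8x8 "thermal image": uniform 20.0 with one warmer 2x2 patch.
image = [[20.0] * 8 for _ in range(8)]
for i in (2, 3):
    for j in (2, 3):
        image[i][j] = 25.0

tiles = block_samples(image, block=2)
edges = build_graph(tiles)
print(abnormal_vertices(tiles, edges))
```

Using the median rather than the mean of the incident edge weights keeps the neighbours of a warm tile from being flagged along with it, since each neighbour has only one high-weight edge. A real pipeline would need an actual superpixel segmentation and a Wasserstein computation that handles regions of unequal size.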
About the journal:
Automation in Construction is an international journal that focuses on publishing original research papers related to the use of Information Technologies in various aspects of the construction industry. The journal covers topics such as design, engineering, construction technologies, and the maintenance and management of constructed facilities.
The scope of Automation in Construction is extensive and covers all stages of the construction life cycle. This includes initial planning and design, construction of the facility, operation and maintenance, as well as the eventual dismantling and recycling of buildings and engineering structures.