Pub Date: 2026-01-26
DOI: 10.1016/j.autcon.2026.106794
Yuxin Zhang, Xi Wang, Mo Hu, Zhenyu Zhang
Information retrieval and question answering from safety regulations are essential for automated construction compliance checking but are hindered by the linguistic and structural complexity of regulatory text. Many queries are multi-hop, requiring synthesis across interlinked clauses. To address the challenge, this paper introduces BifrostRAG, a dual-graph retrieval-augmented generation (RAG) system that models both linguistic relationships and document structure. The proposed architecture supports a hybrid retrieval mechanism that combines graph traversal with vector-based semantic search, enabling large language models to reason over both the content and the structure of the text. On a multi-hop question dataset, BifrostRAG achieves 92.8% precision, 85.5% recall, and an F1 score of 87.3%. These results significantly outperform vector-only and graph-only RAG baselines, establishing BifrostRAG as a robust knowledge engine for LLM-driven compliance checking. The dual-graph, hybrid retrieval mechanism presented in this paper offers a transferable blueprint for navigating complex technical documents across knowledge-intensive engineering domains.
Title: Bridging dual knowledge graphs for multi-hop question answering in construction safety
Journal: Automation in Construction, vol. 183, Article 106794
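The hybrid retrieval idea described in this abstract, vector search to find seed clauses followed by graph traversal to pull in linked clauses, can be sketched as follows. This is a toy illustration under stated assumptions, not BifrostRAG's implementation: the clause IDs, the three-dimensional embeddings, the `links` cross-reference graph, and the functions `cosine` and `hybrid_retrieve` are all hypothetical.

```python
import math
from collections import deque


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def hybrid_retrieve(query_vec, embeddings, links, top_k=1, max_hops=1):
    """Return clause IDs: the top_k vector hits plus neighbours within max_hops."""
    # Step 1: vector search picks seed clauses by semantic similarity.
    ranked = sorted(
        embeddings,
        key=lambda cid: cosine(query_vec, embeddings[cid]),
        reverse=True,
    )
    seeds = ranked[:top_k]
    # Step 2: breadth-first traversal follows cross-references from the seeds.
    found = list(seeds)
    queue = deque((seed, 0) for seed in seeds)
    while queue:
        cid, hops = queue.popleft()
        if hops == max_hops:
            continue
        for nxt in links.get(cid, []):
            if nxt not in found:
                found.append(nxt)
                queue.append((nxt, hops + 1))
    return found


# Hypothetical toy data: clause_A cross-references clause_B.
embeddings = {
    "clause_A": [0.9, 0.1, 0.0],
    "clause_B": [0.2, 0.8, 0.1],
    "clause_C": [0.0, 0.2, 0.9],
}
links = {"clause_A": ["clause_B"]}
result = hybrid_retrieve([1.0, 0.0, 0.0], embeddings, links)
```

In this toy case the query is closest to `clause_A`, and the traversal adds `clause_B` even though its embedding is not similar to the query, which is the kind of multi-hop link a vector-only retriever would miss.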
Pub Date: 2026-01-24
DOI: 10.1016/j.autcon.2026.106799
Difeng Hu, You Dong, Mingkai Li, Hanmo Wang, Tao Wang
BIM-derived point clouds are valuable for semantic segmentation and BIM modeling, but distribution discrepancies between BIM and real-world scans significantly degrade segmentation performance. To mitigate this issue, this paper develops a margin-aware maximum classifier discrepancy (MMCD) method, which extends the conventional MCD framework by incorporating a margin-aware mechanism. Task-specific classifiers act as discriminators to encourage the feature generator to learn domain-invariant yet discriminative features for unlabeled real point clouds, improving BIM-to-scan distribution alignment and segmentation accuracy. A margin-aware discrepancy loss is formulated to enforce sufficient margin between features and classification boundaries, improving robustness to domain shift. In addition, a training strategy is proposed to support MMCD optimization. Finally, a refined RandLA-Net with an attention-based upsampling module is constructed as the backbone for validation. Experiments demonstrate that the proposed approach achieves superior performance, with an IoU of 72.79% and an overall accuracy of 87.99%, outperforming RandLA-Net variants with or without MCD.
Title: Margin-aware maximum classifier discrepancy for BIM-to-scan semantic segmentation of building point clouds
Journal: Automation in Construction, vol. 183, Article 106799
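For readers unfamiliar with the conventional MCD framework that this paper extends, the classifier discrepancy is commonly computed as the mean absolute difference between the class probabilities of two classifiers. The sketch below shows only that conventional term for a single point; the paper's margin-aware loss is not specified in the abstract and is not reproduced here. The function names `softmax` and `classifier_discrepancy` and the example logits are illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classifier_discrepancy(logits_1, logits_2):
    """Mean absolute difference between two classifiers' class probabilities."""
    p1, p2 = softmax(logits_1), softmax(logits_2)
    return sum(abs(a - b) for a, b in zip(p1, p2)) / len(p1)


# Two classifiers that agree on a point give zero discrepancy;
# classifiers that prefer different classes give a positive value.
agree = classifier_discrepancy([2.0, 0.0, 0.0], [2.0, 0.0, 0.0])
disagree = classifier_discrepancy([2.0, 0.0, 0.0], [0.0, 2.0, 0.0])
```

In MCD-style training the classifiers are updated to increase this quantity on unlabeled target points while the feature generator is updated to decrease it, which pushes target features away from the decision boundaries.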
Pub Date: 2026-01-24
DOI: 10.1016/j.autcon.2026.106795
Yufei Zhang, Gang Li, Runjie Shen
Infrastructure defect detection is vital for public safety and sustainable societal development. In recent years, advances in computer vision have gradually promoted the intelligence and automation of infrastructure defect detection. This paper provides a comprehensive overview of research progress and emerging trends in computer vision-based detection of diverse defect types across multiple infrastructure scenarios, including datasets, evaluation metrics, and methods. A classification framework is introduced that centers on single and multiple visual modalities. The former includes traditional image processing, machine learning, and deep learning techniques, reflecting the evolution of the field. The latter focuses on data-level, feature-level, and decision-level fusion strategies, highlighting opportunities to improve detection performance with multiple visual modalities. Methods are further categorized according to their characteristics and model architectures. Finally, existing challenges are summarized, and promising research directions are outlined based on the strengths and limitations of current methods.
Title: Computer vision for infrastructure defect detection: Methods and trends
Journal: Automation in Construction, vol. 183, Article 106795
Pub Date: 2026-01-23
DOI: 10.1016/j.autcon.2026.106788
Cong Chen, Shenghan Zhang
Unmanned Aerial Vehicles (UAVs) have emerged as essential tools for building façade inspection. However, because façades contain repeating patterns, automatically registering UAV-captured images to Building Information Modeling (BIM) models, though important for building maintenance, remains challenging. Existing methods often rely on GPS data, which lack sufficient accuracy in urban environments. This paper proposes a GPS-free automated framework to register UAV-captured image sequences to BIM models by leveraging information from overlapping images. The framework comprises three key components: (1) extracting semantic key points from images using Grounded SAM 2; (2) implementing a virtual UAV camera model to enable bidirectional projection of key points between BIM coordinates and image coordinates; and (3) developing a particle filter motion model to achieve image-to-BIM registration using image sequences. The proposed method registers various data types to BIM models, including overlapping visual image sequences, infrared (IR)-visual pairs, and façade defects.
Title: GPS-free automated registration of UAV-captured façade image sequences to BIM using semantic key points
Journal: Automation in Construction, vol. 183, Article 106788
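The bidirectional projection between BIM coordinates and image coordinates mentioned in component (2) can be illustrated with a basic pinhole camera model. This is a simplified sketch, not the paper's virtual UAV camera: it assumes a camera looking straight along the +z axis with no rotation and no lens distortion, and the function names, intrinsics (`fx`, `fy`, `cx`, `cy`), and coordinates are hypothetical.

```python
def project(point, cam_pos, fx, fy, cx, cy):
    """Map a 3D point (metres) to pixel coordinates for a camera looking along +z."""
    x, y, z = (p - c for p, c in zip(point, cam_pos))
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def back_project(pixel, depth, cam_pos, fx, fy, cx, cy):
    """Map a pixel back to 3D, given the distance `depth` to the facade plane along +z."""
    u, v = pixel
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return (x + cam_pos[0], y + cam_pos[1], depth + cam_pos[2])


# Hypothetical setup: camera 10 m in front of a facade lying in the plane z = 0.
cam = (2.0, 1.0, -10.0)
intrinsics = dict(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
window_corner = (3.0, 1.5, 0.0)

pixel = project(window_corner, cam, **intrinsics)
recovered = back_project(pixel, 10.0, cam, **intrinsics)
```

Back-projection needs the depth to the façade plane because a single pixel only fixes a viewing ray; in a BIM-based workflow that depth would come from the model geometry and the estimated camera pose.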
Pub Date: 2026-01-22
DOI: 10.1016/j.autcon.2026.106796
Qingwei Zeng, Shunxin Yang, Chang Xu, Jitong Ding, Qiwei Chen, Guoyang Lu
Pavement management data often suffer from severe class imbalance, and existing project-level maintenance and rehabilitation (M&R) decision-making models generally lack post-maintenance evaluation mechanisms. To address these issues, this paper proposes PMDNN, a project-level automated pavement M&R decision-making framework that mitigates data imbalance and incorporates post-maintenance evaluation. First, a Conditional Tabular Generative Adversarial Network (CTGAN) is developed to augment imbalanced M&R datasets. Next, two deep neural networks (DNNs) are constructed, one for pavement performance prediction and one for M&R decision-making. Finally, these two DNNs are nested to enable post-maintenance evaluation, supporting iterative adjustment of suboptimal M&R plans. Results demonstrate that the CTGAN effectively addresses data imbalance and accurately simulates the distribution of the original data. Compared with other data augmentation models, the CTGAN generates data with 4.7%–18.1% higher quality. Additionally, relative to multiple baseline frameworks, the proposed PMDNN framework achieves 1.91%–4.71% higher overall decision accuracy. These findings indicate that PMDNN can support pavement management systems in making decisions more closely aligned with expert judgment.
Title: Project-level automated pavement maintenance and rehabilitation decision-making with data imbalance mitigation and post-maintenance evaluation
Journal: Automation in Construction, vol. 183, Article 106796
Pub Date: 2026-01-21
DOI: 10.1016/j.autcon.2026.106786
Insoo Jeong, Junghoon Kim, Seungmo Lim, Jeongbin Hwang, Seokho Chi
This paper proposes a lightweight domain adaptation framework for construction safety monitoring by fine-tuning a pretrained text-to-image diffusion model using Low-Rank Adaptation (LoRA). To simulate high-risk construction environments underrepresented in training data, the model was adapted to environmental features and specific hazards, focusing on visually dominant scenarios including falls, struck-by, and caught-in incidents. To address data scarcity, Multi-LoRA fine-tuning was conducted using 20 images per hazard type (totaling 60 across three hazards) and 30 background images, enabling both contextual and hazard-specific adaptation. The generated images achieved the highest semantic consistency, yielding the top mean Contrastive Language-Image Pre-training (CLIP) scores with minimal variance, and improved visual realism by reducing the Fréchet Inception Distance (FID) by 86.72 points. Furthermore, a YOLOv8 model trained exclusively on these synthetic images achieved a mean average precision (mAP@0.5:0.95) of 94.1% on real-world frames, comparable to a baseline model trained on real data.
Title: Addressing data scarcity in construction safety monitoring using low-rank adaptation (LoRA)-tuned domain-specific image generation
Journal: Automation in Construction, vol. 183, Article 106786
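As background on why LoRA suits fine-tuning with only a few dozen images, the technique freezes a pretrained weight matrix W and trains a low-rank update, giving W + (alpha / r) · B · A, where B is d × r and A is r × k. The sketch below illustrates that general formula on tiny matrices; it is not the paper's training code, and the function names, matrix values, and the 768 × 768 layer size are illustrative assumptions.

```python
def matmul(a, b):
    """Plain-Python matrix product of a (n x m) and b (m x p)."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def lora_merge(w, b, a, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank (rows of A)."""
    rank = len(a)
    delta = matmul(b, a)
    scale = alpha / rank
    return [
        [w[i][j] + scale * delta[i][j] for j in range(len(w[0]))]
        for i in range(len(w))
    ]


def trainable_params(d, k, rank):
    """Parameters trained by LoRA (B is d x r, A is r x k) versus full fine-tuning."""
    return rank * (d + k), d * k


# Tiny example: a frozen 2 x 2 identity weight with a rank-1 adapter.
w = [[1.0, 0.0], [0.0, 1.0]]
b = [[1.0], [2.0]]
a = [[0.5, 0.0]]
merged = lora_merge(w, b, a, alpha=1.0)

# For a hypothetical 768 x 768 layer with rank 4.
lora_count, full_count = trainable_params(768, 768, 4)
```

For the hypothetical layer above, the adapter trains 6,144 parameters against 589,824 for the full matrix, which is the source of LoRA's small data and memory footprint.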
Pub Date: 2026-01-21
DOI: 10.1016/j.autcon.2026.106779
Mohamed Elrifaee, Tarek Zayed, Ahmed Mansour, Eslam Ali
Construction sites remain among the most hazardous work environments, where the lack of non-intrusive, worker-independent monitoring systems limits proactive safety management. Compared to existing approaches that rely heavily on wearables, RFID tags, or bespoke infrastructure, this paper presents a passive and non-intrusive framework leveraging WiFi probe request tracking for safety monitoring in semi-open areas with static hazards. Using low-cost TP-Link routers, the proposed system localizes workers without requiring active participation or additional equipment. To improve robustness beyond conventional fingerprinting models, a joint Autoencoder–Transformer architecture is employed to capture latent dependencies among access points, significantly reducing localization uncertainty. The resulting position estimates are integrated into a modified Zonal Safety Analysis (mZSA) framework adapted for semi-open construction zones. Unlike deterministic approaches that overlook error variability, the proposed method incorporates distribution-specific error modeling, enabling confidence-aware risk buffers. The framework provides a scalable, uncertainty-aware pathway for real-time risk detection in semi-open construction environments.
Title: Uncertainty-aware risk mapping with passive WiFi and modified Zonal Safety Analysis (mZSA) in BIM for construction
Journal: Automation in Construction, vol. 183, Article 106779
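One plausible reading of a confidence-aware risk buffer is to widen each static hazard zone by a quantile of the observed localization error, so that a worker is flagged whenever the estimated position could lie inside the zone at the chosen confidence level. The abstract does not state how the buffers are computed, so everything in this sketch is an assumption: the empirical-quantile rule, the function names, and the sample error values are hypothetical.

```python
import math


def empirical_quantile(samples, q):
    """Smallest sample with at least a fraction q of the samples at or below it."""
    ordered = sorted(samples)
    index = max(0, math.ceil(q * len(ordered)) - 1)
    return ordered[index]


def in_risk_zone(estimate, hazard_centre, hazard_radius, errors, confidence=0.95):
    """Flag a position estimate that falls inside the hazard zone plus an error buffer."""
    buffer = empirical_quantile(errors, confidence)
    return math.dist(estimate, hazard_centre) <= hazard_radius + buffer


# Hypothetical localization errors (metres) from a validation walk.
errors = [1.0, 2.0, 3.0, 4.0]
near = in_risk_zone((4.0, 0.0), (0.0, 0.0), 2.0, errors, confidence=0.75)
far = in_risk_zone((6.0, 0.0), (0.0, 0.0), 2.0, errors, confidence=0.75)
```

With a 75% error quantile of 3 m, the 2 m hazard radius grows to 5 m, so the estimate 4 m away is flagged and the one 6 m away is not. A distribution-specific model, as the abstract describes, would replace the empirical quantile with one taken from a fitted error distribution.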
Pub Date: 2026-01-21
DOI: 10.1016/j.autcon.2026.106793
Tzu-Hsuan Lin, Sheng-Hong Wu, Yu-Chen Su, Alan Putranto
Distributed fiber optic sensing (DFOS) enables continuous strain and temperature monitoring across civil infrastructure, yet installation remains labor-intensive. This paper presents ROADRobot (Robotic System for Automated Deployment of DFOS), a robotic platform integrating closed-loop tension control, calibrated adhesive dispensing, infrared-guided trajectory tracking, and mechanical bead consolidation for automated DFOS deployment. Laboratory validation on wooden and steel substrates identified optimal parameters of 3–6 cm/s traverse velocity and 0.16–0.32 mm/s dispensing velocity, achieving trajectory deviation within 2 mm. Confined-space deployment in a 450 × 450 mm steel channel demonstrated operation under geometric constraints. Comparative trials showed a 46.8% reduction in deployment time versus single-technician manual installation (p < 0.001, Cohen's d = 34.98) with 41% lower variability. OTDR testing confirmed fiber integrity with 0.042 dB insertion loss over 5.5 m. These results establish technical viability, though significant development remains for field application, including curved paths and non-horizontal surfaces.
Title: Automated robotic deployment of distributed fiber optic sensing for construction monitoring
Journal: Automation in Construction, vol. 183, Article 106793
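To put the reported 0.042 dB insertion loss over 5.5 m in more familiar terms, a loss in decibels converts to a transmitted power fraction as 10^(-loss/10). The short sketch below performs that conversion and also computes an average loss per metre; the per-metre figure assumes the loss is spread evenly along the fibre, which the abstract does not state, and the function names are illustrative.

```python
def transmitted_fraction(loss_db):
    """Fraction of optical power remaining after a loss given in decibels."""
    return 10 ** (-loss_db / 10)


def loss_per_metre(loss_db, length_m):
    """Average loss per metre, assuming the loss is spread evenly along the fibre."""
    return loss_db / length_m


fraction = transmitted_fraction(0.042)  # roughly 0.990
per_metre = loss_per_metre(0.042, 5.5)  # roughly 0.0076 dB/m
```

In other words, about 99% of the launched optical power survives the deployed section.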
Pub Date: 2026-01-20
DOI: 10.1016/j.autcon.2026.106789
Mingchao Li, Zuguang Zhang, Qiubing Ren, Yantao Yu, Jingyue Yuan, Jiamei Ma
Substantial crack imagery is hard to acquire in dam structural inspection due to high costs and risks. Crack image generation, as a crucial yet challenging visual task, still struggles with the quality-diversity trade-off under data scarcity. This paper thus presents CrackFSGAN, a few-shot Generative Adversarial Network (GAN) adaptation method for generating realistic, diverse dam crack images from limited samples. It incorporates the Cross-Scale Channel Interaction (CSCI) module to ensure robust gradient flow across network weights for efficient training, and the Self-Supervised Discriminator (SSDr), a redesigned feature-encoder with an additional decoder, to learn more discriminative, region-extensive feature maps. Extensive experiments on multiple damage datasets against state-of-the-art GANs validate CrackFSGAN's superiority in few-shot image synthesis quality and diversity, and its effectiveness in data augmentation for downstream crack detection tasks. Notably, it supports high-resolution (1024 × 1024 pixels) crack image generation, offering a promising solution to data scarcity and advancing intelligent structural damage detection.
Title: Few-shot GAN adaptation for high-fidelity and diverse crack image generation in dam damage detection
Journal: Automation in Construction, vol. 183, Article 106789
Pub Date: 2026-01-19
DOI: 10.1016/j.autcon.2026.106787
Penglu Chen, Yi Tan, Wen Yi
Amid rapid global urbanization, cities have shifted into a predominantly building maintenance-oriented phase. Because existing studies focus on inspecting simple standalone buildings with a single UAV, this paper proposes an automatic path planning method for the refined inspection of complex, irregular building clusters. First, an adaptive layering mechanism is introduced to generate full-coverage inspection points based on the structural characteristics of the 3D building cluster model. Initial obstacle-free flight paths are then derived by integrating A* and greedy algorithms. The paths are further optimized by applying the 2-opt algorithm to eliminate intersections and reduce flight distance, while the Douglas-Peucker (DP) algorithm is employed to simplify the trajectory by removing redundant waypoints. Experimental validation on six irregularly shaped buildings demonstrates a 9.6% reduction in flight path length and a 47.7% decrease in intermediate waypoints. The proposed framework enables refined inspection path planning for building clusters, improving the automation level and practical applicability of multi-UAV-based building operation and maintenance.
Title: Adaptive planning of multi-UAV refined inspection path for complex and irregular building clusters
Journal: Automation in Construction, vol. 183, Article 106787
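The two post-processing steps named in this abstract, 2-opt to remove path crossings and Douglas-Peucker to drop redundant waypoints, are standard algorithms and can be sketched in a few lines. This is a 2D illustration with fixed start and end points and no obstacle checks, not the paper's 3D multi-UAV planner; the function names, the tolerance, and the sample paths are assumptions.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from 2D point p to the segment from a to b."""
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.dist(p, a)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.dist(p, (ax + t * dx, ay + t * dy))


def douglas_peucker(points, tolerance):
    """Drop waypoints that lie within `tolerance` of the simplified polyline."""
    if len(points) < 3:
        return list(points)
    dists = [point_segment_distance(p, points[0], points[-1]) for p in points[1:-1]]
    worst = max(range(len(dists)), key=dists.__getitem__)
    if dists[worst] <= tolerance:
        return [points[0], points[-1]]
    split = worst + 1  # index of the farthest point in `points`
    left = douglas_peucker(points[: split + 1], tolerance)
    right = douglas_peucker(points[split:], tolerance)
    return left[:-1] + right  # the split point appears once


def path_length(path):
    """Total Euclidean length of an open path."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def two_opt(path):
    """Reverse interior segments of an open path while doing so shortens it."""
    best = list(path)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 2):
            for j in range(i + 1, len(best) - 1):
                candidate = best[:i] + best[i : j + 1][::-1] + best[j + 1 :]
                if path_length(candidate) < path_length(best) - 1e-12:
                    best, improved = candidate, True
    return best


# A path whose first and last legs cross; 2-opt reorders the middle waypoints.
crossing = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
untangled = two_opt(crossing)

# A path with one nearly collinear waypoint; Douglas-Peucker removes it.
noisy = [(0.0, 0.0), (1.0, 0.05), (2.0, 0.0), (3.0, 2.0), (4.0, 0.0)]
simplified = douglas_peucker(noisy, tolerance=0.1)
```

In a real inspection planner, each reversed or simplified segment would also need a collision check against the building model before being accepted, since both steps can move the path closer to obstacles.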