Pub Date: 2026-01-06, DOI: 10.1016/j.autcon.2025.106757
Gilsu Jeong, Joonseok Lee, Moonseo Park, Changbum R. Ahn
Hanging objects, referring to materials or components lifted and transported by tower cranes, require continuous monitoring, as undetected suspended loads can cause severe accidents and disrupt construction workflows. However, conventional vision-based detection models struggle to recognize the hanging state due to visual ambiguity and reliance on appearance without spatial reasoning. To address this challenge, this paper proposes a framework that leverages monocular depth information to infer the hanging state more effectively. The approach incorporates a depth-aware feature module, which captures depth differences and spatial context, and a segmentation-guided depth preprocessing step that refines object boundaries. Integrated into baseline detectors, the proposed method significantly improves detection accuracy and reduces false positives in complex scenes. Experimental results demonstrate the value of depth-aware modeling and establish a foundation for reliable, state-aware detection of hanging objects, enabling automated monitoring and supporting more efficient management of lifting operations and site workflows in construction environments.
Title: Depth-aware detection of hanging objects for state reasoning in construction sites
Journal: Automation in Construction, vol. 182, Article 106757
Pub Date: 2026-01-06, DOI: 10.1016/j.autcon.2025.106749
Kichang Choi, Minwoo Jeong, Younga Shin, Jong won Ma, Kinam Kim, Hongjo Kim
The Retrieval-Augmented Generation (RAG) framework struggles in low-resource languages like Korean, particularly in specialized domains such as construction. This paper proposes RAGO-CONSTRUCT, a retrieval optimization methodology for existing RAG systems that integrates Contrastive Sentence Generation (CSG) and Sentence Block Embedding (SBE) with Matryoshka Representation Learning (MRL) to improve retrieval accuracy in Korean construction documents. CSG enables automated dataset generation using local LLMs, while SBE optimizes document chunking strategies to align with embedding model strengths. An 8986-pair training dataset was generated using local LLMs without manual annotation, enabling fine-tuning with Multiple Negative Ranking Loss and Matryoshka Representation Learning. RAGO-CONSTRUCT demonstrated model-agnostic effectiveness across different embedding architectures, with multilingual-e5-large achieving 53.7% overall accuracy, outperforming OpenAI's text-embedding-3-large by 12.35 percentage points. The methodology showed consistent performance improvements regardless of the base embedding model used. This approach addresses critical challenges in domain-specific RAG applications for low-resource languages.
Title: Retrieval optimization for construction documents in low-resource languages using contrastive sentence generation and matryoshka representation learning
Journal: Automation in Construction, vol. 182, Article 106749
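The fine-tuning objective named in the RAGO-CONSTRUCT abstract above can be illustrated with a minimal sketch. It assumes the usual formulation of a multiple negatives ranking loss (in-batch softmax over cosine similarities, where passage i is the positive for query i) averaged over nested embedding prefixes in the Matryoshka style. The function names, the similarity scale of 20, the prefix sizes, and the toy vectors are all illustrative assumptions, not details taken from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def mnr_loss(queries, passages, scale=20.0):
    """In-batch ranking loss: passage i is the positive for query i,
    and every other passage in the batch acts as a negative."""
    total = 0.0
    for i, q in enumerate(queries):
        logits = [scale * cosine(q, p) for p in passages]
        top = max(logits)  # subtract the max for numerical stability
        log_norm = top + math.log(sum(math.exp(x - top) for x in logits))
        total += log_norm - logits[i]  # cross-entropy with target index i
    return total / len(queries)

def matryoshka_loss(queries, passages, dims):
    """Average the ranking loss over nested prefixes of each embedding,
    so that truncated embeddings remain useful for retrieval."""
    losses = [
        mnr_loss([q[:d] for q in queries], [p[:d] for p in passages])
        for d in dims
    ]
    return sum(losses) / len(losses)

# Toy 4-dimensional "embeddings" (hypothetical values).
queries = [[1.0, 0.1, 0.2, 0.0], [0.1, 1.0, 0.0, 0.2], [-1.0, -0.9, 0.3, 0.1]]
aligned = [q[:] for q in queries]          # each passage matches its query
shifted = aligned[1:] + aligned[:1]        # positives deliberately misaligned

print(matryoshka_loss(queries, aligned, dims=(2, 4)))
print(matryoshka_loss(queries, shifted, dims=(2, 4)))
```

The aligned batch should give a much lower loss than the shifted one, which is the signal a fine-tuning run would minimize.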
Pub Date: 2026-01-06, DOI: 10.1016/j.autcon.2025.106755
Ke You, Ke Chen, Fan Xue
The global Architecture, Engineering, and Construction (AEC) industry has witnessed surging demand for Construction Digital Transformation (CDT) over the past decade. Scan-to-BIM delivers accurate as-is conditions and reconstructs detailed BIM for diverse CDT applications. Researchers have proposed automated scan-to-BIM using algorithms and AI to minimize labor demands, but a comprehensive review with systematic guidelines is lacking. This paper presents a conceptual model of scan-to-BIM processes and reviews development patterns and trends based on 58 cases. Based on the model, this paper offers a four-step guideline for AEC practitioners to adopt automated scan-to-BIM effectively. The contribution of this paper is three-fold. First, the conceptual model offers a comprehensive and simplified overview of scan-to-BIM processes for beginners. Second, the review identifies trends, e.g., a shift from rigid rules to AI methods. Third, the best-practice guidelines empower AEC practitioners to maximize scan-to-BIM advantages tailored to their needs.
Title: Automated scan-to-BIM for construction digital transformation: Conceptual framework, processing methods and best-practice guidelines
Journal: Automation in Construction, vol. 182, Article 106755
Pub Date: 2026-01-05, DOI: 10.1016/j.autcon.2025.106758
Zhao Zhang, Fengyang He, Zhonghao Chen, Lei Yuan, Hong Guan, Zengxi Pan, Huijun Li
As civil engineering advances toward next-generation construction, the integration of robotics, automation, and sustainable manufacturing is becoming increasingly critical. Robotic Wire Arc Additive Manufacturing (WAAM) provides a promising pathway through flexible deposition control and efficient material utilisation in steel structures. This review focuses on WAAM-fabricated steels and synthesises current developments in process, material behaviour, structural applications and future research directions. Relationships between WAAM parameters and deposition strategies are examined to clarify their influence on the performance of WAAM-fabricated steels. Reported material behaviours, including tensile, fatigue, corrosion, and high-temperature behaviour, are systematically assessed. Structural applications relevant to direct fabrication, hybrid construction, and repair-related interventions are evaluated to illustrate practical pathways for WAAM in civil engineering. By linking the WAAM process with both material and structural performance, this review establishes knowledge and guidance for advancing WAAM toward reliable and efficient adoption in both academic research and industrial practice within civil engineering.
Title: Comprehensive review of robotic wire arc additive manufacturing for steel structures: Process, material behaviour, structural applications and pathways to automated construction
Journal: Automation in Construction, vol. 182, Article 106758
Pub Date: 2026-01-05, DOI: 10.1016/j.autcon.2026.106762
Pinsheng Duan, Xuehai Fu, Jinxin Hu, Jianliang Zhou, Ping Guo
Construction sites are dynamic, complex, high-risk environments, where Unmanned Aerial Vehicles (UAVs) are vital for enhancing safety inspection efficiency. As large-scale dynamic obstacles, tower cranes can interfere with effective UAV inspection paths. This paper proposes a safety inspection path planning method under the spatiotemporal interference of multiple tower cranes. First, a 3D model of the construction site is reconstructed, and inspection viewpoints for UAV flights are generated by optimizing safety inspection strategies. Then, a hierarchical path planning framework is established: the lower-level planner strictly enforces real-time obstacle avoidance for safety, while the higher-level planner focuses on global planning to meet inspection requirements. Finally, both simulation and real-project studies are conducted to verify the feasibility of the method. Results from the real project show that the effective coverage area is increased by 39.01% compared with traditional methods. This paper provides theoretical and practical support for UAV-assisted safety inspections in construction.
Title: Path planning for UAV-based construction safety inspection under spatiotemporal interference from tower cranes
Journal: Automation in Construction, vol. 182, Article 106762
Pub Date: 2026-01-03, DOI: 10.1016/j.autcon.2025.106752
Chen Zhang, Dhanada K. Mishra, Matthew M.F. Yuen, Yantao Yu, Jize Zhang
Accurate pixel-level segmentation of concrete spalling has been severely hampered by the prohibitive cost of manual annotation. This paper investigates how accurate pixel-level defect segmentation can be achieved using only low-cost weakly supervised bounding box annotations. A three-stage framework is proposed to generate and refine pseudo-masks from bounding boxes using the Segment Anything Model (SAM), dynamic self-correction, and inference-time fusion. The proposed method outperformed existing techniques by over 10% in F1 score on a large-scale spalling dataset. These findings establish the economic viability of deploying scalable automated inspection systems by drastically reducing data annotation costs, providing a practical and scalable pathway for spalling assessment.
Title: Accurate concrete spalling segmentation from bounding box supervision using Segment Anything
Journal: Automation in Construction, vol. 182, Article 106752
Pub Date: 2026-01-03, DOI: 10.1016/j.autcon.2025.106737
Kang Fu, Yiguo Xue, Daohong Qiu, Jingkai Qu, Huimin Gong
Accurate prediction of TBM tunneling loads is essential for enabling intelligent control. This paper proposes an intelligent prediction framework that integrates modal reconstruction with collaborative modeling. An improved Multivariate Variational Mode Decomposition (IMVMD) combined with Refined Composite Multiscale Diversity Entropy (RCMDE) is employed to extract the trend, seasonal, cyclic, and residual components of tunneling load signals. For each component, specialized predictive models, including Transformer, Bidirectional Gated Recurrent Unit (BiGRU), Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM), and Extreme Gradient Boosting (XGBoost), are developed to construct a collaborative hybrid learning architecture. A CNN-LSTM-based error correction strategy is further introduced, resulting in a corrected hybrid learning (CHL) model that achieved an R² of 0.9972, a MAPE of 0.66%, and an MAE of 11.73, exceeding traditional models by more than 60% on average. The proposed method provides reliable technical support for intelligent perception and automated control in TBM tunneling.
Title: Intelligent prediction of TBM tunneling loads based on modal reconstruction and collaborative modeling
Journal: Automation in Construction, vol. 182, Article 106737
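The three metrics quoted in the TBM load abstract above (R², MAPE, MAE) have standard definitions, which can be sketched as below. The function names and the toy load values are illustrative assumptions and are unrelated to the paper's data.

```python
def mae(y_true, y_pred):
    """Mean absolute error, in the units of the target variable."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.
    Assumes no true value is zero."""
    ratios = [abs((t - p) / t) for t, p in zip(y_true, y_pred)]
    return 100.0 * sum(ratios) / len(ratios)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

# Hypothetical load readings and predictions.
observed = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

print(mae(observed, predicted))
print(mape(observed, predicted))
print(r2(observed, predicted))
```

Because MAE is expressed in the units of the load signal while MAPE and R² are dimensionless, the three numbers are best read together rather than compared with each other.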
Pub Date: 2026-01-02, DOI: 10.1016/j.autcon.2025.106747
Jie Zhou, Chao Ban, Chengjun Liu, Zeyao Li, Huade Zhou, Hsinming Shang
The distribution and evolution of the temperature field are key concerns in freezing restoration projects, but traditional methods are limited by sparse sensor placement and simplified simulation inputs. More effective and accurate methods are needed to determine the temperature field. A particle swarm optimization (PSO)-based digital twin model was developed and validated on a tunnel freezing restoration project in Bangkok, Thailand. By integrating real-time field temperature data, the model enables dynamic optimization of parameters, improving accuracy. Single-parameter optimization converges quickly, which suits early-stage calibration, while multi-parameter optimization improves performance under complex conditions. In these cases, PSO performs better than a genetic algorithm (GA) and differential evolution (DE). When using multiple measurement points, the model may encounter local optima. A hybrid optimization strategy (GA-PSO) provides an effective way to mitigate this issue. This paper demonstrates the model's feasibility and effectiveness, offering a practical approach for dynamic temperature management in complex freezing environments.
Title: Digital twin–driven temperature field optimization in tunnel freezing restoration using particle swarm optimization
Journal: Automation in Construction, vol. 182, Article 106747
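The calibration idea in the tunnel freezing abstract above (adjusting model parameters until simulated temperatures match field measurements) can be sketched with a textbook particle swarm optimizer. Everything specific here is a stated assumption: the exponential `toy_temperature` forward model, the ambient value, the parameter bounds, the swarm settings, and the synthetic "measurements" are hypothetical stand-ins and do not represent the paper's digital twin or simulation.

```python
import math
import random

def pso(objective, bounds, n_particles=20, n_iters=100,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimise `objective` over box `bounds` with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # personal bests
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                # Move, then clamp the position to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

def toy_temperature(params, distance, ambient=25.0):
    """Hypothetical forward model: temperature relaxes from the freezing
    pipe temperature toward ambient with distance from the pipe."""
    pipe_temp, decay = params
    return ambient + (pipe_temp - ambient) * math.exp(-decay * distance)

# Synthetic "sensor" readings generated from known parameters.
distances = [0.2, 0.5, 1.0, 1.5, 2.0]
measured = [toy_temperature([-28.0, 1.2], d) for d in distances]

def misfit(params):
    """Mean squared error between simulated and measured temperatures."""
    errors = [(toy_temperature(params, d) - m) ** 2
              for d, m in zip(distances, measured)]
    return sum(errors) / len(errors)

best_params, best_error = pso(misfit, bounds=[(-40.0, 0.0), (0.1, 3.0)])
print(best_params, best_error)
```

In a real digital twin the `misfit` call would wrap a numerical simulation, so each evaluation is expensive; that cost is one reason the number of calibrated parameters and the swarm size matter in practice.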
Automated sewer defect detection has advanced through deep learning, particularly supervised methods using CCTV images, but these methods depend on large annotated datasets. This paper proposes a semi-supervised learning (SSL) approach to reduce labeling demands. The method comprises self-supervised pre-training on unlabeled images using SwAV (Swapping Assignments between multiple Views) followed by fine-tuning for multi-label classification. Experiments on the Sewer-ML dataset demonstrate that the SSL approach, trained on only 35k labeled images, achieves an F1-score of 69.11% and an F2_CIW of 54.22%, surpassing the fully supervised baseline trained from scratch on 1.04 million images. Increasing the unlabeled pre-training data further enhances performance, while ImageNet initialization consistently outperforms training from scratch. Self-supervised learning also helps mitigate the effects of mislabeled data, which is observed to be present even in the Sewer-ML ground truth. Overall, self-supervised learning provides an accurate, scalable, and cost-effective alternative to fully supervised approaches, particularly in data-scarce or imperfectly labeled scenarios.
Title: Self-supervised learning for multi-label sewer defect classification
Authors: Tugba Yildizli, Tianlong Jia, Jeroen Langeveld, Riccardo Taormina
Pub Date: 2026-01-02, DOI: 10.1016/j.autcon.2025.106751
Journal: Automation in Construction, vol. 182, Article 106751
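The F1 and F2 scores reported in the sewer defect abstract above are both instances of the F-beta score, sketched below for a multi-label setting. The weighted aggregate assumes that the CIW subscript denotes a class-importance-weighted average of per-class F2 scores; the weights and labels used here are hypothetical and are not the Sewer-ML values.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for one binary label column (0/1 values)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def f_beta(precision, recall, beta):
    """F-beta score; beta > 1 weights recall more heavily than precision."""
    if precision == 0.0 and recall == 0.0:
        return 0.0
    b2 = beta * beta
    return (1.0 + b2) * precision * recall / (b2 * precision + recall)

def weighted_f2(labels_per_class, preds_per_class, weights):
    """Importance-weighted mean of per-class F2 scores (assumed form)."""
    scores = [
        f_beta(*precision_recall(t, p), beta=2.0)
        for t, p in zip(labels_per_class, preds_per_class)
    ]
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)

# Two hypothetical defect classes over four images.
labels = [[1, 1, 0, 0], [1, 0, 0, 0]]
preds = [[1, 0, 1, 0], [1, 1, 0, 0]]

print(weighted_f2(labels, preds, weights=[1.0, 3.0]))
```

With beta = 2 a missed defect costs more than a false alarm, which suits inspection settings where overlooking damage is the more expensive error.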
Pub Date: 2025-12-30, DOI: 10.1016/j.autcon.2025.106742
Junhyung Cho, Mingyu Shin, Joongheon Kim, Soyi Jung
Autonomous excavation systems face fundamental challenges in balancing computational tractability with operational sophistication. This paper presents the collaborative learning for excavation framework (CLEF), which resolves this trade-off through strategic decomposition: separating high-level planning from low-level execution while maintaining collaborative optimization. The framework’s key contributions include a bidirectional information flow between specialized modules consisting of reinforcement learning for strategic planning using polar coordinates, and attention-enhanced generative adversarial imitation learning (A-GAIL) with multi-head attention capturing phase-specific temporal dependencies. Unlike monolithic approaches that suffer from computational intractability, CLEF enables module specialization while coordinating through shared representations. Planning decisions condition trajectory generation while execution outcomes update environmental models, creating adaptive behavior without manual tuning. Validation demonstrates a 90.8% success rate compared to 71.1% for monolithic approaches, with trajectory generation achieving 91.3% completion, confirming superior performance essential for construction automation.
Title: Collaborative learning architecture for autonomous excavator planning and execution
Journal: Automation in Construction, vol. 182, Article 106742