Pub Date: 2026-01-03 | DOI: 10.1016/j.autcon.2025.106752
Chen Zhang, Dhanada K. Mishra, Matthew M.F. Yuen, Yantao Yu, Jize Zhang
Accurate pixel-level segmentation of concrete spalling has been severely hampered by the prohibitive cost of manual annotation. This paper investigates how accurate pixel-level defect segmentation can be achieved using only low-cost weakly supervised bounding box annotations. A three-stage framework is proposed to generate and refine pseudo-masks from bounding boxes using the Segment Anything Model (SAM), dynamic self-correction, and inference-time fusion. The proposed method outperformed existing techniques by over 10% in F1 score on a large-scale spalling dataset. These findings establish the economic viability of deploying scalable automated inspection systems by drastically reducing data annotation costs, providing a practical and scalable pathway for spalling assessment.
Title: Accurate concrete spalling segmentation from bounding box supervision using Segment Anything. Automation in Construction, vol. 182, Article 106752.
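As a rough illustration of why a bounding box alone is a weak pixel-level label, the sketch below fills a box into a mask and scores it against a toy ground truth with pixel-level F1. The helper names `box_to_mask` and `pixel_f1` and the 4x4 example are hypothetical; this is not the paper's pipeline and does not call SAM.

```python
def box_to_mask(height, width, box):
    """Fill a bounding box (x0, y0, x1, y1), end-exclusive, into a binary mask."""
    x0, y0, x1, y1 = box
    return [[1 if (x0 <= x < x1 and y0 <= y < y1) else 0 for x in range(width)]
            for y in range(height)]


def pixel_f1(pred, truth):
    """Pixel-level F1 between two binary masks given as nested lists."""
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                tp += 1
            elif p and not t:
                fp += 1
            elif t and not p:
                fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy example (hypothetical data): an L-shaped defect inside a 3x3 box.
truth = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
naive = box_to_mask(4, 4, (0, 0, 3, 3))  # treat the whole box as defect
naive_f1 = pixel_f1(naive, truth)
```

The filled box covers every defect pixel but also four background pixels, which is the gap that pseudo-mask generation and refinement are meant to close.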
Pub Date: 2026-01-03 | DOI: 10.1016/j.autcon.2025.106737
Kang Fu, Yiguo Xue, Daohong Qiu, Jingkai Qu, Huimin Gong
Accurate prediction of TBM tunneling loads is essential for enabling intelligent control. This paper proposes an intelligent prediction framework that integrates modal reconstruction with collaborative modeling. An improved Multivariate Variational Mode Decomposition (IMVMD) combined with Refined Composite Multiscale Diversity Entropy (RCMDE) is employed to extract the trend, seasonal, cyclic, and residual components of tunneling load signals. For each component, specialized predictive models, including Transformer, Bidirectional Gated Recurrent Unit (BiGRU), Convolutional Neural Network–Long Short-Term Memory (CNN-LSTM), and Extreme Gradient Boosting (XGBoost), are developed to construct a collaborative hybrid learning architecture. A CNN-LSTM-based error correction strategy is further introduced, resulting in a corrected hybrid learning (CHL) model that achieved an R² of 0.9972, a MAPE of 0.66 %, and an MAE of 11.73, exceeding traditional models by more than 60 % on average. The proposed method provides reliable technical support for intelligent perception and automated control in TBM tunneling.
Title: Intelligent prediction of TBM tunneling loads based on modal reconstruction and collaborative modeling. Automation in Construction, vol. 182, Article 106737.
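The three reported metrics (R², MAPE, MAE) can be computed as in the minimal sketch below. The `recombine` helper assumes that per-component forecasts are summed to rebuild the load signal; the abstract does not state the recombination rule, so that step and all the numbers are illustrative only.

```python
def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; assumes no zero targets."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination."""
    mean_t = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_t) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def recombine(components):
    """Sum per-component forecasts elementwise (assumed recombination rule)."""
    return [sum(values) for values in zip(*components)]


# Hypothetical load values and component forecasts.
y_true = [100.0, 200.0, 300.0, 400.0]
trend = [90.0, 190.0, 290.0, 390.0]
seasonal = [12.0, 8.0, 12.0, 8.0]
y_pred = recombine([trend, seasonal])

scores = {"MAE": mae(y_true, y_pred),
          "MAPE": mape(y_true, y_pred),
          "R2": r2(y_true, y_pred)}
```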
Pub Date: 2026-01-02 | DOI: 10.1016/j.autcon.2025.106747
Jie Zhou, Chao Ban, Chengjun Liu, Zeyao Li, Huade Zhou, Hsinming Shang
The distribution and evolution of the temperature field are key concerns in freezing restoration projects, but traditional methods are limited by sparse sensor placement and simplified simulation inputs. More effective and accurate methods are needed to determine the temperature field. A PSO-based digital twin model was developed and validated with a tunnel freezing restoration project in Bangkok, Thailand. By integrating real-time field temperature data, the model enables dynamic optimization of parameters, improving accuracy. Single-parameter optimization converges quickly, making it well suited to early-stage calibration, while multi-parameter optimization improves performance under complex conditions. In these cases, particle swarm optimization (PSO) outperforms the genetic algorithm (GA) and differential evolution (DE). When multiple measurement points are used, the model may become trapped in local optima; a hybrid optimization strategy (GA-PSO) provides an effective way to mitigate this. This paper demonstrates the model's feasibility and effectiveness, offering a practical approach for dynamic temperature management in complex freezing environments.
Title: Digital twin–driven temperature field optimization in tunnel freezing restoration using particle swarm optimization. Automation in Construction, vol. 182, Article 106747.
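Single-parameter calibration with particle swarm optimization can be sketched as follows. The exponential `simulate` function is a toy stand-in for a thermal model, and the parameter `k`, the depths, and the PSO coefficients are all assumptions chosen for illustration; none of them come from the paper.

```python
import math
import random


def simulate(k, depths, surface_temp=-20.0):
    """Toy stand-in for a thermal model: temperature decays with distance."""
    return [surface_temp * math.exp(-k * d) for d in depths]


def misfit(k, depths, observed):
    """Sum of squared differences between simulated and observed temperatures."""
    return sum((m - o) ** 2 for m, o in zip(simulate(k, depths), observed))


def pso_1d(objective, lower, upper, n_particles=20, n_iter=100, seed=0,
           inertia=0.7298, c_personal=1.49618, c_global=1.49618):
    """Minimal one-dimensional PSO with positions clamped to [lower, upper]."""
    rng = random.Random(seed)
    pos = [rng.uniform(lower, upper) for _ in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]
    for _ in range(n_iter):
        for i in range(n_particles):
            r_p, r_g = rng.random(), rng.random()
            # Pull each particle toward its own best and the swarm's best.
            vel[i] = (inertia * vel[i]
                      + c_personal * r_p * (best_pos[i] - pos[i])
                      + c_global * r_g * (g_pos - pos[i]))
            pos[i] = min(upper, max(lower, pos[i] + vel[i]))
            val = objective(pos[i])
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val < g_val:
                    g_pos, g_val = pos[i], val
    return g_pos, g_val


# Synthetic "field measurements" generated with a known parameter of 0.8.
depths = [0.5, 1.0, 1.5, 2.0]
observed = simulate(0.8, depths)
k_hat, err = pso_1d(lambda k: misfit(k, depths, observed), 0.01, 3.0)
```

In a digital twin setting the objective would be re-evaluated as new sensor readings arrive, so the same loop could be rerun on an updated `observed` list.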
Automated sewer defect detection has advanced through deep learning, particularly supervised methods using CCTV images, but these methods depend on large annotated datasets. This paper proposes a semi-supervised learning (SSL) approach to reduce labeling demands. The method comprises self-supervised pre-training on unlabeled images using SwAV (Swapping Assignments between multiple Views) followed by fine-tuning for multi-label classification. Experiments on the Sewer-ML dataset demonstrate that the SSL approach, trained on only 35k labeled images, achieves an F1-score of 69.11% and an F2_CIW of 54.22%, surpassing the fully supervised baseline trained from scratch on 1.04 million images. Increasing the unlabeled pre-training data further enhances performance, while ImageNet initialization consistently outperforms training from scratch. Self-supervised learning also helps mitigate the effects of mislabeled data, which is observed to be present even in the Sewer-ML ground truth. Overall, self-supervised learning provides an accurate, scalable, and cost-effective alternative to fully supervised approaches, particularly in data-scarce or imperfectly labeled scenarios.
Title: Self-supervised learning for multi-label sewer defect classification. Authors: Tugba Yildizli, Tianlong Jia, Jeroen Langeveld, Riccardo Taormina. Pub Date: 2026-01-02 | DOI: 10.1016/j.autcon.2025.106751. Automation in Construction, vol. 182, Article 106751.
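For readers unfamiliar with the two scores, the sketch below computes per-class F-beta for multi-label predictions and a weighted F2 average. The class weights here are hypothetical placeholders, and the normalization by the sum of weights is an assumption about how a class-importance-weighted F2 is formed; consult the Sewer-ML benchmark for the actual weights.

```python
def f_beta(tp, fp, fn, beta=1.0):
    """F-beta from confusion counts; beta > 1 favors recall over precision."""
    b2 = beta * beta
    denom = (1 + b2) * tp + b2 * fn + fp
    return (1 + b2) * tp / denom if denom else 0.0


def per_class_counts(y_true, y_pred, n_classes):
    """Return [tp, fp, fn] per class for multi-label 0/1 indicator rows."""
    counts = [[0, 0, 0] for _ in range(n_classes)]
    for t_row, p_row in zip(y_true, y_pred):
        for c in range(n_classes):
            if p_row[c] and t_row[c]:
                counts[c][0] += 1
            elif p_row[c]:
                counts[c][1] += 1
            elif t_row[c]:
                counts[c][2] += 1
    return counts


def weighted_f2(counts, weights):
    """Weighted mean of per-class F2 scores (weights are assumed inputs)."""
    total = sum(weights)
    return sum(w * f_beta(tp, fp, fn, beta=2.0)
               for w, (tp, fp, fn) in zip(weights, counts)) / total


# Hypothetical two-class example with four images.
y_true = [[1, 0], [1, 1], [0, 1], [0, 0]]
y_pred = [[1, 0], [0, 1], [0, 1], [1, 0]]
counts = per_class_counts(y_true, y_pred, n_classes=2)
macro_f1 = sum(f_beta(*c) for c in counts) / len(counts)
f2_weighted = weighted_f2(counts, weights=[3.0, 1.0])
```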
Pub Date: 2025-12-30 | DOI: 10.1016/j.autcon.2025.106742
Junhyung Cho, Mingyu Shin, Joongheon Kim, Soyi Jung
Autonomous excavation systems face fundamental challenges balancing computational tractability with operational sophistication. This paper presents the collaborative learning for excavation framework (CLEF), resolving this trade-off through strategic decomposition: separating high-level planning from low-level execution while maintaining collaborative optimization. The framework’s key contributions include a bidirectional information flow between specialized modules consisting of reinforcement learning for strategic planning using polar coordinates, and attention-enhanced generative adversarial imitation learning (A-GAIL) with multi-head attention capturing phase-specific temporal dependencies. Unlike monolithic approaches, which suffer from computational intractability, CLEF enables module specialization while coordinating through shared representations. Planning decisions condition trajectory generation while execution outcomes update environmental models, creating adaptive behavior without manual tuning. Validation demonstrates a 90.8% success rate compared to 71.1% for monolithic approaches, with trajectory generation achieving 91.3% completion, confirming the performance gains needed for construction automation.
Title: Collaborative learning architecture for autonomous excavator planning and execution. Automation in Construction, vol. 182, Article 106742.
Pub Date: 2025-12-29 | DOI: 10.1016/j.autcon.2025.106741
Wang Wang, Mingjing Xu, Zhen Cao, Jingzi Guo, Chong Liu, Haowei Zhang, Xiaoling Zhang
The automation of 3D bridge inspection is critically limited by scarce annotated data and a fundamental lack of understanding regarding which intrinsic point cloud features drive Sim-to-Real (S2R) success. The paper proposes a unified procedural synthesis framework to overcome this data bottleneck. The core contributions are twofold: (1) Dual-output generation, which yields segmented ground truth and the first bridge component-level point cloud completion dataset via physical simulation. (2) Systematic feature ablation, establishing a definitive S2R importance hierarchy: Surface Normals ≫ Geometry > RGB. This finding offers critical guidance for efficient sensor deployment and data synthesis. A model trained exclusively on synthetic data achieved a satisfactory 84.2% mIoU on a real-world benchmark, validating direct S2R transfer and proving synthetic data can substitute manual annotation. The validated methodology provides the foundation to seamlessly integrate procedural damage models, extending automation from component identification to defect detection for analysis-ready digital twins.
Title: Unified data synthesis for automated 3D Visual Inspection and digital twinning of bridges. Automation in Construction, vol. 182, Article 106741.
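The mIoU figure is a per-class intersection-over-union averaged across classes. A minimal sketch for per-point labels follows; the label values and the choice to skip classes absent from both prediction and ground truth are assumptions, not details from the paper.

```python
def mean_iou(y_true, y_pred, n_classes):
    """Mean intersection-over-union over classes present in either label set."""
    ious = []
    for c in range(n_classes):
        inter = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        union = sum(1 for t, p in zip(y_true, y_pred) if t == c or p == c)
        if union:  # skip classes that never occur (assumed convention)
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical per-point component labels for eight points, three classes.
true_labels = [0, 0, 0, 1, 1, 2, 2, 2]
pred_labels = [0, 0, 1, 1, 1, 2, 2, 0]
miou = mean_iou(true_labels, pred_labels, n_classes=3)
```

Repeating this evaluation with one input feature removed at a time (normals, coordinates, color) is the usual shape of a feature ablation study.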
Pub Date: 2025-12-27 | DOI: 10.1016/j.autcon.2025.106743
Guang-Zhu Zhang, Qingliang Xu, Hong-Feng Li, Chun-Peng Han, Qiushi Li
Accurate maintenance planning requires not only detecting pavement distress but also reconstructing its 3D geometry and reporting metrics. This paper develops Iterative Geometry Encoding Volume-Lite (IGEV-Lite), a compact derivative of IGEV-Stereo, and couples it with a variable-baseline stereo platform. IGEV-Lite adopts a GhostNetV2 backbone with feature transfer–re-encoding context network and a compact iterative updater, plus deployment accelerations; instance-level region of interest (ROI) cropping focuses computation, and plane-referenced, gridded integration yields maximum depth and integrated volume. Under a unified protocol, accuracy improves from an end-point error (EPE) of 0.608 to 0.584 px (pixels) and a disparity outlier rate (D1) of 3.24 % to 2.97 %, while latency drops from 135 ms to 97 ms. Quantification tests conducted at a perpendicular angle to the ground achieve 2.7 % depth and 0.9 % volume error at B = 240 mm. Combining a lightweight stereo backbone with plane-referenced integration provides deployment-ready, geometry-faithful quantification of distress.
Title: Real-time stereo reconstruction and geometric quantification of pavement distress with a variable-baseline platform. Automation in Construction, vol. 182, Article 106743.
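The quantities in this abstract follow from standard stereo relations, sketched below under stated assumptions: depth from disparity for a rectified pair (Z = f·B/d), end-point error, a D1 outlier rate using the common 3 px and 5 % thresholds, and a gridded volume integral below a reference plane. The focal length, grid values, and cell area are hypothetical, and the paper's exact D1 and integration conventions may differ.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_mm):
    """Depth in mm for a rectified stereo pair: Z = f * B / d."""
    return focal_px * baseline_mm / disparity_px


def end_point_error(pred, truth):
    """Mean absolute disparity error in pixels."""
    return sum(abs(p - t) for p, t in zip(pred, truth)) / len(truth)


def d1_outlier_rate(pred, truth, abs_thresh=3.0, rel_thresh=0.05):
    """Percent of pixels whose error exceeds both thresholds (assumed convention)."""
    bad = sum(1 for p, t in zip(pred, truth)
              if abs(p - t) > abs_thresh and abs(p - t) > rel_thresh * abs(t))
    return 100.0 * bad / len(truth)


def plane_referenced_volume(depth_grid_mm, plane_depth_mm, cell_area_mm2):
    """Return (max depth below plane in mm, integrated volume in mm^3)."""
    below = [max(0.0, z - plane_depth_mm) for row in depth_grid_mm for z in row]
    return max(below), sum(below) * cell_area_mm2


# Hypothetical camera: 1200 px focal length, 240 mm baseline.
z_mm = depth_from_disparity(288.0, focal_px=1200.0, baseline_mm=240.0)

pred_disp = [10.0, 20.5, 30.0, 44.0]
true_disp = [10.0, 20.0, 30.0, 40.0]
epe = end_point_error(pred_disp, true_disp)
d1 = d1_outlier_rate(pred_disp, true_disp)

# Hypothetical 2x2 depth grid over a pothole, road plane at 1000 mm.
max_depth, volume = plane_referenced_volume(
    [[1000.0, 1010.0], [1020.0, 1000.0]], plane_depth_mm=1000.0, cell_area_mm2=25.0)
```

Because depth is inversely proportional to disparity, a longer baseline yields larger disparities at the same range and therefore finer depth resolution, which is one motivation for a variable-baseline rig.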
Pub Date: 2025-12-27 | DOI: 10.1016/j.autcon.2025.106744
Sarah Mokhtar, Caitlin Mueller
Physical phenomena, including aerodynamics and heat transfer, exhibit complex shape-dependent relationships with building geometry, shaping microclimates that directly affect urban livability and comfort. In design and engineering practice, surrogate models reduce the computational burden of simulations, providing faster and more iterative performance feedback within design workflows. This paper introduces Per-FORM, a framework that leverages implicit neural representations (INRs) for predictive modeling in the built environment. The approach accommodates variations in geometric complexity, scale, and topology while representing continuous physical fields through decoupled modules encoding both geometry and building influence. Its ability to infer full-field and near-surface predictions is evaluated across multiple metrics, demonstrating state-of-the-art accuracy for complex geometries. Beyond predictive accuracy, Per-FORM brings simulation-in-the-loop feedback into digital workflows, supporting performance-informed exploration, ideation, and conceptualization, and enriching informed creative processes in design and engineering practice.
Title: Implicit neural representations for surrogate modeling in the built environment. Automation in Construction, vol. 182, Article 106744.
Pub Date: 2025-12-26 | DOI: 10.1016/j.autcon.2025.106730
Ben Huang, Fei Kang, Xi Liu
Accurate damage detection is critical for ensuring the safety and long-term stability of dams. However, conventional inspection methods often suffer from low automation, high labor intensity, and high costs. To address these limitations, this paper proposes an intelligent detection system based on an enhanced YOLOX framework, designed for real-time identification of multiple damage types in concrete dams using unmanned aerial vehicles (UAVs). The improved model is lightweight, containing only 8.94 million parameters, yet achieves a mAP50 of 0.821 and an F1-score of 0.781. Based on this model, detection software was implemented with the PyQt5 framework, and an integrated UAV-based system was constructed to support high-precision, real-time analysis of both image and video data. This approach provides an automated and intelligent solution for the visual inspection of concrete dam damage, offering significant potential for practical engineering applications and future intelligent monitoring systems.
Title: Intelligent UAV-based deep learning system for multi-class concrete dam damage detection. Automation in Construction, vol. 182, Article 106730.
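The subscript in mAP50 refers to an IoU threshold of 0.5 for counting a detection as correct. The sketch below shows box IoU and a simple greedy matching that yields the counts behind precision, recall, and F1. The boxes are hypothetical, and the greedy rule is a simplification of how detection benchmarks match predictions; it is not the paper's evaluation code.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x0, y0, x1, y1)."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    iw = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    ih = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = iw * ih
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


def match_detections(preds, truths, iou_thresh=0.5):
    """Greedy matching; preds are assumed sorted by confidence, highest first.

    Returns (true positives, false positives, false negatives).
    """
    unmatched = list(truths)
    tp = 0
    for box in preds:
        best = max(unmatched, key=lambda t: box_iou(box, t), default=None)
        if best is not None and box_iou(box, best) >= iou_thresh:
            unmatched.remove(best)  # each ground-truth box matches at most once
            tp += 1
    return tp, len(preds) - tp, len(unmatched)


# Hypothetical damage boxes in pixel coordinates.
truths = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [(1, 1, 10, 10), (50, 50, 60, 60)]
tp, fp, fn = match_detections(preds, truths)
f1 = 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0
```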
Pub Date: 2025-12-26 | DOI: 10.1016/j.autcon.2025.106738
Mingchen Li, Ziqi Hu, Parastoo Mohebi, Shuhao Li, Zhe Wang
To enhance energy efficiency and occupant satisfaction, modern buildings have collected rich streams of operational and sensor data. Semantic models for buildings, such as the Brick schema expressed in the Resource Description Framework (RDF) and Web Ontology Language (OWL), have standardized the representation of devices, points, and systems. However, non-expert users still face barriers to accessing such data, because effective use requires proficiency in the SPARQL Protocol and RDF Query Language (SPARQL) and navigation of thousands of interconnected nodes and relations. This paper presents BuildingGPT2, a framework that combines large language model fine-tuning, vector-graph retrieval-augmented generation, and chain-of-thought prompting to enable natural-language querying of Brick-based models. The framework was trained on semantic models from 40 real buildings and evaluated in a zero-shot setting on 5 held-out buildings. Using LLaMA 3.1–70B, SPARQL query generation accuracy improved from 49.25 % to 97.11 %, substantially lowering the barrier to interacting with building semantic models.
Title: Enhancing LLM-based building data query with chain-of-thought, retrieval-augmented generation, and fine-tuning. Automation in Construction, vol. 182, Article 106738.