Pub Date: 2025-01-17 DOI: 10.1016/j.autcon.2025.105962
Haoyu Zhang, Stephen Wu, Xiangyun Luo, Yong Huang, Hui Li
Computer vision applied to monitoring videos has been used to measure structural displacement. Noniterative algorithms are mainly designed for rapid tracking of individual image points rather than dense motion fields. Iterative algorithms are limited to motion fields with small amplitudes and incur high computational cost to achieve high accuracy. This paper introduces a noniterative method for vision-based measurement that balances speed and density. The method employs an attention-based matching strategy applied to Transformer-enhanced image features. Motion priors and a physics-informed denoising approach are integrated to improve measurement accuracy. Tested on challenging truss and cable-stayed bridge vibration videos, the method demonstrated superior displacement measurement performance compared to conventional approaches. It also achieved greater robustness to brightness changes and partial occlusions while requiring minimal human intervention. This method supports the development of automated and affordable vibration monitoring systems.
Title: Efficient matching of Transformer-enhanced features for accurate vision-based displacement measurement
Journal: Automation in Construction (IF 10.3)
Pub Date: 2025-01-17 DOI: 10.1016/j.autcon.2025.105964
Yue Pan, Wen He, Jin-Jian Chen
This paper presents a hybrid deep learning model named the Online Learning-based Multi-Attribute Spatial-Temporal Transformer Network (OMSTTN) to predict excavation-induced risks during foundation pit excavation. OMSTTN integrates a hybrid Transformer offline model with a parallel embedding layer to process diverse monitoring attributes and employs a Spatial-Temporal Transformer block to capture complex spatiotemporal correlations. An online learning mechanism enables dynamic adaptation to evolving conditions, enhancing prediction accuracy. Validated on a real-world XuZhou Rail Transit project, OMSTTN achieves strong prediction performance (MAE: 0.0461, RMSE: 0.0699, R²: 0.9441). Comparative experiments demonstrate its effectiveness in handling multi-attribute data, dynamic changes, and spatiotemporal patterns. In short, OMSTTN narrows the research gap by providing a spatiotemporal framework for accurate risk prediction, offering significant potential for early risk detection and proactive management in excavation engineering.
Title: Spatiotemporal deep learning for multi-attribute prediction of excavation-induced risk
Journal: Automation in Construction (IF 10.3)
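The three scores quoted in the abstract above (MAE, RMSE, R²) have standard definitions. The sketch below computes them for a pair of sequences. The function name `regression_metrics` and the sample values are illustrative assumptions, not taken from the paper, and the paper's own evaluation pipeline may differ (for example in normalization).

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, R^2) for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    # Mean absolute error: average magnitude of the prediction errors.
    mae = sum(abs(e) for e in errors) / n

    # Root mean squared error: penalizes large errors more heavily.
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)

    # Coefficient of determination: 1 minus residual over total variance.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return mae, rmse, r2


# Hypothetical monitoring values, for illustration only.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.0, 2.5, 4.0]
print(regression_metrics(observed, predicted))
```

R² is undefined when all observed values are equal (zero total variance); a production version would guard against that case.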
This paper introduces a Neural Network Solver (NNS) for the Railway Geometry Rectification Linear Program Model (RGRLPM), integrating tamping and fine-tuning operations for millimeter-precision adjustments. The NNS, enhanced by a grad norm process for faster convergence, produces rectification plans three times faster than the simplex method. Dynamic programming is applied to allocate adjustments between tamping and fine-tuning. Experiments reveal that scaling the 10 m and 5/30 m chord offset limits by a reduction factor of 0.4 improves dynamic performance over manual schemes. At a reduction factor of 0.2, cumulative rectification decreases by 5.6% and the Sperling index drops by 26.9%, indicating better efficiency and dynamic outcomes.
Title: Optimizing Railway Track Tamping and Geometry Fine-Tuning Allocation Using a Neural Network-Based Solver
Authors: Congyang Xu, Huakun Sun, Siyuan Zhou, Zhiting Chang, Yanhua Guo, Ping Wang, Weijun Wu, Qing He
Pub Date: 2025-01-16 DOI: 10.1016/j.autcon.2024.105958
Journal: Automation in Construction (IF 10.3)
Pub Date: 2025-01-16 DOI: 10.1016/j.autcon.2025.105976
Xiaohua Bao, Junhong Li, Jun Shen, Xiangsheng Chen, Zefan Huang, Hongzhi Cui
The quality of shield-tunnel segment assembly is subject to uncertainty from multiple influencing factors, and quantifying the probabilistic coupling effects of these factors is challenging. This paper presents a method for assessing shield-tunnel segment assembly quality using a copula model with numerical simulation. A two-dimensional joint probability-distribution model is developed to describe the influencing factors, establishing a reliability-based evaluation system for segment assembly quality. Key steps include tunnel section division, marginal and copula function selection, and reliability assessment, focusing on a large submarine shield tunnel. A Monte Carlo simulation examines the impact of various copula models on reliability estimates, validating the proposed method. Key findings show that (1) the selection of marginal and copula functions significantly affects the estimated segment assembly quality and reliability, with the commonly used Gaussian copula not always being optimal, and (2) failure probabilities can vary by up to 84 times due to differing construction conditions and geological factors across tunnel sections.
Title: Evaluation of shield-tunnel segment assembly quality using a copula model and numerical simulation
Journal: Automation in Construction (IF 10.3)
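To make the copula-plus-Monte-Carlo idea in the abstract above concrete, the following sketch estimates a joint failure probability for two dependent factors linked by a Gaussian copula. Everything specific here is an assumption for illustration: the exponential and normal marginals, the thresholds, the correlation values, and the function names are hypothetical and are not the paper's model or data.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def sample_gaussian_copula(rho, rng):
    """Draw one pair (u1, u2) of uniforms with Gaussian-copula dependence rho."""
    z1 = rng.gauss(0.0, 1.0)
    # Build a second standard normal with correlation rho to the first.
    z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
    return STD_NORMAL.cdf(z1), STD_NORMAL.cdf(z2)


def failure_probability(rho, n_samples=20000, seed=0):
    """Monte Carlo estimate of P(x1 > 2.0 and x2 > 1.0) (hypothetical limits)."""
    rng = random.Random(seed)
    x2_marginal = NormalDist(mu=0.0, sigma=1.0)  # assumed marginal for factor 2
    failures = 0
    for _ in range(n_samples):
        u1, u2 = sample_gaussian_copula(rho, rng)
        # Clamp away from 0 and 1 so the inverse CDFs stay finite.
        u1 = min(max(u1, 1e-12), 1.0 - 1e-12)
        u2 = min(max(u2, 1e-12), 1.0 - 1e-12)
        x1 = -math.log(1.0 - u1)      # inverse CDF of an Exponential(1) marginal
        x2 = x2_marginal.inv_cdf(u2)  # inverse CDF of the normal marginal
        if x1 > 2.0 and x2 > 1.0:
            failures += 1
    return failures / n_samples


for rho in (-0.8, 0.0, 0.8):
    print(rho, failure_probability(rho))
```

Swapping `sample_gaussian_copula` for a sampler of another copula family while keeping the marginals fixed is one way to see how the choice of copula alone changes the estimated failure probability.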
Soft-body climbing robots can automatically adapt to the external shape of the climbing surface, but their load-carrying capacity and output torque are insufficient. To address this problem, a bionic climbing robot is designed that adapts to a range of complex climbing surfaces while providing a high load-bearing capacity. The proposed robot consists of three bionic crab-pincer gripping structures and two retractable torsos, and its gripping action is cable-driven. Mechanical models of the cable-driven and rotatable joints were established, and the relationship between motor input torque and end force was determined. The experimental results show that the robot adapts well to climbing surfaces of different materials and shapes and climbs stably. Its maximum pipe climbing diameter is 290 mm, and its maximum load capacity is 10.5 kg.
Title: Structural design and optimization of adaptive soft adhesion bionic climbing robot
Authors: Huaixin Chen, Quansheng Jiang, Zihan Zhang, Shilei Wu, Yehu Shen, Fengyu Xu
Pub Date: 2025-01-16 DOI: 10.1016/j.autcon.2025.105975
Journal: Automation in Construction (IF 10.3)
Pub Date: 2025-01-13 DOI: 10.1016/j.autcon.2024.105953
Tsung-Wei Huang, Yi-Hsiang Chen, Jacob J. Lin, Chuin-Shan Chen
On-site rebar inspection is crucial for structural safety but remains labor-intensive and time-consuming. While deep learning presents a promising solution, existing research often relies on limited real-world labeled data. This paper introduces a framework to train a deep learning model for on-site rebar instance segmentation without human labeling. Synthetic data are generated from BIM models, creating a Synthetic On-site Rebar Dataset (SORD) with 25,287 labeled images. Domain adaptation is incorporated to bridge the gap between synthetic and real-world non-labeled data. This approach eliminates the need for human labeling. It significantly enhances model performance, achieving a threefold improvement in Average Precision (AP) metrics compared to models trained on limited real-world data. Additionally, the proposed method demonstrates superior performance across various on-site rebar images collected online, underscoring its generalizability and practical applications.
Title: Deep learning without human labeling for on-site rebar instance segmentation using synthetic BIM data and domain adaptation
Journal: Automation in Construction (IF 10.3)
Pub Date: 2025-01-13 DOI: 10.1016/j.autcon.2025.105972
Yonghui An, Jianren Ning, Chuanchuan Hou, Jinping Ou
Automatic Unmanned Aerial Vehicle (UAV) flight is increasingly popular for structural surface inspection. To address the low level of automation and the insufficient adaptation of flight paths to environmental obstacles, a method for automatically planning UAV inspection missions based on the Geometric Digital Twin (GDT) model and Voxelized Obstacle Information (VOI) is proposed. First, a method for shifting the Field of View (FOV) centroids in parallel is proposed to efficiently generate inspection waypoints. Second, a waypoint adjustment method based on the environmental VOI of 3D point clouds is proposed to address safety issues. Third, a method combining a Genetic Algorithm (GA) with A* based on VOI is proposed for optimizing the UAV flight path to avoid real-world obstacles. The feasibility of the proposed methods was verified on both an office building and a steel truss bridge. Compared to existing methods, the efficiency is significantly improved.
Title: Efficient low-collision UAV-based automated structural surface inspection using geometric digital twin and voxelized obstacle information
Journal: Automation in Construction (IF 10.3)
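The abstract above pairs a Genetic Algorithm with A* over voxelized obstacle information. As a rough illustration of the A* half only, the sketch below searches a small 3D occupancy grid with 6-connected moves and unit step cost. The grid layout, the function name `astar_voxels`, and the Manhattan heuristic are assumptions made for this example; the paper's actual cost model, connectivity, and GA coupling are not described in the abstract.

```python
import heapq


def astar_voxels(occupied, size, start, goal):
    """Return a shortest 6-connected path of free voxels, or None if blocked.

    occupied: set of (x, y, z) voxels that contain obstacles.
    size:     (nx, ny, nz) grid dimensions; valid indices are 0..n-1.
    """
    def heuristic(voxel):
        # Manhattan distance is admissible for unit-cost axis-aligned moves.
        return sum(abs(a - b) for a, b in zip(voxel, goal))

    steps = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
    open_heap = [(heuristic(start), 0, start)]  # entries are (f, g, voxel)
    came_from = {start: None}
    g_cost = {start: 0}

    while open_heap:
        _, g, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start.
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if g > g_cost[current]:
            continue  # stale heap entry; a cheaper route was already found
        for step in steps:
            nxt = tuple(c + d for c, d in zip(current, step))
            if any(c < 0 or c >= s for c, s in zip(nxt, size)):
                continue  # outside the grid
            if nxt in occupied:
                continue  # voxel contains an obstacle
            new_g = g + 1
            if new_g < g_cost.get(nxt, float("inf")):
                g_cost[nxt] = new_g
                came_from[nxt] = current
                heapq.heappush(open_heap, (new_g + heuristic(nxt), new_g, nxt))
    return None


# Hypothetical 3 x 3 x 1 grid with a partial wall between start and goal.
wall = {(1, 0, 0), (1, 1, 0)}
print(astar_voxels(wall, (3, 3, 1), (0, 0, 0), (2, 0, 0)))
```

In a GA-plus-A* scheme of the kind the abstract names, a routine like this would typically supply collision-free leg costs between waypoints while the GA searches over visiting orders; that division of labor is an assumption here, not a detail confirmed by the source.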
Pub Date: 2025-01-13 DOI: 10.1016/j.autcon.2025.105961
Honghong Song, Xiaofeng Zhu, Haijiang Li, Gang Yang
As bridges age, manual repair decision-making methods struggle to meet growing maintenance demands. This paper develops AI systems that can imitate experts' decision processes by mining implicit relationships between bridge damage images and corresponding repair proposals. A multimodal deep learning-based end-to-end decision-making method is proposed to extract and map features of bridge damage images and repair proposal texts, automating damage repair proposal generation. The model is trained and validated using a dataset from historical inspection reports. The model's image feature extraction is evaluated using Class Activation Mapping (CAM), while text generation achieved BLEU-1 to BLEU-4 scores of 0.76, 0.743, 0.712, and 0.705, respectively, with 82 % accuracy in human evaluation. The results indicate the model's effectiveness in handling complex image features and generating long text, addressing challenges in automated bridge repair decision-making.
Title: Multimodal deep learning-based automatic generation of repair proposals for steel bridge shallow damage
Journal: Automation in Construction (IF 10.3)
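The BLEU-1 to BLEU-4 scores in the abstract above measure n-gram overlap between generated and reference text. The sketch below is a minimal single-reference, sentence-level BLEU with clipped n-gram precision and the usual brevity penalty. It is an illustration under assumptions: the paper does not state which BLEU variant (corpus-level, smoothed, multi-reference) it used, and the function names and example sentences are hypothetical.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams of a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def bleu(candidate, reference, max_n):
    """Cumulative BLEU-max_n with uniform weights, no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        cand = ngrams(candidate, n)
        ref = ngrams(reference, n)
        total = sum(cand.values())
        # Clip each candidate n-gram count by its count in the reference.
        clipped = sum(min(count, ref[gram]) for gram, count in cand.items())
        if total == 0 or clipped == 0:
            return 0.0  # without smoothing, any zero precision gives BLEU = 0
        log_precisions.append(math.log(clipped / total))

    # Brevity penalty discourages candidates shorter than the reference.
    c, r = len(candidate), len(reference)
    brevity_penalty = 1.0 if c > r else math.exp(1.0 - r / c)
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


# Hypothetical repair-proposal sentences, for illustration only.
reference = "remove rust and repaint the corroded steel plate".split()
candidate = "remove rust and repaint the steel plate".split()
for n in range(1, 5):
    print(f"BLEU-{n}:", round(bleu(candidate, reference, n), 3))
```

Because higher-order n-grams match less often, cumulative scores generally decrease from BLEU-1 to BLEU-4, which is the pattern the reported values (0.76 down to 0.705) also show.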
Pub Date: 2025-01-13 DOI: 10.1016/j.autcon.2025.105967
Sizhong Qin, Wenjie Liao, Yuli Huang, Shulu Zhang, Yi Gu, Jin Han, Xinzheng Lu
Traditional reinforced concrete (RC) frame design depends on extensive engineering experience and iterative verification processes, often resulting in significant inefficiencies. The diversity in the topologies and behaviors of structural components further presents considerable obstacles to effective machine learning applications in design. This paper introduces an approach using heterogeneous graph neural networks (HetGNNs) to automate and optimize the dimensioning of frame components. This method captures the distinct frame topologies by developing a precisely tailored heterogeneous graph node representation. Leveraging a unique dataset derived from engineering drawings, the HetGNN model learns to size the component sections accurately. It is demonstrated that this method offers a transformative improvement in the efficiency, accuracy, and cost-effectiveness of structural design while adhering to design standards. The size design of RC frame structures can be completed in under one second, with an average size deviation of around 50 mm (one module) compared to those designed by engineers.
Title: Intelligent design for component size generation in reinforced concrete frame structures using heterogeneous graph neural networks
Journal: Automation in Construction (IF 10.3)
Pub Date: 2025-01-13 DOI: 10.1016/j.autcon.2025.105971
Fei Kang, Ben Huang, Gang Wan
Underwater damage poses significant risks to the safe operation of dams, making timely detection critical. Traditional manual inspection methods are hazardous, time-consuming, and labor-intensive. This paper introduces an automated detection system integrating remotely operated vehicles (ROVs) and enhanced deep-learning technologies. The proposed YOLOv8n-DCW model incorporates deformable convolution networks, coordinate attention mechanisms (CoordAtt), and an improved loss function to boost detection performance. Trained on an underwater dam damage dataset, the model achieved an 84.5 % mean average precision. Ablation studies validated the effectiveness of these enhancements, while comparative experiments demonstrated the superiority of YOLOv8n-DCW over existing models and CoordAtt's advantage among attention mechanisms. The developed detection software, integrated with the ROV, was tested in a laboratory pool, confirming its practicality and efficiency. This system offers a safer, faster, and cost-effective solution for underwater dam damage detection, addressing limitations of traditional methods and providing a robust tool for engineering applications.
Title: Automated detection of underwater dam damage using remotely operated vehicles and deep learning technologies
Journal: Automation in Construction (IF 10.3)