
Automation in Construction: Latest Articles

Automated diagnosis of bridge expansion joint defects using voiceprint features and deep learning
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-27 | DOI: 10.1016/j.autcon.2025.106739
Yixuan Chen, Hongzhe Zhao, Yichao Xu, Yufeng Zhang, Jian Zhang
Bridge Expansion Joints (BEJs) are crucial for bridge safety, yet their acoustic signals are complex and easily disturbed by traffic noise, limiting traditional identification accuracy. To address this, an intelligent monitoring system based on voiceprint features and deep learning is developed. Its key contributions include: (1) a cloud-edge collaborative voiceprint monitoring device that integrates audio sampling, embedded processing, a cloud server, and wireless transmission, enabling long-term data collection and remote diagnosis in noisy environments; (2) the use of first- and second-order differential Mel Frequency Cepstral Coefficients (MFCC) for feature extraction, improving discriminability; and (3) the Hybrid Attention Fusion Network (HAFNet), built on a pre-trained convolutional backbone with multi-scale attention, achieving high-precision recognition of typical BEJ faults, with testing accuracies of 97.99% and 99.00% for two vehicle types. Field experiments demonstrate the system's stability, reliability, and feasibility for real-time BEJ monitoring.
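The first- and second-order differential MFCC features mentioned in contribution (2) can be illustrated with a small sketch. It assumes the MFCC frames have already been computed elsewhere, and it uses the common regression-style delta with a window of two frames and repeated edge frames; the abstract does not state which delta formula or window the authors use, so the function names and the `width` parameter here are illustrative assumptions only.

```python
from typing import List

Frames = List[List[float]]  # one list of cepstral coefficients per time frame


def delta(frames: Frames, width: int = 2) -> Frames:
    """Regression-style time derivative of each coefficient.

    Assumption: edge frames are repeated as padding; `width` is a
    hypothetical default, not a value taken from the paper.
    """
    n_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out: Frames = []
    for t in range(n_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, n_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def stack_with_deltas(frames: Frames) -> Frames:
    """Concatenate static, first-order, and second-order features per frame."""
    d1 = delta(frames)
    d2 = delta(d1)  # second-order delta is the delta of the first-order delta
    return [static + a + b for static, a, b in zip(frames, d1, d2)]
```

For a coefficient that rises linearly over time, the first-order delta in the interior of the sequence equals the slope, which is a quick sanity check on the formula. Stacking triples the feature dimension per frame.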
Automation in Construction, Volume 183, Article 106739.
Citations: 0
Weld seam extraction and path generation for robotic welding of steel structures based on 3D vision
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-27 | DOI: 10.1016/j.autcon.2026.106792
Jinxin Yi, Xuan Kong, Hao Tang, Jie Zhang, Zhenming Chen, Lu Deng
Recent advances in computer vision have provided new solutions for intelligent welding. However, existing vision-based weld seam extraction techniques exhibit limited adaptability to various workpieces in unstructured environments. Therefore, this paper proposes a three-dimensional vision-based method tailored for weld seam extraction and path generation. The proposed method combines a deep learning-based point cloud segmentation technique with an improved multi-scale point cloud registration algorithm to reconstruct the complete point cloud model of all weld regions in the workpieces. Subsequently, the welding paths and torch poses are calculated using an optimized multi-plane fitting algorithm integrated with a geometric model of the weld seam. Experimental validation on four workpieces demonstrates that the proposed method achieves good accuracy and outperforms existing techniques in terms of efficiency and applicability, offering a robust solution for automated welding of steel structures.
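As geometric intuition for the multi-plane fitting step, a seam between two flat plates can be thought of as the line where the two fitted planes meet. The sketch below computes that line for two planes given in the form n · x = d. It is only an illustration under that assumption: the abstract does not describe the authors' optimized fitting algorithm or seam model, and the function names are hypothetical.

```python
import math
from typing import Tuple

Vec = Tuple[float, float, float]


def cross(a: Vec, b: Vec) -> Vec:
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a: Vec, b: Vec) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def plane_intersection(n1: Vec, d1: float, n2: Vec, d2: float) -> Tuple[Vec, Vec]:
    """Return (point, unit direction) of the line shared by two planes.

    Planes are n1 . x = d1 and n2 . x = d2 (assumed already fitted to
    the segmented plate points; the fitting itself is not shown).
    """
    u = cross(n1, n2)                # the line direction is normal to both normals
    norm_sq = dot(u, u)
    if norm_sq < 1e-12:
        raise ValueError("planes are parallel; no unique intersection line")
    a = cross(n2, u)
    b = cross(u, n1)
    point = tuple((d1 * a[i] + d2 * b[i]) / norm_sq for i in range(3))
    norm = math.sqrt(norm_sq)
    direction = tuple(c / norm for c in u)
    return point, direction
```

For example, the planes z = 2 and y = 3 intersect in a line parallel to the x axis passing through (0, 3, 2). A torch pose would additionally need an orientation, for instance one derived from the two plane normals, which this sketch leaves out.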
Automation in Construction, Volume 183, Article 106792.
Citations: 0
LLM-enabled multi-agent framework for natural language interaction with graph-based digital twins
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-27 | DOI: 10.1016/j.autcon.2026.106791
Yuandong Pan, Mudan Wang, Linjun Lu, Rabindra Lamsal, Erika Pärn, Sisi Zlatanova, Ioannis Brilakis
Digital twins are increasingly used in the Architecture, Engineering, and Construction (AEC) industry, but their adoption is often hindered by the need for specialised knowledge, such as database querying. This paper presents Graph-DT-GPT, a multi-agent framework that integrates Large Language Models (LLMs) with graph-based digital twins to enable natural language interaction. The framework is designed with modular agents, including decision, query generation, and answer extraction, and grounds all LLMs’ outputs in structured graph data to improve response reliability and reduce hallucinations. The framework is evaluated on two use cases: a city-level graph with over 40,000 building nodes and room-level apartment layout graphs. Graph-DT-GPT achieves 100% and 95.5% answer correctness using Claude Sonnet 4.5 and GPT-4o, respectively, in the city-scale case, and 100% correctness in the room-level case, significantly outperforming baseline methods including LangChain Neo4j pipelines by approximately 40% and 10%, respectively. These results demonstrate its scalability and potential to enhance accessible, accurate information retrieval in AEC digital twin applications.
Automation in Construction, Volume 183, Article 106791.
Citations: 0
Control strategies for Cellular Automata-based generative design in architecture and urbanism
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-27 | DOI: 10.1016/j.autcon.2025.106754
Yiming Liu, Christiane M. Herr
As Artificial Intelligence transforms design through decentralised and self-organising generative systems, Cellular Automata (CA) exemplify a foundational yet underexplored paradigm capable of bridging rule-based emergence and computational creativity in architecture and urbanism. Driven by simple local rules, CA produce spatially responsive and systemic patterns well-suited to capturing the dynamics of complex interrelated systems, making them valuable for generative design exploration. This review systematically investigates control strategies for guiding CA-based generative processes. It identifies temporal logic methods for adjusting CA behaviour through bibliometric analysis. The review further demonstrates control factors, computational control, and human-mediated control, analysing their impact on the adaptability of CA design processes at each stage through the content-based synthesis. The results reveal the advantages of different control strategies in guiding goal-directed CA generation. This study advances the understanding of CA-based design mechanisms and highlights opportunities to develop intelligent control, process-oriented design tools integrating data-driven and AI technologies.
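To make "simple local rules" concrete, here is a minimal sketch of a one-dimensional elementary cellular automaton, where each cell's next state depends only on itself and its two neighbours. The optional `frozen` set is a hypothetical stand-in for an external control (for example, cells a designer has fixed by hand); it is not one of the control strategies catalogued in the review, whose details the abstract does not give.

```python
from typing import FrozenSet, List


def step(cells: List[int], rule: int = 90,
         frozen: FrozenSet[int] = frozenset()) -> List[int]:
    """Advance an elementary CA by one generation with wrap-around edges.

    `rule` is the usual 0-255 rule number. `frozen` (an illustrative
    assumption) lists cell indices that keep their current state.
    """
    n = len(cells)
    nxt = []
    for i in range(n):
        if i in frozen:
            nxt.append(cells[i])      # controlled cell: the rule is not applied
            continue
        left, centre, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        idx = (left << 2) | (centre << 1) | right   # neighbourhood as a 3-bit number
        nxt.append((rule >> idx) & 1)               # look up that bit of the rule
    return nxt
```

With rule 90, a single live cell in `[0, 0, 0, 1, 0, 0, 0]` becomes `[0, 0, 1, 0, 1, 0, 0]` after one step. Freezing the centre cell instead yields `[0, 0, 1, 1, 1, 0, 0]`, showing how even a small constraint changes the emergent pattern.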
Automation in Construction, Volume 183, Article 106754.
Citations: 0
Bridging dual knowledge graphs for multi-hop question answering in construction safety
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-26 | DOI: 10.1016/j.autcon.2026.106794
Yuxin Zhang, Xi Wang, Mo Hu, Zhenyu Zhang
Information retrieval and question answering from safety regulations are essential for automated construction compliance checking but are hindered by the linguistic and structural complexity of regulatory text. Many queries are multi-hop, requiring synthesis across interlinked clauses. To address the challenge, this paper introduces BifrostRAG, a dual-graph retrieval-augmented generation (RAG) system that models both linguistic relationships and document structure. The proposed architecture supports a hybrid retrieval mechanism that combines graph traversal with vector-based semantic search, enabling large language models to reason over both the content and the structure of the text. On a multi-hop question dataset, BifrostRAG achieves 92.8% precision, 85.5% recall, and an F1 score of 87.3%. These results significantly outperform vector-only and graph-only RAG baselines, establishing BifrostRAG as a robust knowledge engine for LLM-driven compliance checking. The dual-graph, hybrid retrieval mechanism presented in this paper offers a transferable blueprint for navigating complex technical documents across knowledge-intensive engineering domains.
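A note on reading the reported metrics: the harmonic mean of 92.8% precision and 85.5% recall is about 89.0%, not 87.3%. One common reason for such a gap is that the scores are computed per question and then averaged, in which case the mean F1 need not equal the F1 of the mean precision and recall. The abstract does not say how the figures were aggregated, so the sketch below only illustrates that effect; the function names and the toy clause identifiers are assumptions.

```python
from typing import Iterable, Set, Tuple


def prf(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float, float]:
    """Precision, recall, and F1 for one question's retrieved items."""
    tp = len(retrieved & relevant)
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    f1 = 2 * p * r / (p + r) if (p + r) else 0.0
    return p, r, f1


def macro_average(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> Tuple[float, float, float]:
    """Average each metric over questions (macro averaging)."""
    scores = [prf(retrieved, relevant) for retrieved, relevant in pairs]
    n = len(scores)
    return tuple(sum(s[i] for s in scores) / n for i in range(3))


# Toy data: hypothetical clause IDs, not taken from the paper's dataset.
questions = [
    ({"c1", "c2"}, {"c1"}),        # precision 0.5, recall 1.0
    ({"c3"}, {"c3", "c4"}),        # precision 1.0, recall 0.5
]
mean_p, mean_r, mean_f1 = macro_average(questions)
f1_of_means = 2 * mean_p * mean_r / (mean_p + mean_r)
```

In this toy case the averaged precision and recall are both 0.75, so the F1 of the means is 0.75, while the mean of the per-question F1 scores is about 0.667.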
Automation in Construction, Volume 183, Article 106794.
Citations: 0
Margin-aware maximum classifier discrepancy for BIM-to-scan semantic segmentation of building point clouds
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-24 | DOI: 10.1016/j.autcon.2026.106799
Difeng Hu, You Dong, Mingkai Li, Hanmo Wang, Tao Wang
BIM-derived point clouds are valuable for semantic segmentation and BIM modeling, but distribution discrepancies between BIM and real-world scans significantly degrade segmentation performance. To mitigate this issue, this paper develops a margin-aware maximum classifier discrepancy (MMCD) method, which extends the conventional MCD framework by incorporating a margin-aware mechanism. Task-specific classifiers act as discriminators to encourage the feature generator to learn domain-invariant yet discriminative features for unlabeled real point clouds, improving BIM-to-scan distribution alignment and segmentation accuracy. A margin-aware discrepancy loss is formulated to enforce sufficient margin between features and classification boundaries, improving robustness to domain shift. In addition, a training strategy is proposed to support MMCD optimization. Finally, a refined RandLA-Net with an attention-based upsampling module is constructed as the backbone for validation. Experiments demonstrate that the proposed approach achieves superior performance, with an IoU of 72.79% and an overall accuracy of 87.99%, outperforming RandLA-Net variants with or without MCD.
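For readers unfamiliar with the conventional MCD framework that MMCD extends, the core quantity is the discrepancy between two classifiers' predictions on the same unlabeled sample, usually taken as the mean absolute difference of their class probabilities. The sketch below shows only that baseline quantity for a single point. The margin-aware term is not specified in the abstract and is deliberately left out, and the function names are illustrative.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over one sample's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def discrepancy(logits_a: List[float], logits_b: List[float]) -> float:
    """Mean absolute difference between two classifiers' class probabilities.

    This is the conventional MCD discrepancy for one sample; the paper's
    margin-aware loss is an extension that is not reproduced here.
    """
    p_a, p_b = softmax(logits_a), softmax(logits_b)
    return sum(abs(a - b) for a, b in zip(p_a, p_b)) / len(p_a)
```

In MCD-style training, the two classifiers are updated to increase this value on target-domain samples while the feature generator is updated to decrease it, which pushes target features away from the decision boundaries. Identical logits give a discrepancy of zero.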
Automation in Construction, Volume 183, Article 106799.
Citations: 0
Computer vision for infrastructure defect detection: Methods and trends
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-24 | DOI: 10.1016/j.autcon.2026.106795
Yufei Zhang, Gang Li, Runjie Shen
Infrastructure defect detection is vital for public safety and sustainable societal development. In recent years, advances in computer vision have gradually promoted the intelligence and automation of infrastructure defect detection. This paper provides a comprehensive overview of research progress and emerging trends in computer vision-based detection of diverse defect types across multiple infrastructure scenarios, including datasets, evaluation metrics, and methods. A classification framework is introduced that centers on single and multiple visual modalities. The former includes traditional image processing, machine learning, and deep learning techniques, reflecting the evolution of the field. The latter focuses on data-level, feature-level, and decision-level fusion strategies, highlighting opportunities to improve detection performance with multiple visual modalities. Methods are further categorized according to their characteristics and model architectures. Finally, existing challenges are summarized, and promising research directions are outlined based on the strengths and limitations of current methods.
Automation in Construction, Volume 183, Article 106795.
Citations: 0
GPS-free automated registration of UAV-captured façade image sequences to BIM using semantic key points
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-23 | DOI: 10.1016/j.autcon.2026.106788
Cong Chen, Shenghan Zhang
Unmanned Aerial Vehicles (UAVs) have emerged as essential tools for building façade inspection. However, due to the repeating patterns on façades, automatically registering UAV-captured images to Building Information Modeling (BIM) models, though important for building maintenance, remains challenging. Existing methods often rely on GPS data, which lack sufficient accuracy in urban environments. This paper proposes a GPS-free automated framework to register UAV-captured image sequences to BIM models by leveraging information from overlapping images. The framework comprises three key components: (1) extracting semantic key points from images using Grounded SAM 2; (2) implementing a virtual UAV camera model to enable bidirectional projection of key points between BIM coordinates and image coordinates; and (3) developing a particle filter motion model to achieve image-to-BIM registration using image sequences. The proposed method registers various data types to BIM models, including overlapping visual image sequences, infrared (IR)-visual pairs, and façade defects.
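The bidirectional projection in component (2) can be pictured with a plain pinhole camera model, sketched below. It assumes points are already expressed in the camera frame (with z pointing forward) and ignores lens distortion and the camera pose relative to the BIM model; the abstract does not specify the authors' virtual camera model, so the intrinsics `fx`, `fy`, `cx`, `cy` and the function names are illustrative assumptions.

```python
from typing import Tuple


def project(point_cam: Tuple[float, float, float],
            fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float]:
    """Map a 3D point in the camera frame to pixel coordinates (pinhole model)."""
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (fx * x / z + cx, fy * y / z + cy)


def back_project(pixel: Tuple[float, float], depth: float,
                 fx: float, fy: float, cx: float, cy: float) -> Tuple[float, float, float]:
    """Map a pixel with known depth back to a 3D point in the camera frame."""
    u, v = pixel
    return ((u - cx) * depth / fx, (v - cy) * depth / fy, depth)
```

With hypothetical intrinsics fx = fy = 800, cx = 320, cy = 240, the point (1, 2, 4) projects to pixel (520, 640), and back-projecting that pixel at depth 4 recovers the original point. The depth is needed for the inverse direction because a pixel alone only defines a ray.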
Automation in Construction, Volume 183, Article 106788.
Citations: 0
Project-level automated pavement maintenance and rehabilitation decision-making with data imbalance mitigation and post-maintenance evaluation
IF 11.5 | Zone 1, Engineering Technology | Q1 CONSTRUCTION & BUILDING TECHNOLOGY | Pub Date: 2026-01-22 | DOI: 10.1016/j.autcon.2026.106796
Qingwei Zeng, Shunxin Yang, Chang Xu, Jitong Ding, Qiwei Chen, Guoyang Lu
Pavement management data often suffers from severe class imbalance, and existing project-level maintenance and rehabilitation (M&R) decision-making models generally lack post-maintenance evaluation mechanisms. To address these issues, this paper proposes PMDNN, a project-level automated pavement M&R decision-making framework that accounts for data imbalance and incorporates post-maintenance evaluation. First, a Conditional Tabular Generative Adversarial Network (CTGAN) is developed to augment imbalanced M&R datasets. Next, two deep neural networks (DNNs) are constructed, one for pavement performance prediction and one for M&R decision-making. Finally, these two DNNs are nested to enable post-maintenance evaluation, supporting iterative adjustment of suboptimal M&R plans. Results demonstrate that the CTGAN effectively addresses data imbalance and accurately simulates the distribution of the original data. Compared with other data augmentation models, the CTGAN generates data with 4.7%–18.1% higher quality. Additionally, relative to multiple baseline frameworks, the proposed PMDNN framework achieves a 1.91%–4.71% higher overall decision accuracy. These findings indicate that PMDNN can support pavement management systems in making decisions more closely aligned with expert judgment.
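To show what "augmenting an imbalanced dataset" means in the simplest terms, the sketch below balances class counts by randomly duplicating minority-class rows. This is plain random oversampling, a much simpler stand-in and not CTGAN, which instead learns a generative model of the table and samples new synthetic rows. The class labels and the `random_oversample` helper are hypothetical.

```python
import random
from collections import Counter
from typing import List, Sequence, Tuple


def random_oversample(rows: Sequence, labels: Sequence[str],
                      seed: int = 0) -> Tuple[List, List[str]]:
    """Duplicate minority-class rows until every class matches the largest one.

    Baseline illustration only: a generative approach such as CTGAN would
    create new rows rather than repeat existing ones.
    """
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, n in counts.items():
        pool = [r for r, lab in zip(rows, labels) if lab == cls]
        for _ in range(target - n):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Toy data with hypothetical treatment labels: 8 "no_action" vs 2 "overlay".
rows = list(range(10))
labels = ["no_action"] * 8 + ["overlay"] * 2
balanced_rows, balanced_labels = random_oversample(rows, labels)
balanced_counts = Counter(balanced_labels)
```

After balancing, both classes have eight rows. Because duplicated rows add no new information, this baseline can encourage overfitting on the minority class, which is one motivation for generative augmentation.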
Automation in Construction, Volume 183, Article 106796.
Citations: 0
Addressing data scarcity in construction safety monitoring using low-rank adaptation (LoRA)-tuned domain-specific image generation
IF 11.5 Zone 1 (Engineering & Technology) Q1 CONSTRUCTION & BUILDING TECHNOLOGY Pub Date: 2026-01-21 DOI: 10.1016/j.autcon.2026.106786
Insoo Jeong, Junghoon Kim, Seungmo Lim, Jeongbin Hwang, Seokho Chi
This paper proposes a lightweight domain adaptation framework for construction safety monitoring by fine-tuning a pretrained text-to-image diffusion model using Low-Rank Adaptation (LoRA). To simulate high-risk construction environments underrepresented in training data, the model was adapted to environmental features and specific hazards, focusing on visually dominant scenarios including falls, struck-by, and caught-in incidents. To address data scarcity, Multi-LoRA fine-tuning was conducted using 20 images per hazard type (totaling 60 across three hazards) and 30 background images, enabling both contextual and hazard-specific adaptation. The generated images achieved the highest semantic consistency, yielding the top mean Contrastive Language-Image Pre-training (CLIP) scores with minimal variance, and improved visual realism by reducing the Fréchet Inception Distance (FID) by 86.72 points. Furthermore, a YOLOv8 model trained exclusively on these synthetic images achieved a mean average precision (mAP@0.5:0.95) of 94.1% on real-world frames, comparable to a baseline model trained on real data.
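For readers unfamiliar with LoRA, the following minimal sketch shows the standard low-rank update, in which a frozen weight matrix W is adjusted by (alpha / r) * B A with small trainable factors B and A. It uses plain Python lists so it runs without dependencies. Combining several adapters by summing their updates is an assumption made here to illustrate the "Multi-LoRA" idea; the abstract does not say how the background and hazard adapters are combined, and all function names are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B, A, alpha):
    """Low-rank update (alpha / r) * B @ A, where r is the number of rows of A."""
    r = len(A)
    scale = alpha / r
    return [[scale * v for v in row] for row in matmul(B, A)]


def apply_adapters(W, adapters):
    """Add each adapter's low-rank update to a copy of the frozen weight W.

    `adapters` is a list of (B, A, alpha) tuples. Summing the updates is an
    assumed combination rule, used only for illustration.
    """
    out = [row[:] for row in W]
    for B, A, alpha in adapters:
        delta = lora_delta(B, A, alpha)
        out = [[w + d for w, d in zip(w_row, d_row)] for w_row, d_row in zip(out, delta)]
    return out


def trainable_params(d_out, d_in, r):
    """Parameters trained by a rank-r adapter on a d_out x d_in weight."""
    return r * (d_out + d_in)


if __name__ == "__main__":
    W = [[1, 0], [0, 1]]
    adapter = ([[1], [2]], [[3, 4]], 1)  # B is 2x1, A is 1x2, alpha = 1
    print(apply_adapters(W, [adapter]))
    print(trainable_params(768, 768, 4), "vs", 768 * 768)
```

The parameter count illustrates why the approach is described as lightweight: a rank-4 adapter on a 768 x 768 layer trains 6,144 values instead of 589,824. The layer size and rank are example figures, not values reported in the paper.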
Automation in Construction, Volume 183, Article 106786.
Citations: 0