Automation in Construction: Latest Articles

Graph-driven embedding reinforcement and traceable LLM agent for reliable element alignment in construction report generation
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-02-03 | DOI: 10.1016/j.autcon.2026.106816
Zhenzhao Xia, Botao Zhong, Shuai Zhang, Tonghui Zhao, Miroslaw J. Skibniewski
Engineering report generation from construction-site Internet of Things (IoT) data using large language models (LLMs) remains challenging due to hallucinations. Ensuring traceability and reliability in information retrieval and multi-step reasoning is essential within retrieval-augmented generation (RAG) for LLMs. This paper formalizes the RAG-LLM pipeline and proposes a dual-stream enhancement combining knowledge graph (KG) construction with reinforcement learning (RL)-based retriever tuning. The graph-guided module extracts structured engineering elements, while RL improves semantic alignment and tokenization of critical terms. Leveraging this dual-stream RAG, a traceable reporting agent is developed, providing end-to-end traceability of retrieval and reasoning, along with inter-step similarity measures. When working with existing on-site IoT systems, the agent can extend automated monitoring to decision-making support. This paper presents a reliable approach for construction report generation and advances human-AI collaboration in construction management.
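The abstract does not say how the inter-step similarity measures are computed. As a minimal sketch, assuming each retrieval or reasoning step is represented by an embedding vector and that similarity means cosine similarity (both are assumptions, and the function names and toy vectors are hypothetical), consecutive steps of a trace could be scored like this:

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def inter_step_similarities(step_embeddings):
    """Similarity between each step and the step that follows it."""
    pairs = zip(step_embeddings, step_embeddings[1:])
    return [cosine_similarity(a, b) for a, b in pairs]


# Hypothetical 3-dimensional embeddings for three steps of one trace.
trace = [
    [1.0, 0.0, 0.0],  # retrieved sensor record
    [1.0, 1.0, 0.0],  # intermediate reasoning step
    [0.0, 0.0, 1.0],  # final statement, unrelated to the previous step
]
scores = inter_step_similarities(trace)
```

A low score between two consecutive steps, such as the second value here, is the kind of signal a reviewer could use to flag a step that drifts away from its retrieved evidence.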
Automation in Construction, Volume 183, Article 106816. Citations: 0
Digital twins in offsite construction: Current implementations, challenges, and future pathways
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-02-02 | DOI: 10.1016/j.autcon.2026.106820
Nima Moghimi, Oldouz Arshang, Farook Hamzeh
Digital Twin (DT) technology is emerging as a critical enabler for Off-Site Construction (OSC). However, current research remains fragmented. This paper synthesizes 50 publications using a mixed-methods approach, combining scientometric mapping with systematic qualitative analysis. Scientometric results reveal a bifurcated landscape, distinctly separating volumetric “Modular Construction” (logistics-focused) from component-based “Prefabrication” (geometry-focused). While applications in scheduling and monitoring are growing, widespread adoption is hindered by “Black-Box” AI opacity, data sovereignty issues, and fragmented standards. Furthermore, sustainability remains an implicit rather than explicit goal. The study concludes with a Strategic Research Roadmap charting the path toward autonomous ecosystems. It emphasizes the need for Neuro-symbolic AI, Operator 5.0 frameworks, and Digital Product Passports to bridge the gap between static monitoring and true Cognitive Digital Twins in OSC.
Automation in Construction, Volume 183, Article 106820. Citations: 0
Computational Design Methods for geometry-driven upcycling of found objects in construction
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-02-04 | DOI: 10.1016/j.autcon.2026.106803
Qiming Sun, Dominik Reisach, Silke Langenberg, Benjamin Dillenburger
Computational Design Methods (CDMs) have increasingly supported the use of Found Objects (FOs) for circular construction. These methods automate the geometric assignment of FOs to a target design, yet a comprehensive overview is lacking. In this context, this paper systematically reviews 142 publications on CDMs for upcycling FOs in construction. It categorizes existing workflows and identifies six key CDMs based on assignment logic and four geometric FO types. The review serves as a roadmap for future research and practical applications, aiding architects and engineers in informed decision-making. It emphasizes the potential of utilizing FOs’ inherent geometry as design drivers for economical and aesthetic architectural solutions. This paper also identifies challenges in scaling CDMs from prototypes to practical applications, such as structural performance and integration with existing workflows. Future research directions include developing AI-based methods, automating construction processes using CDMs, and advocating for sensitivity analysis to assess adaptability across design scenarios.
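The abstract does not list the six CDMs it identifies. Purely to illustrate what "geometric assignment of FOs to a target design" can mean in the simplest one-dimensional case, the sketch below assigns reclaimed linear elements to required member lengths by brute force, minimizing total offcut. The function name, the lengths, and the waste criterion are all assumptions made for this example and are not taken from the review.

```python
from itertools import permutations


def assign_found_objects(stock_lengths, target_lengths):
    """Return (assignment, waste) minimizing the total offcut length.

    assignment[i] is the index of the stock element used for target i.
    Each stock element is used at most once and must be at least as long
    as its target. Returns (None, None) if no feasible assignment exists.
    Brute force: only practical for a handful of elements.
    """
    best_assignment, best_waste = None, None
    for combo in permutations(range(len(stock_lengths)), len(target_lengths)):
        pairs = list(zip(combo, target_lengths))
        if any(stock_lengths[s] < t for s, t in pairs):
            continue  # some element is too short for its target
        waste = sum(stock_lengths[s] - t for s, t in pairs)
        if best_waste is None or waste < best_waste:
            best_assignment, best_waste = combo, waste
    return best_assignment, best_waste


# Hypothetical lengths in millimetres.
stock = [3200, 2500, 4000, 2100]
targets = [2000, 3000, 2400]
assignment, waste = assign_found_objects(stock, targets)
```

Real workflows work with 2D and 3D geometry and far larger inventories, where exhaustive search is not an option and heuristic or optimization-based matching is needed.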
Automation in Construction, Volume 183, Article 106803. Citations: 0
GPS-free automated registration of UAV-captured façade image sequences to BIM using semantic key points
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-01-23 | DOI: 10.1016/j.autcon.2026.106788
Cong Chen, Shenghan Zhang
Unmanned Aerial Vehicles (UAVs) have emerged as essential tools for building façade inspection. Automatically registering UAV images to Building Information Modeling (BIM) models is important for building maintenance, but it remains challenging because façades contain repeating patterns. Existing methods often rely on GPS data, which lack sufficient accuracy in urban environments. This paper proposes a GPS-free automated framework to register UAV-captured image sequences to BIM models by leveraging information from overlapping images. The framework comprises three key components: (1) extracting semantic key points from images using Grounded SAM 2; (2) implementing a virtual UAV camera model to enable bidirectional projection of key points between BIM coordinates and image coordinates; and (3) developing a particle filter motion model to achieve image-to-BIM registration using image sequences. The proposed method registers various data types to BIM models, including overlapping visual image sequences, infrared (IR)-visual pairs, and façade defects.
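The abstract does not specify the virtual camera model. As a rough sketch of what bidirectional projection between BIM and image coordinates involves, the code below assumes an ideal pinhole camera without lens distortion; the intrinsics (`fx`, `fy`, `cx`, `cy`), the pose, and the function names are hypothetical. Going from a pixel back to 3D requires a depth, which in a façade setting might come from the distance to the wall plane (again an assumption).

```python
def mat_vec(m, v):
    """Multiply a 3x3 matrix (list of rows) by a 3-vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]


def transpose(m):
    return [[m[j][i] for j in range(3)] for i in range(3)]


def project(point_w, R, cam_pos, fx, fy, cx, cy):
    """World (BIM) point -> (u, v, depth) in the image."""
    d = [point_w[i] - cam_pos[i] for i in range(3)]
    x, y, z = mat_vec(R, d)  # world axes -> camera axes
    if z <= 0:
        raise ValueError("point is behind the camera")
    return fx * x / z + cx, fy * y / z + cy, z


def back_project(u, v, depth, R, cam_pos, fx, fy, cx, cy):
    """Pixel plus depth along the optical axis -> world (BIM) point."""
    p_cam = [(u - cx) * depth / fx, (v - cy) * depth / fy, depth]
    p = mat_vec(transpose(R), p_cam)  # inverse of a rotation is its transpose
    return [p[i] + cam_pos[i] for i in range(3)]


# Hypothetical setup: camera 10 m in front of the facade, looking along +z.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cam_pos = [0.0, 0.0, -10.0]
intrinsics = dict(fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)

u, v, depth = project([1.0, 2.0, 0.0], R, cam_pos, **intrinsics)
recovered = back_project(u, v, depth, R, cam_pos, **intrinsics)
```

In a particle-filter setting, each particle would carry a candidate camera pose, and a projection of this kind could be used to compare predicted key-point locations with detected ones.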
Automation in Construction, Volume 183, Article 106788. Citations: 0
Dependency-aware indoor 3D scene graph prediction via multimodal feature learning
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-02-03 | DOI: 10.1016/j.autcon.2026.106817
Shengnan Ke, Shibin Li, Jun Gong, Lingxiang Liu, Jianjun Luo, Bing Wang, Shengjun Tang
Accurate semantic understanding of indoor 3D point clouds is essential for constructing semantically rich architectural models and enabling component-level monitoring in smart building environments. This paper proposes a dependency-aware indoor 3D scene graph prediction framework that addresses two major limitations in existing methods. First, a Dependency-Aware Graph Reasoning Network (DAGRN) is introduced, integrating attention and message-passing mechanisms to learn context-dependent representations of objects and their relationships. Second, a multimodal feature-enhanced learning module is proposed to align point cloud and image features and to incorporate textual semantics from image–text models into a unified training scheme, with triplet-level constraints ensuring semantic consistency. Extensive experiments on the 3RScan dataset demonstrate that the proposed method significantly outperforms existing approaches, achieving a 3.95% improvement in overall prediction metrics and laying a foundation for advanced semantic modeling in building automation.
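The internals of DAGRN are not described in the abstract. To give a sense of what combining attention with message passing means on a scene graph, here is a generic single round of attention-weighted neighbour aggregation in plain Python. It is not the paper's network: the dot-product scoring, the residual update, and the toy node features are all assumptions.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_message_pass(features, neighbours):
    """One round: each node adds an attention-weighted mean of its neighbours."""
    updated = {}
    for node, h in features.items():
        nbrs = neighbours.get(node, [])
        if not nbrs:
            updated[node] = list(h)  # isolated node keeps its features
            continue
        # Attention score = dot product between the node and each neighbour.
        scores = [sum(a * b for a, b in zip(h, features[n])) for n in nbrs]
        weights = softmax(scores)
        message = [
            sum(w * features[n][k] for w, n in zip(weights, nbrs))
            for k in range(len(h))
        ]
        updated[node] = [a + m for a, m in zip(h, message)]  # residual update
    return updated


# Hypothetical 2-dimensional object features in a small indoor scene.
features = {"chair": [1.0, 0.0], "table": [0.0, 1.0], "floor": [0.0, 1.0]}
neighbours = {"chair": ["table", "floor"], "table": ["chair"], "floor": []}
updated = attention_message_pass(features, neighbours)
```

After the update, the representation of each object depends on its context, which is the property a relationship classifier over (subject, predicate, object) triplets would rely on.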
Automation in Construction, Volume 183, Article 106817. Citations: 0
Efficient UAV trajectory optimization for fine-detailed 3D building reconstruction
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-01-14 | DOI: 10.1016/j.autcon.2026.106775
Tianrui Shen, Lai Kang, Yingmei Wei, Shanshan Wan, Haixuan Wang, Chao Zuo
Using UAV-captured images for high-fidelity 3D building reconstruction is now a popular and effective practice in architectural engineering; however, planning a flight trajectory that maximizes reconstruction quality with minimal flight time remains a critical challenge. This paper proposes a universal co-optimization framework that bridges reconstruction objectives with flight dynamics through an integrated planning paradigm. The proposed approach performs initial flight planning by solving a Traveling Salesman Problem over candidate viewpoints and updating them according to the unit-length contribution criterion. An adaptive radius is then determined, and a sphere-based corridor is constructed so that the trajectory passes all updated viewpoints within the corresponding spatial tolerances. Next, an optimal control problem is formulated and solved using a nonlinear solver to obtain the final flight trajectory satisfying both dynamic and safety constraints. Experimental comparisons with state-of-the-art methods on three public scenes and two real scenes captured by the authors demonstrate that the proposed approach significantly improves flight efficiency, reducing travel distance and flight duration by approximately 10% to 40% with comparable or superior reconstruction quality.
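The abstract does not state which TSP solver is used for the initial plan. The sketch below shows one common heuristic combination, nearest-neighbour construction followed by 2-opt improvement, over a handful of hypothetical 3D viewpoints. It treats the tour as closed and uses straight-line distance; both are simplifying assumptions, and the result is a heuristic tour rather than a guaranteed optimum.

```python
import math


def tour_length(points, order):
    """Length of the closed tour that visits points in the given order."""
    n = len(order)
    return sum(
        math.dist(points[order[i]], points[order[(i + 1) % n]]) for i in range(n)
    )


def nearest_neighbour_tour(points, start=0):
    """Greedy construction: always fly to the closest unvisited viewpoint."""
    unvisited = set(range(len(points))) - {start}
    order = [start]
    while unvisited:
        last = points[order[-1]]
        # Ties are broken by the smaller index so the result is deterministic.
        nxt = min(unvisited, key=lambda i: (math.dist(last, points[i]), i))
        order.append(nxt)
        unvisited.remove(nxt)
    return order


def two_opt(points, order):
    """Reverse segments of the tour for as long as doing so shortens it."""
    best = order[:]
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                if tour_length(points, candidate) < tour_length(points, best) - 1e-12:
                    best = candidate
                    improved = True
    return best


# Hypothetical viewpoints at the corners of a 10 m square, 10 m above ground.
viewpoints = [(0, 0, 10), (10, 10, 10), (10, 0, 10), (0, 10, 10)]
greedy = nearest_neighbour_tour(viewpoints)
untangled = two_opt(viewpoints, [0, 1, 2, 3])  # start from a self-crossing tour
```

A viewpoint order of this kind only fixes the visiting sequence; the corridor construction and optimal control stages described in the abstract are what turn it into a flyable trajectory.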
Automation in Construction, Volume 183, Article 106775. Citations: 0
Construction site fall hazard identification and automated captioning using adapted vision-language models
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.autcon.2026.106790
Yongshuang Li, Feng Xu, Zhipeng Zhang, Xinyu Mei, He Huang
Falls are the primary safety hazard in construction. Traditional manual inspections are inefficient and error-prone, and existing computer vision methods generalize poorly to complex scenarios. This paper presents the Construction Safety Vision-Language Model (CS-VLM), a framework for construction site fall hazard identification and automated captioning, which integrates ModelScope Swift (MS-Swift) adapters and Low-Rank Adaptation (LoRA) technology for efficient fine-tuning of the Qwen2.5-7B-Instruct model. To support model training, a standardized image-text dataset for fall hazards is constructed using a Bidirectional Encoder Representations from Transformers (BERT)-based natural language conversion method. Experimental results demonstrate that CS-VLM achieves a Consensus-based Image Description Evaluation (CIDEr) score of 1.324, a Semantic Propositional Image Caption Evaluation (SPICE) score of 0.391, and a hazard identification F1-score of 90.2%, outperforming state-of-the-art methods in complex scenario adaptability while reducing computational costs. This research enables precise, standardized hazard description generation, facilitating proactive safety management and accident prevention in construction environments.
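To make the "efficient fine-tuning" claim concrete, the following sketch shows the general LoRA idea in plain Python: the frozen weight matrix W is left untouched and a scaled low-rank product B·A is added to its output. The matrix sizes, rank, and scaling value are hypothetical and are not the settings used for CS-VLM, which the abstract does not report.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lora_forward(W, A, B, x, alpha):
    """y = W x + (alpha / r) * B (A x), with A of shape r x k and B of shape d x r."""
    r = len(A)  # LoRA rank
    base = matvec(W, x)                 # frozen pretrained path
    low_rank = matvec(B, matvec(A, x))  # trainable low-rank path
    scale = alpha / r
    return [b + scale * l for b, l in zip(base, low_rank)]


def trainable_params(d, k, r):
    """Weights updated by full fine-tuning versus LoRA for one d x k matrix."""
    return {"full": d * k, "lora": r * (d + k)}


# Toy example: d = 2 outputs, k = 3 inputs, rank r = 1.
W = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
A = [[1.0, 1.0, 1.0]]
x = [1.0, 2.0, 3.0]

# B starts at zero in LoRA, so the adapted layer initially matches the base layer.
y_initial = lora_forward(W, A, [[0.0], [0.0]], x, alpha=2.0)
# After some (hypothetical) training, B is non-zero and the output shifts.
y_trained = lora_forward(W, A, [[0.5], [0.0]], x, alpha=2.0)

counts = trainable_params(d=4096, k=4096, r=8)
```

For the illustrative 4096 x 4096 layer with rank 8, LoRA trains 65,536 weights instead of 16,777,216, which is where the reduction in fine-tuning cost comes from.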
Automation in Construction, Volume 183, Article 106790. Citations: 0
Generic optimization of cross-layer pavement compaction quality using multi-domain intelligent compaction measurement values
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-01-19 | DOI: 10.1016/j.autcon.2026.106764
Xuefei Wang, Jiaxue Yuan, Jiale Li, Jianmin Zhang, Guowei Ma
As a critical indicator for evaluating road compaction quality, the Intelligent Compaction Measurement Value (ICMV) still suffers from significant scene dependency and the absence of a unified material-structure coupled evaluation framework, particularly in cross-layer compaction assessment. This paper develops a multi-domain analytical framework that integrates vibration signal time, frequency, and time-frequency features based on field data collected from typical road structures, including soil subgrade, cement-stabilized base layer, and asphalt layers. Rolling pass tracking, compactness prediction modeling, and Shapley additive explanations (SHAP) are employed to identify the generic ICMV applicable to pavement structural layers. Furthermore, comparative analyses are conducted to examine the numerical characteristics and vibration response behaviors of the generic ICMV across various structural layers. Finally, a statistically driven stepwise method is applied to determine the engineering ranges of the generic ICMV, thereby establishing a theoretical paradigm for multi-layer intelligent compaction standards and contributing to the digital transformation of pavement engineering.
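SHAP attributes a model's prediction to its input features using Shapley values. As a small illustration of the underlying quantity, the sketch below computes exact Shapley values by enumerating every feature ordering for a toy value function. The feature names and the value function are invented for this example and have nothing to do with the paper's data; practical SHAP implementations approximate this calculation because enumeration grows factorially with the number of features.

```python
from itertools import permutations


def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all orderings."""
    totals = {f: 0.0 for f in features}
    n_orderings = 0
    for ordering in permutations(features):
        n_orderings += 1
        coalition = frozenset()
        previous = value_fn(coalition)
        for f in ordering:
            coalition = coalition | {f}
            current = value_fn(coalition)
            totals[f] += current - previous  # marginal contribution of f
            previous = current
    return {f: total / n_orderings for f, total in totals.items()}


def toy_model(coalition):
    """Hypothetical predicted compactness gain from the available features."""
    value = 0.0
    if "amplitude" in coalition:
        value += 2.0
    if "harmonic_ratio" in coalition:
        value += 1.0
    if "amplitude" in coalition and "harmonic_ratio" in coalition:
        value += 3.0  # interaction term, shared equally by both features
    return value


features = ["amplitude", "harmonic_ratio", "noise"]
phi = shapley_values(toy_model, features)
```

The values sum to the difference between the prediction with all features and the prediction with none, and a feature that never changes the output, such as `noise` here, receives zero.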
Automation in Construction, Volume 183, Article 106764. Citations: 0
Distributed acoustic sensing-based real-time monitoring of far-field cracks in reinforced concrete bridge decks
IF 11.5 | Zone 1, Engineering & Technology | Q1, Construction & Building Technology | Pub Date: 2026-03-01 | Epub Date: 2026-02-06 | DOI: 10.1016/j.autcon.2026.106821
Yao Wang, Yi Bao
Monitoring cracks is critical for the safety and efficiency of the construction and operation of civil infrastructure. Distributed fiber optic sensors offer advantages for crack monitoring, but their applications are largely limited to near-field cracks. This paper presents an approach for in situ, real-time monitoring of far-field cracks using distributed acoustic sensing. The approach is developed through multi-physics modeling of a representative concrete highway bridge. The influence of key configuration parameters, including gauge length, channel spacing, and sampling rate, is evaluated for crack detection and localization. Results show that cracks located up to 6 m from a fiber optic cable are detected and localized with an average error of 0.94 m across 60 tests with varying crack scenarios and configurations. A cost-benefit analysis compares the proposed approach with state-of-the-art methods based on acoustic emission and distributed fiber optic sensing, demonstrating its benefits for far-field crack monitoring.
Automation in Construction, vol. 183, Article 106821.
Citations: 0
AI-driven decision support system for construction cost forecasting and consultation using optimized deep learning and language models
IF 11.5 Zone 1 Engineering Technology Q1 CONSTRUCTION & BUILDING TECHNOLOGY Pub Date : 2026-03-01 Epub Date: 2026-01-28 DOI: 10.1016/j.autcon.2026.106797
Jui-Sheng Chou, Mei-Yuan Lin, Nguyen-Ngan-Hanh Pham
Fluctuations in construction material prices significantly affect project budgets and bidding strategies via the Construction Cost Index (CCI). This paper develops an AI-driven decision-support system for construction cost forecasting and consultation, integrating deep learning and Large Language Models (LLMs) to enable intelligent CCI prediction. A multi-source data framework combines historical CCI records, macroeconomic indicators, and sentiment extracted from Traditional Chinese construction news. Time-series forecasting employs an Extended Long Short-Term Memory (xLSTM) network, while sentiment models are fine-tuned using Quantized Low-Rank Adaptation (QLoRA). Model hyperparameters for both the QLoRA-fine-tuned LLMs and the xLSTM forecasting models are optimized via the Pilgrimage Walk Optimization (PWO) algorithm, yielding two horizon-specific configurations for short- and medium-term forecasting. Experimental results demonstrate that integrating sentiment features and PWO-based tuning consistently improves forecasting accuracy relative to baseline models. The deployed platform integrates CCI forecasting, sentiment analytics, and retrieval-augmented consultation to provide interpretable forecasts that enhance cost control and decision-making in construction management.
Automation in Construction, vol. 183, Article 106797.
Citations: 0