An integrated approach for automatic safety inspection in construction: Domain knowledge with multimodal large language model

IF 9.9 | Zone 1 (Engineering & Technology) | JCR Q1: Computer Science, Artificial Intelligence | Advanced Engineering Informatics | Pub Date: 2025-03-13 | DOI: 10.1016/j.aei.2025.103246
Yiheng Wang, Hanbin Luo, Weili Fang
Advanced Engineering Informatics, vol. 65, Article 103246. Journal Article. https://www.sciencedirect.com/science/article/pii/S1474034625001399
Citations: 0

Abstract

This research addresses the challenge of dynamically integrating visual and textual data in construction safety inspections while improving adaptability to new safety hazards and ensuring faithful interpretation of safety rules. We propose an approach that combines multi-modal techniques with domain knowledge, going beyond current methods, which often struggle with multi-modal understanding and with adapting to new safety hazards. The approach has three components: (1) a fine-tuned multi-modal large language model (LLM) for visual and textual processing, (2) a domain knowledge base that supports adaptability to evolving safety standards and output faithfulness, and (3) a multi-step reasoning engine for complex safety inspection tasks. We validate the approach on on-site data from Wuhan subway construction sites, where it produces safety assessments that are moderately accurate (0.57 hazard identification correctness), contextually relevant (0.96 on task relevancy), and faithful (0.95 and 0.99 on reasoning faithfulness). The results suggest promising performance in construction scene perception as well as in textual analysis and reasoning. The approach advances automatic construction safety inspection and contributes to the broader discourse on formalizing multi-modal processing of construction data, offering insights into building more flexible and comprehensive safety management systems.
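To make the three-component structure concrete, here is a minimal Python sketch of how perception, a rule knowledge base, and step-by-step reasoning might fit together. It is an illustration under stated assumptions, not the authors' implementation: the names `SafetyRule`, `perceive`, and `inspect`, the two example rules, and the label-matching logic are all hypothetical, and `perceive` is a canned stand-in for the fine-tuned multi-modal LLM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyRule:
    """Hypothetical knowledge-base entry (not taken from the paper)."""
    rule_id: str
    text: str       # rule wording, quoted back in every finding
    trigger: str    # scene label that makes the rule applicable
    required: str   # scene label that must be present to comply


# Illustrative rules only; not drawn from any real safety standard.
KNOWLEDGE_BASE = [
    SafetyRule("R1", "Workers on site must wear a helmet.", "worker", "helmet"),
    SafetyRule("R2", "Open edges must have a guardrail.", "open_edge", "guardrail"),
]


def perceive(image_path: str) -> set:
    """Stand-in for the multi-modal LLM: returns canned scene labels.

    A real system would analyse the image; this stub ignores image_path.
    """
    return {"worker", "open_edge", "guardrail"}


def inspect(image_path: str, kb: list) -> list:
    """Three explicit steps: perceive, retrieve applicable rules, check each."""
    scene = perceive(image_path)                           # step 1: perception
    applicable = [r for r in kb if r.trigger in scene]     # step 2: retrieval
    return [                                               # step 3: rule check
        {
            "rule_id": r.rule_id,
            "rule_text": r.text,
            "violated": r.required not in scene,
        }
        for r in applicable
    ]
```

Calling `inspect("site.jpg", KNOWLEDGE_BASE)` on the canned scene flags R1 (a worker but no helmet) and clears R2 (the open edge has a guardrail). Two design points mirror the abstract's claims in a simplified way: appending a new `SafetyRule` to the list extends coverage without retraining the perception model, and each finding carries the rule text it was derived from, so a reader can check the conclusion against the rule.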
Source journal
Advanced Engineering Informatics (Engineering & Technology: Engineering, Multidisciplinary)
CiteScore: 12.40
Self-citation rate: 18.20%
Articles published: 292
Review time: 45 days
About the journal: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus, and INSPEC.