An integrated approach for automatic safety inspection in construction: Domain knowledge with multimodal large language model
Authors: Yiheng Wang, Hanbin Luo, Weili Fang
Journal: Advanced Engineering Informatics, volume 65, Article 103246 (impact factor 9.9; JCR Q1, Computer Science, Artificial Intelligence)
Publication date: 2025-03-13
Publication type: Journal Article
DOI: 10.1016/j.aei.2025.103246
URL: https://www.sciencedirect.com/science/article/pii/S1474034625001399
Citations: 0
Abstract
This research addresses the challenge of dynamically integrating visual and textual data in construction safety inspections while improving adaptability to new safety hazards and ensuring faithful interpretation of safety rules. We propose an approach that combines multi-modal techniques with domain knowledge, advancing beyond current methods that often struggle with multi-modal understanding and adaptation to new safety hazards. The approach consists of three key components: (1) a fine-tuned multi-modal LLM for visual and textual processing, (2) a domain knowledge base that supports adaptation to evolving safety standards and output faithfulness, and (3) a multi-step reasoning engine for complex safety inspection tasks. We validate the approach on on-site data from Wuhan subway construction sites, where it produces safety assessments that are moderately accurate (0.57 hazard identification correctness), contextually relevant (0.96 on task relevancy), and faithful (0.95 and 0.99 on reasoning faithfulness). The results suggest promising performance in construction scene perception as well as in textual analysis and reasoning. The approach advances automatic construction safety inspection and contributes to the broader discourse on formalizing multi-modal processing of construction data, offering insights into creating more flexible and comprehensive safety management systems.
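The three components named in the abstract can be pictured as a small pipeline. The Python sketch below is only an illustration under strong assumptions: the abstract gives no implementation details, so every name here (`SafetyRule`, `KNOWLEDGE_BASE`, `perceive`, `retrieve_rules`, `inspect`), the two example rules, and the canned scene tags are hypothetical. A lookup table stands in for the fine-tuned multi-modal LLM, and simple keyword matching stands in for the knowledge base query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SafetyRule:
    rule_id: str
    keywords: frozenset  # scene tags that must all be present for the rule to apply
    text: str


# Hypothetical rules for illustration only; not taken from the paper's knowledge base.
KNOWLEDGE_BASE = [
    SafetyRule("R1", frozenset({"worker", "no_helmet"}),
               "Workers must wear a helmet on site."),
    SafetyRule("R2", frozenset({"edge", "no_guardrail"}),
               "Open edges must have guardrails."),
]


def perceive(image_path: str) -> set:
    """Stand-in for the multi-modal LLM: map an image to a set of scene tags."""
    # Assumption: a real system would call a vision-language model here.
    canned = {"site_a.jpg": {"worker", "no_helmet", "excavator"}}
    return canned.get(image_path, set())


def retrieve_rules(tags: set) -> list:
    """Return the rules whose keywords are all contained in the scene tags."""
    return [rule for rule in KNOWLEDGE_BASE if rule.keywords <= tags]


def inspect(image_path: str) -> list:
    """Chain the steps: perceive the scene, retrieve rules, report findings."""
    tags = perceive(image_path)
    rules = retrieve_rules(tags)
    # Each finding quotes the rule it relies on, so the report stays
    # traceable to the knowledge base rather than to free-form generation.
    return [f"{rule.rule_id}: {rule.text}" for rule in rules]


if __name__ == "__main__":
    for finding in inspect("site_a.jpg"):
        print(finding)
```

In this sketch, adding a new hazard means appending a rule to `KNOWLEDGE_BASE` without touching the perception step, which is one plausible reading of the adaptability the abstract describes.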
About the journal:
Advanced Engineering Informatics is an international Journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The Journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.