Xingming Liao , Chong Chen , Zhuowei Wang , Ying Liu , Tao Wang , Lianglun Cheng
{"title":"Large language model assisted fine-grained knowledge graph construction for robotic fault diagnosis","authors":"Xingming Liao , Chong Chen , Zhuowei Wang , Ying Liu , Tao Wang , Lianglun Cheng","doi":"10.1016/j.aei.2025.103134","DOIUrl":null,"url":null,"abstract":"<div><div>With the rapid deployment of industrial robots in manufacturing, the demand for advanced maintenance techniques to sustain operational efficiency has become crucial. Fault diagnosis Knowledge Graph (KG) is essential as it interlinks multi-source data related to industrial robot faults, capturing multi-level semantic associations among different fault events. However, the construction and application of fine-grained fault diagnosis KG face significant challenges due to the inherent complexity of nested entities in maintenance texts and the severe scarcity of annotated industrial data. In this study, we propose a Large Language Model (LLM) assisted data augmentation approach, which handles the complex nested entities in maintenance corpora and constructs a more fine-grained fault diagnosis KG. Firstly, the fine-grained ontology is constructed via LLM Assistance in Industrial Nested Named Entity Recognition (assInNNER). Then, an Industrial Nested Label Classification Template (INCT) is designed, enabling the use of nested entities in Attention-map aware keyword selection for the Industrial Nested Language Model (ANLM) data augmentation methods. ANLM can effectively improve the model’s performance in nested entity extraction when corpora are scarce. Subsequently, a Confidence Filtering Mechanism (CFM) is introduced to evaluate and select the generated data for enhancement, and assInNNER is further deployed to recall the negative samples corpus again to further improve performance. Experimental studies based on multi-source corpora demonstrate that compared to existing algorithms, our method achieves an average F1 increase of 8.25 %, 3.31 %, and 1.96 % in 5%, 10 %, and 25 % in few-shot settings, respectively.</div></div>","PeriodicalId":50941,"journal":{"name":"Advanced Engineering Informatics","volume":"65 ","pages":"Article 103134"},"PeriodicalIF":8.0000,"publicationDate":"2025-01-23","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Advanced Engineering Informatics","FirstCategoryId":"5","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S1474034625000278","RegionNum":1,"RegionCategory":"工程技术","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"COMPUTER SCIENCE, ARTIFICIAL INTELLIGENCE","Score":null,"Total":0}
引用次数: 0
Abstract
With the rapid deployment of industrial robots in manufacturing, the demand for advanced maintenance techniques to sustain operational efficiency has become crucial. Fault diagnosis Knowledge Graph (KG) is essential as it interlinks multi-source data related to industrial robot faults, capturing multi-level semantic associations among different fault events. However, the construction and application of fine-grained fault diagnosis KG face significant challenges due to the inherent complexity of nested entities in maintenance texts and the severe scarcity of annotated industrial data. In this study, we propose a Large Language Model (LLM) assisted data augmentation approach, which handles the complex nested entities in maintenance corpora and constructs a more fine-grained fault diagnosis KG. Firstly, the fine-grained ontology is constructed via LLM Assistance in Industrial Nested Named Entity Recognition (assInNNER). Then, an Industrial Nested Label Classification Template (INCT) is designed, enabling the use of nested entities in Attention-map aware keyword selection for the Industrial Nested Language Model (ANLM) data augmentation methods. ANLM can effectively improve the model’s performance in nested entity extraction when corpora are scarce. Subsequently, a Confidence Filtering Mechanism (CFM) is introduced to evaluate and select the generated data for enhancement, and assInNNER is further deployed to recall the negative samples corpus again to further improve performance. Experimental studies based on multi-source corpora demonstrate that compared to existing algorithms, our method achieves an average F1 increase of 8.25 %, 3.31 %, and 1.96 % in 5%, 10 %, and 25 % in few-shot settings, respectively.
期刊介绍:
Advanced Engineering Informatics is an international Journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The Journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus and INSPEC.