Empowering LLMs by hybrid retrieval-augmented generation for domain-centric Q&A in smart manufacturing

IF 8 | CAS Zone 1 (Engineering & Technology) | JCR Q1, Computer Science, Artificial Intelligence | Advanced Engineering Informatics | Pub Date: 2025-02-22 | DOI: 10.1016/j.aei.2025.103212

Yuwei Wan, Zheyuan Chen, Ying Liu, Chong Chen, Michael Packianather

Advanced Engineering Informatics, Volume 65, Article 103212. Full text: https://www.sciencedirect.com/science/article/pii/S1474034625001053

Citations: 0

Abstract

Large language models (LLMs) have shown remarkable performance in generic question answering (QA) but often suffer from domain gaps and outdated knowledge in smart manufacturing (SM). Retrieval-augmented generation (RAG), which grounds an LLM in an external knowledge base, has emerged as a potential remedy. Conventional vector-based RAG responds quickly but often returns contextually vague results, while knowledge graph (KG)-based methods offer structured relational reasoning at the expense of scalability and efficiency. To address these challenges, this work proposes a hybrid KG-Vector RAG framework that systematically integrates structured KG metadata with unstructured vector retrieval. First, a metadata-enriched KG was constructed from domain corpora by extracting and indexing structured information to capture essential domain-specific relationships. Second, semantic alignment was achieved by injecting domain-specific constraints to refine the contextual relevance of the knowledge representations. Finally, a layered hybrid retrieval strategy combined the explicit reasoning capabilities of the KG with the efficient search of vector-based similarity methods, and the retrieved outputs were integrated via prompt engineering to generate comprehensive, context-aware responses. Evaluated on design for additive manufacturing (DfAM) tasks, the proposed approach achieved 77.8% exact match accuracy and 76.5% context precision. This study establishes a new paradigm for industrial LLM systems, demonstrating that hybrid symbolic-neural architectures can overcome the precision-scalability trade-off in mission-critical manufacturing applications. The experimental results indicate that integrating structured KG information with vector-based retrieval and prompt engineering can enhance retrieval accuracy, contextual relevance, and efficiency in LLM-based QA systems for SM.
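The layered retrieval idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy chunks, the `KG` adjacency map, the bag-of-words similarity standing in for learned embeddings, and the function names `kg_candidates`, `hybrid_retrieve`, and `build_prompt` are all assumptions made for illustration. The sketch first narrows the search space with KG entity links, then ranks the surviving chunks by vector similarity, and finally assembles a prompt.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus: made-up sentences, each tagged with KG entities.
CHUNKS = [
    {"id": "c1", "text": "Overhang angles below 45 degrees usually need support structures.",
     "entities": {"overhang", "support structure"}},
    {"id": "c2", "text": "Support structures increase post-processing time and material use.",
     "entities": {"support structure", "post-processing"}},
    {"id": "c3", "text": "Lattice infill reduces part weight while keeping stiffness.",
     "entities": {"lattice", "weight"}},
]

# Hypothetical knowledge graph as an adjacency map: entity -> related entities.
KG = {
    "overhang": {"support structure"},
    "support structure": {"post-processing"},
    "post-processing": set(),
    "lattice": {"weight"},
    "weight": set(),
}

def bow(text):
    """Bag-of-words vector; a stand-in for a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def kg_candidates(query, hops=1):
    """Layer 1: match entities named in the query, expand along KG edges,
    and keep only chunks tagged with a reached entity."""
    q = query.lower()
    seeds = {e for e in KG if e in q}
    reached, frontier = set(seeds), set(seeds)
    for _ in range(hops):
        frontier = {n for e in frontier for n in KG.get(e, set())} - reached
        reached |= frontier
    return [c for c in CHUNKS if c["entities"] & reached]

def hybrid_retrieve(query, k=2):
    """Layer 2: rank KG-filtered chunks by similarity to the query.
    Falls back to the whole corpus when the KG matches nothing."""
    pool = kg_candidates(query) or CHUNKS
    qv = bow(query)
    ranked = sorted(pool, key=lambda c: cosine(qv, bow(c["text"])), reverse=True)
    return ranked[:k]

def build_prompt(query, chunks):
    """Fold the retrieved context into a single prompt string for an LLM."""
    context = "\n".join(f"- {c['text']}" for c in chunks)
    return f"Answer using only the context below.\nContext:\n{context}\nQuestion: {query}"

if __name__ == "__main__":
    question = "When does an overhang need extra material?"
    print(build_prompt(question, hybrid_retrieve(question)))
```

Filtering before ranking is one way to read "layered": the graph step trims unrelated chunks cheaply, so the similarity step compares the query only against passages that are already topically linked. A production system would replace the substring entity matcher and the count vectors with an entity linker and an embedding model.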


Source journal

Advanced Engineering Informatics (Engineering & Technology: Engineering, Multidisciplinary)

CiteScore: 12.40
Self-citation rate: 18.20%
Articles published: 292
Review time: 45 days

About the journal: Advanced Engineering Informatics is an international journal that solicits research papers with an emphasis on 'knowledge' and 'engineering applications'. The journal seeks original papers that report progress in applying methods of engineering informatics. These papers should have engineering relevance and help provide a scientific base for more reliable, spontaneous, and creative engineering decision-making. Additionally, papers should demonstrate the science of supporting knowledge-intensive engineering tasks and validate the generality, power, and scalability of new methods through rigorous evaluation, preferably both qualitatively and quantitatively. Abstracting and indexing for Advanced Engineering Informatics include Science Citation Index Expanded, Scopus, and INSPEC.