Objective
This study aimed to develop a Nursing Retrieval-Augmented Generation (NurRAG) system based on large language models (LLMs) and to evaluate its accuracy and clinical applicability in nursing question answering.
Methods
A multidisciplinary team of nursing experts, artificial intelligence researchers, and information engineers collaboratively designed the NurRAG framework following the principles of retrieval-augmented generation. The system comprised four functional modules: 1) construction of a nursing knowledge base through document normalization, embedding, and vector indexing; 2) nursing question filtering using a supervised classifier; 3) semantic retrieval and re-ranking for evidence selection; and 4) evidence-conditioned language model generation to produce citation-based nursing answers. The system was securely deployed on hospital intranet servers using Docker containers. Performance was evaluated on 1,000 expert-verified nursing question–answer pairs. Semantic fidelity was assessed using Recall-Oriented Understudy for Gisting Evaluation, Longest Common Subsequence variant (ROUGE-L), and clinical correctness was measured using accuracy.
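For readers unfamiliar with the metric, ROUGE-L scores a generated answer by the longest common subsequence (LCS) it shares with a reference answer. The following is a minimal sketch of the standard LCS-based F-measure, not the study's evaluation code: the function names, the whitespace tokenization, and the default `beta=1.0` are assumptions made for illustration, and the abstract does not state which tokenizer or weighting was used.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two sequences."""
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def rouge_l(candidate, reference, beta=1.0):
    """ROUGE-L F-measure between a candidate and a reference answer.

    Assumption: tokens are lowercase whitespace-separated words; a
    different tokenizer (e.g. character-level) would change the scores.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0
    lcs = lcs_length(cand, ref)
    if lcs == 0:
        return 0.0
    precision = lcs / len(cand)  # share of the candidate that is matched
    recall = lcs / len(ref)      # share of the reference that is recovered
    return (1 + beta**2) * precision * recall / (recall + beta**2 * precision)


if __name__ == "__main__":
    # Hypothetical answer pair, for illustration only.
    generated = "turn the patient every two hours"
    expert = "reposition the patient every two hours"
    print(round(rouge_l(generated, expert), 4))
```

Because the score depends on word order as well as word overlap, it rewards answers that preserve the phrasing of the expert reference, which is why it serves here as a measure of semantic fidelity rather than clinical correctness.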
Results
The NurRAG system achieved significant improvements in both semantic fidelity and answer accuracy compared with conventional LLMs. For ChatGLM2-6B, ROUGE-L increased from (30.73 ± 1.48)% to (64.27 ± 0.27)%, and accuracy increased from (49.08 ± 0.92)% to (75.83 ± 0.35)%. For LLaMA2-7B, ROUGE-L increased from (28.76 ± 0.89)% to (60.33 ± 0.21)%, and accuracy increased from (43.27 ± 0.83)% to (73.29 ± 0.33)%. All differences were statistically significant (P < 0.001). A quantitative case analysis further demonstrated that NurRAG effectively reduced hallucinated outputs and generated evidence-based, guideline-concordant nursing responses.
Conclusion
The NurRAG system integrates domain-specific retrieval with LLM generation to provide accurate, reliable, and traceable evidence-based nursing answers. The findings demonstrate the system's feasibility and its potential to improve the accuracy of clinical knowledge access, support evidence-based nursing decision-making, and promote the safe application of artificial intelligence in nursing practice.
