Objective
Given the substantial advancements in Large Language Models (LLMs), this study aimed to explore the effectiveness of AI-generated medical diagnoses for fine-tuning the Llama-2-13B model, with the objective of optimizing the ICD10 coding process. The diagnostic texts were generated from ICD10 descriptors, and gynecologic oncology served as the domain for initial validation.
Materials and methods
AI-generated diagnostic texts were rigorously verified to ensure medical coherence and reliability for fine-tuning. Four models were established: the original Llama-2-13B (Model 1); a model fine-tuned with basic ICD10 codes (Model 2); a model trained with an additional set of 10 AI-generated diagnosis statements per ICD10 code (Model 3); and a fourth model trained with an additional set of 20 AI-generated statements per code (Model 4). Validation used a set of 83 discharge records related to gynecologic oncology, derived from 2415 discharge records collected between January 1, 2020, and June 30, 2023.
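As a minimal sketch of how the training sets for Models 2 to 4 could be assembled, the following assumes each training record pairs a text with its ICD10 code, and that Models 3 and 4 add the first 10 or 20 generated statements per code to the basic descriptors. The record layout, the function name `build_training_set`, and the placeholder statements are hypothetical and are not taken from the study.

```python
def build_training_set(descriptors, generated, n_statements):
    """Combine basic ICD10 descriptors with up to n_statements generated texts per code.

    descriptors:  dict mapping ICD10 code -> official descriptor text
    generated:    dict mapping ICD10 code -> list of AI-generated diagnosis statements
    n_statements: 0 for a Model 2 style set, 10 for Model 3, 20 for Model 4 (assumed)
    """
    records = []
    for code, descriptor in descriptors.items():
        # The basic code/descriptor pair is always included.
        records.append({"text": descriptor, "label": code})
        # Add the requested number of generated statements, if available.
        for statement in generated.get(code, [])[:n_statements]:
            records.append({"text": statement, "label": code})
    return records


# Illustrative inputs only: two codes and placeholder statements.
descriptors = {
    "C53.9": "Malignant neoplasm of cervix uteri, unspecified",
    "C56": "Malignant neoplasm of ovary",
}
generated = {
    code: [f"placeholder statement {i} for {code}" for i in range(1, 21)]
    for code in descriptors
}

model2_set = build_training_set(descriptors, generated, 0)
model3_set = build_training_set(descriptors, generated, 10)
model4_set = build_training_set(descriptors, generated, 20)
print(len(model2_set), len(model3_set), len(model4_set))
```

Keeping the number of statements as a parameter makes the three fine-tuned configurations differ only in the amount of generated text per code, which is the variable the comparison is meant to isolate.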
Results
Accuracy and Kappa scores improved progressively across the models: Model 1 (native Llama-2-13B) had an accuracy of 0.06 and a Kappa score of 0.04; Model 2 achieved 0.24 and 0.19; Model 3 reached 0.90 and 0.89; and Model 4 reached 0.95 and 0.94.
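For readers unfamiliar with the two metrics, the sketch below shows one common way to compute accuracy and Cohen's Kappa from reference and predicted codes. It assumes the reported Kappa is Cohen's Kappa between the model output and the reference coding, which the abstract does not state explicitly; the four example records are illustrative and are not study data.

```python
from collections import Counter


def accuracy(y_true, y_pred):
    """Fraction of records whose predicted code equals the reference code."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def cohen_kappa(y_true, y_pred):
    """Cohen's Kappa: observed agreement corrected for chance agreement."""
    n = len(y_true)
    p_observed = accuracy(y_true, y_pred)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    # Chance agreement: sum over codes of P(reference = c) * P(prediction = c).
    p_expected = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_expected == 1.0:
        # Both sides use a single identical code; Kappa is undefined, report 1.0.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)


# Illustrative example with four records (not study data).
reference = ["C53.9", "C54.1", "C56", "C53.9"]
predicted = ["C53.9", "C54.1", "C53.9", "C53.9"]
acc = accuracy(reference, predicted)
kappa = cohen_kappa(reference, predicted)
print(acc, kappa)
```

Kappa sits below accuracy whenever some agreement would be expected by chance, which is consistent with each reported Kappa score being slightly lower than the corresponding accuracy.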
Conclusion
The use of prompts to generate diagnostic descriptions, coupled with AI-generated data for model fine-tuning, resulted in a substantial enhancement in the Llama-2-13B model’s capability to accurately determine ICD diagnostic codes from medical records. This methodology offers a cost-effective strategy, optimizes model accuracy, and underscores the potential for broader applications due to the LLM’s generative capabilities.
