Introduction: Androgenetic alopecia (AGA) is a common cause of hair loss worldwide. Accurate patient education may improve treatment adherence and outcomes.
Objective: To compare the accuracy, readability, and user experience of ChatGPT 4.0, Gemini 1.5 Flash, and Deepseek R1 in answering common patient questions about AGA.
Methods: In February 2025, a cross-sectional study was conducted using 12 frequently asked patient questions on AGA, sourced from online platforms. The questions were submitted to ChatGPT 4.0, Gemini 1.5 Flash, and Deepseek R1. Two dermatologists independently assessed each response using a validated 4-point accuracy scale. Readability was measured with the Flesch-Kincaid Grade Level and Flesch Reading Ease Score. User experience was evaluated on the basis of response speed, presence of visual aids, citation usage, and overall satisfaction. Inter-rater reliability was analyzed via Cohen's kappa, and statistical comparisons were made among the three models.
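For reference, the standard definitions of these measures are given below; the abstract does not reproduce them, so the study is assumed to have used the conventional formulations.

$$\text{FRES} = 206.835 - 1.015\left(\frac{\text{total words}}{\text{total sentences}}\right) - 84.6\left(\frac{\text{total syllables}}{\text{total words}}\right)$$

$$\text{FKGL} = 0.39\left(\frac{\text{total words}}{\text{total sentences}}\right) + 11.8\left(\frac{\text{total syllables}}{\text{total words}}\right) - 15.59$$

$$\kappa = \frac{p_o - p_e}{1 - p_e}$$

where $p_o$ is the observed inter-rater agreement and $p_e$ is the agreement expected by chance. Higher FRES values indicate easier text, while FKGL approximates the US school grade needed to comprehend it.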
Results: ChatGPT 4.0 and Gemini 1.5 Flash successfully answered all 12 questions, with most responses rated as "satisfactory with minimal corrections." Deepseek R1 answered only 5 of the 12 questions and frequently provided inaccurate content, especially when differentiating between AGA and cicatricial alopecia; it also lacked warnings about potential misinformation. Gemini 1.5 Flash included visual aids and citations, improving interpretability. All models generated responses at a high school reading level. On user experience measures, ChatGPT 4.0 and Gemini 1.5 Flash outperformed Deepseek R1.
Conclusions: ChatGPT 4.0 and Gemini 1.5 Flash provided accurate, readable, and user-friendly responses on AGA-related questions, making them promising tools for patient education under physician guidance. Deepseek R1's limitations highlight the need for cautious implementation.