Background: The performance of GPT-4 in nursing examinations within the Chinese context has not yet been thoroughly evaluated.
Objective: To assess the performance of GPT-4 on multiple-choice and open-ended questions derived from nursing examinations in the Chinese context.
Methods: Data sets from the Chinese National Nursing Licensure Examination from 2021 to 2023 were used to evaluate the accuracy of GPT-4 on multiple-choice questions. The performance of GPT-4 on open-ended questions was examined using 18 case-based questions.
Results: For multiple-choice questions, GPT-4 achieved an accuracy of 71.0% (511/720). For open-ended questions, the responses were evaluated for cosine similarity, logical consistency, and information quality; all three were at a moderate level.
Conclusion: GPT-4 performed well on questions testing basic knowledge. However, it has notable limitations in answering open-ended questions. Nursing educators should weigh the benefits and challenges of GPT-4 before integrating it into nursing education.
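
As a rough illustration of the two quantitative measures named in the Results, the sketch below computes the reported accuracy from the raw counts and a cosine similarity between two texts. The function names are hypothetical, and the use of simple whitespace-tokenized term-frequency vectors is an assumption made for illustration only; the abstract does not state how the responses were vectorized, so this should not be read as the study's actual method.

```python
import math
from collections import Counter


def accuracy(correct: int, total: int) -> float:
    """Return the percentage of correctly answered questions."""
    return 100.0 * correct / total


def cosine_similarity(text_a: str, text_b: str) -> float:
    """Cosine similarity between term-frequency vectors of two texts.

    Assumption: whitespace tokenization and raw term counts; the study
    may have used a different representation (for example, embeddings).
    """
    vec_a = Counter(text_a.lower().split())
    vec_b = Counter(text_b.lower().split())
    # Dot product over the terms the two texts share.
    dot = sum(vec_a[t] * vec_b[t] for t in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(c * c for c in vec_a.values()))
    norm_b = math.sqrt(sum(c * c for c in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text has no direction to compare.
    return dot / (norm_a * norm_b)


# Counts reported in the Results: 511 correct out of 720 questions.
print(f"{accuracy(511, 720):.1f}%")
```

With the reported counts, 511/720 rounds to the stated 71.0%. The similarity function returns a value between 0 (no shared terms) and 1 (identical term distributions) for these non-negative count vectors.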