Xiao-Wen Tan, Wen-Fang Chen, Na-Na Wang, Hui-Yu Li, Juan Li, Yu-Mei Cao, Meng-Qi Zhu, Kun Li, Ting-Ling Zhang, Dian Fu
National Journal of Andrology (中华男科学杂志), vol. 30, no. 2, pp. 151-156. Published 2024-02-01. Journal Article.
[Efficiency of different large language models in China in response to consultations about PCa-related perioperative nursing and health education].
Objective: To evaluate how effectively four Chinese large language models (ERNIE Bot, ChatGLM2, Spark Desk, and Qwen-14B-Chat), each with a large user base and considerable public attention, respond to consultations about perioperative nursing and health education related to prostate cancer (PCa).
Methods: We designed a questionnaire comprising 15 questions commonly raised by patients undergoing radical prostatectomy and 2 common nursing cases, and entered the questions into each of the four language models as a simulated consultation. Three nursing experts rated the model responses on a pre-designed 5-point Likert scale for accuracy, comprehensiveness, understandability, humanistic care, and case analysis. We evaluated and compared the performance of the four models using visualization tools and statistical analyses.
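As an illustration only, the rating procedure described in the Methods could be tabulated along the following lines. This is a minimal sketch, not the authors' analysis: the function name, the tuple layout, and the placeholder model labels are hypothetical, and the abstract does not say which statistical tests or visualization tools were used.

```python
from statistics import mean

# The five rating dimensions named in the Methods.
DIMENSIONS = (
    "accuracy",
    "comprehensiveness",
    "understandability",
    "humanistic_care",
    "case_analysis",
)


def summarize_ratings(ratings):
    """Average 5-point Likert scores per model and dimension.

    `ratings` is an iterable of (model, dimension, rater, score) tuples.
    This layout is an assumption made for the sketch.
    Returns {model: {dimension: mean score across raters and questions}}.
    """
    grouped = {}
    for model, dimension, rater, score in ratings:
        if dimension not in DIMENSIONS:
            raise ValueError(f"unknown dimension: {dimension}")
        if not 1 <= score <= 5:
            raise ValueError(f"score out of Likert range 1-5: {score}")
        grouped.setdefault(model, {}).setdefault(dimension, []).append(score)
    return {
        model: {dim: mean(scores) for dim, scores in dims.items()}
        for model, dims in grouped.items()
    }


if __name__ == "__main__":
    # Toy values for demonstration only; these are NOT data from the study.
    toy = [
        ("model_a", "accuracy", "rater_1", 4),
        ("model_a", "accuracy", "rater_2", 5),
        ("model_a", "accuracy", "rater_3", 3),
    ]
    print(summarize_ratings(toy))
```

Per-model means like these would only be a starting point; comparing models rigorously would also require a significance test suited to ordinal ratings.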
Results: All four models generated high-quality text with no misleading information and performed satisfactorily. Qwen-14B-Chat scored highest on every dimension and produced more stable outputs than ChatGLM2 across repeated tests. Spark Desk performed well on understandability but was weaker on comprehensiveness and humanistic care. Both Qwen-14B-Chat and ChatGLM2 performed excellently in case analysis. The overall performance of ERNIE Bot was slightly inferior. Overall, Qwen-14B-Chat outperformed the other three models in consultations about PCa-related perioperative nursing and health education.
Conclusion: In PCa-related perioperative nursing, large language models such as Qwen-14B-Chat are expected to become powerful auxiliary tools that give patients more medical expertise and information support, thereby improving patient compliance and the quality of clinical treatment and nursing.
About the journal:
The National Journal of Andrology was founded in June 1995. It is a core journal of andrology and reproductive medicine, published monthly and distributed publicly in China and abroad. Its main columns include expert talks, monographs (basic research, clinical research, evidence-based medicine, traditional Chinese medicine), reviews, clinical experience exchanges, and case reports. Priority is given to work from funded projects, especially those supported by the 12th Five-Year National Support Plan and the National Natural Science Foundation. The journal is included in about 20 domestic databases, including the National Science and Technology Paper Statistical Source Journal (China Science and Technology Core Journal), the Source Journal of the China Science Citation Database, the Statistical Source Journal of the China Academic Journal Comprehensive Evaluation Database (CAJCED), the Full-text Collection Journal of the China Journal Full-text Database (CJFD), the Overview of the Chinese Core Journals (2017 Edition), and the Source Journal of the Top Academic Papers of China's Fine Science and Technology Journals (F5000). It is also included in international databases, including the US-based Chemical Abstracts, MEDLINE, and EBSCO.