Evaluating the interactions of Medical Doctors with chatbots based on large language models: Insights from a nationwide study in the Greek healthcare sector using ChatGPT
Loukas Triantafyllopoulos , Georgios Feretzakis , Lazaros Tzelves , Aikaterini Sakagianni , Vassilios S. Verykios , Dimitris Kalles
{"title":"Evaluating the interactions of Medical Doctors with chatbots based on large language models: Insights from a nationwide study in the Greek healthcare sector using ChatGPT","authors":"Loukas Triantafyllopoulos , Georgios Feretzakis , Lazaros Tzelves , Aikaterini Sakagianni , Vassilios S. Verykios , Dimitris Kalles","doi":"10.1016/j.chb.2024.108404","DOIUrl":null,"url":null,"abstract":"<div><p>In this AI-focused era, researchers are delving into AI applications in healthcare, with ChatGPT being a primary focus. This Greek study involved 182 doctors from various regions, utilizing a custom web application connected to ChatGPT 4.0. Doctors from diverse departments and experience levels engaged with ChatGPT, which provided tailored responses. Over a month, data was collected using a form with a 1-to-5 rating scale. The results showed varying satisfaction levels across four criteria: clarity, response time, accuracy, and overall satisfaction. ChatGPT's response speed received high ratings (3.85/5.0), whereas clarity of information was moderately rated (3.43/5.0). A significant observation was the correlation between a doctor's experience and their satisfaction with ChatGPT. More experienced doctors (over 21 years) reported lower satisfaction (2.80–3.74/5.0) compared to their less experienced counterparts (3.43–4.20/5.0). At the medical field level, Internal Medicine showed higher satisfaction in evaluation criteria (ranging from 3.56 to 3.88), compared to other fields, while Psychiatry scored higher overall, with ratings from 3.63 to 5.00. The study also compared two departments: Urology and Internal Medicine, with the latter being more satisfied with the accuracy, and clarity of provided information, response time, and overall compared to Urology. These findings illuminate the specific needs of the health sector and highlight both the potential and areas for improvement in ChatGPT's provision of specialized medical information. Despite current limitations, ChatGPT, in its present version, offers a valuable resource to the medical community, signaling further advancements and potential integration into healthcare practices.</p></div>","PeriodicalId":48471,"journal":{"name":"Computers in Human Behavior","volume":"161 ","pages":"Article 108404"},"PeriodicalIF":9.0000,"publicationDate":"2024-08-10","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"Computers in Human Behavior","FirstCategoryId":"102","ListUrlMain":"https://www.sciencedirect.com/science/article/pii/S0747563224002723","RegionNum":1,"RegionCategory":"心理学","ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"Q1","JCRName":"PSYCHOLOGY, EXPERIMENTAL","Score":null,"Total":0}
Citations: 0
Abstract
In this AI-focused era, researchers are delving into AI applications in healthcare, with ChatGPT being a primary focus. This Greek study involved 182 doctors from various regions, who used a custom web application connected to ChatGPT 4.0. Doctors from diverse departments and experience levels engaged with ChatGPT, which provided tailored responses. Over one month, data were collected using a form with a 1-to-5 rating scale. The results showed varying satisfaction levels across four criteria: clarity, response time, accuracy, and overall satisfaction. ChatGPT's response speed received high ratings (3.85/5.0), whereas clarity of information was rated moderately (3.43/5.0). A notable observation was the correlation between a doctor's experience and their satisfaction with ChatGPT: more experienced doctors (over 21 years) reported lower satisfaction (2.80–3.74/5.0) than their less experienced counterparts (3.43–4.20/5.0). At the medical-field level, Internal Medicine showed higher satisfaction across the evaluation criteria (ranging from 3.56 to 3.88) than other fields, while Psychiatry scored higher overall, with ratings from 3.63 to 5.00. The study also compared two departments, Urology and Internal Medicine, with the latter reporting greater satisfaction than Urology in the accuracy and clarity of the provided information, the response time, and overall. These findings illuminate the specific needs of the health sector and highlight both the potential and the areas for improvement in ChatGPT's provision of specialized medical information. Despite current limitations, ChatGPT, in its present version, offers a valuable resource to the medical community, signaling further advancements and potential integration into healthcare practices.
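The abstract describes the study setup only at a high level: a custom web application connected to ChatGPT 4.0, with each response rated by the doctor on a 1-to-5 scale across four criteria. As an illustration only, the following is a minimal Python sketch of how such an interaction-and-rating flow could be wired together using the OpenAI chat completions API; the model identifier, criterion names, storage format, and helper functions are assumptions for the sketch, not details taken from the paper.

```python
# Minimal sketch (not the authors' implementation): forward a doctor's question
# to a GPT-4-class model and record a 1-to-5 rating for each evaluation criterion.
# Assumes the `openai` Python package and an OPENAI_API_KEY environment variable.
import csv
from datetime import datetime, timezone
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Criterion names assumed from the four criteria reported in the abstract.
CRITERIA = ["clarity", "response_time", "accuracy", "overall_satisfaction"]


def ask_chatgpt(question: str) -> str:
    """Send a doctor's question to the model and return its reply."""
    response = client.chat.completions.create(
        model="gpt-4",  # assumed identifier; the paper refers to "ChatGPT 4.0"
        messages=[
            {"role": "system",
             "content": "You assist medical doctors with specialized clinical questions."},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content


def record_rating(doctor_id: str, specialty: str, years_experience: int,
                  ratings: dict[str, int], path: str = "ratings.csv") -> None:
    """Append one evaluation form (one 1-to-5 score per criterion) to a CSV file."""
    if not all(1 <= ratings[c] <= 5 for c in CRITERIA):
        raise ValueError("Each criterion must be rated on the 1-to-5 scale.")
    with open(path, "a", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(
            [datetime.now(timezone.utc).isoformat(), doctor_id,
             specialty, years_experience] + [ratings[c] for c in CRITERIA]
        )


if __name__ == "__main__":
    answer = ask_chatgpt("What are first-line options for uncomplicated cystitis?")
    print(answer)
    record_rating("doc-001", "Internal Medicine", 12,
                  {"clarity": 4, "response_time": 5,
                   "accuracy": 4, "overall_satisfaction": 4})
```

Storing one row per completed form, keyed by specialty and years of experience, is what would later allow the per-field and per-experience-group comparisons the abstract reports (e.g., Internal Medicine vs. Urology, or doctors with over 21 years of experience vs. less experienced ones).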
About the Journal
Computers in Human Behavior is a scholarly journal that explores the psychological aspects of computer use. It covers original theoretical works, research reports, literature reviews, and software and book reviews. The journal examines both the use of computers in psychology, psychiatry, and related fields, and the psychological impact of computer use on individuals, groups, and society. Articles discuss topics such as professional practice, training, research, human development, learning, cognition, personality, and social interactions. It focuses on human interactions with computers, considering the computer as a medium through which human behaviors are shaped and expressed. Professionals interested in the psychological aspects of computer use will find this journal valuable, even with limited knowledge of computers.